I just wrote a simple converter for my Blog Image, and I started to see some of the infrastructure that exists in MT more clearly. The script traverses the files in my locally published blog directory, and does all the proper substitutions. It creates the new tree of files which can then eventually be posted to my public site.
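A converter along these lines can be sketched roughly as follows. This is Python rather than Perl, and it is only an illustration under assumptions: the function name `convert_tree`, the list of file extensions, and the example URL prefixes are all hypothetical, not taken from the actual script.

```python
import os
import shutil

def convert_tree(src_root, dst_root, old_prefix, new_prefix,
                 extensions=(".html", ".xml", ".rdf")):
    """Copy src_root into dst_root, swapping old_prefix for new_prefix
    in files whose names end with one of the given extensions.
    Other files (images and so on) are copied unchanged.
    dst_root is assumed to live outside src_root."""
    converted = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        # Mirror the directory layout of the local blog under dst_root.
        rel_dir = os.path.relpath(dirpath, src_root)
        out_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(out_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(out_dir, name)
            if name.lower().endswith(extensions):
                with open(src, encoding="utf-8") as fh:
                    text = fh.read()
                with open(dst, "w", encoding="utf-8") as fh:
                    fh.write(text.replace(old_prefix, new_prefix))
                converted.append(dst)
            else:
                shutil.copy2(src, dst)
    return converted
```

Calling something like `convert_tree("blog-local", "blog-public", "http://localhost/blog/", "http://www.example.com/blog/")` would leave the local tree untouched and produce a second tree ready to upload.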
What I discovered is that there is an HTTP Cookie that is used for comments, so all the URLs that reference comments need to be excluded from the global replacement. I'll be checking how this works as soon as I have a proper version of the site available. The web server machine still lacks a proper install of MT, which is waiting until I have solved the most basic issues. Then I will describe the entire step-by-step process to get MT installed with my real goals in mind.
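Excluding the comment URLs from the global replacement might look something like the sketch below. It assumes the comment links can be recognized by a marker string in the URL; `mt-comments.cgi` is used here as that marker, and both the marker and the function name `replace_prefix_except` are assumptions for illustration.

```python
import re

def replace_prefix_except(text, old_prefix, new_prefix,
                          keep_marker="mt-comments.cgi"):
    """Replace old_prefix with new_prefix in every URL that starts with
    old_prefix, except URLs containing keep_marker (assumed to identify
    comment links), which are left pointing at the original host."""
    # Match the prefix plus the rest of the URL, up to a quote,
    # angle bracket, or whitespace.
    pattern = re.compile(re.escape(old_prefix) + r"[^\s\"'<>]*")

    def swap(match):
        url = match.group(0)
        if keep_marker in url:
            return url  # comment URL: leave untouched
        return new_prefix + url[len(old_prefix):]

    return pattern.sub(swap, text)
```

The callback form of `re.sub` keeps the decision per URL, so one page can mix rewritten archive links with untouched comment links.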
There are a few more features in MT that are now coming to light as problems for publishing to another server. The main one I just spotted is that the Ping features are fairly useless unless MT is already on the public server. I'd have to create a parallel version of the Ping functionality that is activated when I push files up to my public server.
The other issue is that comments are a big pain in the butt. I'd have to leave URLs to my local MT install in the published versions of my HTML files. As long as I don't have a huge slam of comments on a blog entry, that would work, but it does kinda kill my anonymity if I felt I needed it.
Since I am planning to put the Blog Image on a public server, my first concern is to normalize the site's treatment of URLs. I noticed that the blog configuration for MT has some URLs that it uses to prefix all the generated URLs. These generated URLs are used to reference the blog archive, but they are also used in the RSS and RDF feeds. The first problem to solve here is that the use of relative URLs is a bad idea if the XML feeds are to be propagated outside of the site. I figured this out by first trying to put in a blank URL for the blog, which is an error, apparently. Then I got tricky with the URL, using "http:./". This is a valid URL, but is missing the domain name part of a standard full URL. This translates to: "use the http protocol to reference a page in the same host and path as the referrer." Then I rebuilt the blog site, and everything looked good until I had a peek at the XML feeds. All the HTML files had proper links, all relative to my blog URL, but the XML feeds had the same relative paths the HTML had! Yowch! Ok, so that is just not going to work. I'll need to post-process the pages or the XML to have the correct URLs. An MT plug-in might exist, but I might just have to write a script that creates a public blog image from the local copy. Ptooey.
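Post-processing the feeds could be as simple as resolving each relative link against the public blog URL. The sketch below is a guess at the shape of such a step, not an MT feature: it only touches `<link>` elements, and the function name `absolutize_links` and the example base URL are hypothetical.

```python
import re
from urllib.parse import urljoin

def absolutize_links(xml_text, base_url):
    """Rewrite the contents of every <link>...</link> element so that
    relative paths become full URLs under base_url. Links that are
    already absolute come back unchanged from urljoin."""
    pattern = re.compile(r"<link>([^<]*)</link>")

    def fix(match):
        target = match.group(1).strip()
        return "<link>" + urljoin(base_url, target) + "</link>"

    return pattern.sub(fix, xml_text)
```

With a base of `http://www.example.com/blog/`, a feed entry of `<link>archives/000001.html</link>` would become `<link>http://www.example.com/blog/archives/000001.html</link>`, while the HTML pages could keep their relative links. An RDF feed also carries URLs in attributes such as `rdf:about`, which this sketch does not handle.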
When I've been in professional (read corporate) situations where web development is being done, the problem is always about how to simulate the production environment. Some web development tools (like Fireworks) simulate the web server environment by intercepting the top-level domain names, which allows the developer to use full domain name URLs on the pages being developed. Otherwise, a browser would be going to the public IP address of the production server. One typical solution to this problem is to add the name of the public server to the local hosts file, and presuming that the hosts file has precedence over DNS, the local server will get all the traffic for the domain. This lets developers use less sophisticated tools (cheaper too) to edit files on the site and still be able to test everything just like it would work in production. The problem is that you then need to disable the hosts file entry for the domain name to see the real production site, unless you have another machine to view it with. Still, this can be terribly confusing at times when you think you know what you are looking at but are totally wrong.
So this brings me to the shortcoming in MT, which is that I ought to be able to decouple the Blog Image from the site it is published to, but my XML feeds should have the full URL of the blog. The reason I want this is because I am not really a local consumer of the XML feeds, but I am a consumer of the HTML.
Ok, so it looks like the better way to go for now is to write a Blog Image converter. That way I have a distinctive string that I can use Perl to replace. I would like MT to have an alternate URL prefix that it would use to create a Blog Image I can just FTP someplace.
Hosting is a bit of a pain, because you really are a slave to the fast backbone connection. Those of us who have fast machines locally (much more common these days) can certainly handle the load of a local web server, and 100BaseT intranets can now easily be joined to DSL connectivity (which is what I use at home), firewalls and all. The problem is that I really can't use my ADSL connection to provide for many people at once. So god help me if a local web server became (*gasp*) popular. Then my upstream speed (~128kb) is serving everyone's downstream speed (~700kb for ADSL), and choking my normal browsing experience. That would be bad. I don't have the steady cash right now to pay for some of the faster DSL speeds available these days, so I have to limit my exposure. Furthermore, if I really did have a faster DSL connection, I'd have to re-architect my local intranet completely. My web server box would have to be connected to a switch, not a hub, and the DSL would have to go through a 10/100 Hub that is connected to a gateway machine that is dual-homed. Then I would keep the web server on a different subnet that has separate traffic from all my other machines. In the data center of the public web server (where I intend to publish my blog), they have fast switches and routers that can handle the load, and no locally generated traffic to compete with. So for those of us seeking a low-cost solution to Hosting-at-Home, the solution is sadly complex. If I got IDSL, then my upstream speed would be better, but then my downstream would suffer. Sigh.
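To put rough numbers on the upstream squeeze, here is a back-of-the-envelope sketch. It assumes "~128kb" means about 128 kilobits per second, that the link is shared evenly among simultaneous visitors, and that protocol overhead is ignored; the 50 KB page size and the function names are made up for illustration.

```python
def per_visitor_kbps(upstream_kbps, visitors):
    """Even share of the upstream link, in kilobits per second."""
    return upstream_kbps / visitors

def seconds_to_send(page_kilobytes, rate_kbps):
    """Time to push one page at the given rate (8 bits per byte)."""
    return page_kilobytes * 8 / rate_kbps

# Four simultaneous visitors on a 128 kbit/s upstream link.
share = per_visitor_kbps(128, 4)
print(share, seconds_to_send(50, share))
```

Under those assumptions each of four visitors gets 32 kbit/s, and a 50 KB page takes 12.5 seconds to reach each of them, on a link that could deliver it to one visitor in about 3 seconds.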
Over the next few days, I'll be documenting the way I have set up Movable Type to imitate Radio UserLand. The main goal is to achieve the following:
This will create a proper environment for creating and publishing content. The next step is to integrate MT with an RSS Aggregator (AmphetaDesk) so that I can more easily cross-post the feeds that I am reading. This may require some tricky Perl programming, but I believe that it can be done.
The first step in reaching these goals is to describe how to install MT from scratch. To get there, I have to describe some issues concerning Hosting, DNS, Web Servers, CGI, and URLs.
I finished an installation of MT today, but it was a bit of a run-around. The documentation gave me a hint, but the thing is such an ocean to dive into that I was lost for a while. Fortunately, I have written systems that have similar capabilities, so I was able to recognize the issues that needed to be solved and solve them. I'll be scrapping the installation several times and scripting all the steps I took to get to a functioning install. I'll be setting it up to meet my goals (described above), so there should be a decent amount of explanation along the way about why you'd want to do everything the way I suggest.
The goal of this whole saga is not to make anyone who reads it into a system administrator or a webmaster. It is to generate a more abstract kind of knowledge about the critical semantics at the technical level that make MT easy to use for regular users. Right now it is hard to see how any regular user could possibly struggle through the process.
Let me also state up front that I am doing this on a Red Hat Linux 8.0 installation, on a Dual Pentium-II 400MHz machine. The initial description will be in terms of that, but I presume that anyone using the same Open-Source and Free-Ware tools ought to be able to do the same. I'm hoping that on Win32, you'd at least be able to use Cygwin to do everything. I hear that on Mac OS X Perl is installed a bit differently, so I'd like to ultimately compile all the instructions necessary to accommodate variants like that as well.
This is my very first entry. I am really just testing.