February 07, 2003

URL Issues

Since I am planning to put the Blog Image on a public server, my first concern is to normalize how the site treats URLs. I noticed that MT's blog configuration has some URLs that it uses as prefixes for all the generated URLs. These generated URLs are used to reference the blog archive, but they also appear in the RSS and RDF feeds. The first problem to solve here is that relative URLs are a bad idea if the XML feeds are to be propagated outside the site. I figured this out by first trying a blank URL for the blog, which is an error, apparently. Then I got tricky with the URL, using "http:./". This is a valid URL, but it is missing the domain-name part of a standard full URL; it translates to "use the http protocol to reference a page on the same host and path as the referrer." Then I rebuilt the blog site, and everything looked good until I had a peek at the XML feeds. All the HTML files had proper links, all relative to my blog URL, but the XML feeds had the same relative paths the HTML had! Yowch! Ok, so that is just not going to work. I'll need to post-process the pages or the XML to get the correct URLs in. An MT plug-in might exist, but I might just have to write a script that creates a public blog image from the local copy. Ptooey.
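To make the post-processing idea concrete, here is a rough sketch in Python of absolutizing the relative links in a feed. The blog URL and the tag names are assumptions for illustration, not anything MT actually emits:

```python
import re
from urllib.parse import urljoin

# Hypothetical public blog URL -- substitute the real one.
BLOG_URL = "http://www.example.com/blog/"

def absolutize(xml_text, base=BLOG_URL):
    """Rewrite relative URLs inside <link> and <guid> elements of an
    RSS/RDF feed into absolute URLs. Already-absolute URLs pass through
    unchanged, since urljoin leaves them alone."""
    def fix(match):
        tag, url = match.group(1), match.group(2)
        return "<%s>%s</%s>" % (tag, urljoin(base, url), tag)
    return re.sub(r"<(link|guid)>\s*([^<]+?)\s*</\1>", fix, xml_text)
```

A proper feed rewriter would use an XML parser rather than a regex, but for MT's predictable output a pass like this would probably do.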

When I've been in professional (read: corporate) situations where web development is being done, the problem is always how to simulate the production environment. Some web development tools (like Fireworks) simulate the web server environment by intercepting the top-level domain names, which allows the developer to use full domain-name URLs on the pages being developed. Otherwise, a browser would go to the public IP address of the production server. One typical solution is to add the name of the public server to the local hosts file; presuming that the hosts file takes precedence over DNS, the local server will get all the traffic for the domain. This lets developers use less sophisticated tools (cheaper, too) to edit files on the site and still test everything just as it would work in production. The problem is that you then need to disable the local hosts-file version of the domain name to see the real production site, unless you have another machine to view it with. Still, this can be terribly confusing at times when you think you know what you are looking at but are totally wrong.
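For the record, the hosts-file trick is just one line (the domain and the local server address here are made up):

```
# /etc/hosts on Unix, or the system hosts file on Windows.
# Point the production domain at the local dev server; hosts entries
# are normally consulted before DNS, so the browser never leaves the LAN.
192.168.1.10    www.example.com
```

Comment the line out (or remove it) to see the real production site again.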

So this brings me to the shortcoming in MT: I ought to be able to decouple the Blog Image from the site it is published to, while the XML feeds still carry the full URL of the blog. The reason I want this is that I am not really a local consumer of the XML feeds, but I am a consumer of the HTML.

Ok, so it looks like the better way to go for now is to write a Blog Image converter. That way I'd have a distinctive string that I can use Perl to replace. I would like MT to have an alternate URL prefix that it would use to create a Blog Image I could just FTP someplace.
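The converter could be a one-liner in Perl, but here is the same idea sketched in Python so the shape is clear. The placeholder string and the public URL are both assumptions; the placeholder would be whatever distinctive string gets configured as the blog URL in MT:

```python
import os

# Hypothetical placeholder configured as the blog URL in MT, and the
# real public URL to swap in before uploading the Blog Image.
PLACEHOLDER = "http://BLOG-IMAGE-PLACEHOLDER/"
PUBLIC_URL = "http://www.example.com/blog/"

def convert_tree(src_dir, dst_dir):
    """Copy the local Blog Image into dst_dir, rewriting the placeholder
    URL in every HTML/XML file so the published copy has full URLs."""
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dst_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            with open(src, "rb") as f:
                data = f.read()
            # Only text-bearing files get the substitution; images and
            # other binaries are copied through untouched.
            if name.endswith((".html", ".htm", ".xml", ".rdf")):
                data = data.replace(PLACEHOLDER.encode(), PUBLIC_URL.encode())
            with open(dst, "wb") as f:
                f.write(data)
```

Run it over the local archive directory, then FTP the converted tree to the public server.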

Hosting is a bit of a pain, because you really are a slave to the fast backbone connection. Those of us who have fast machines locally (much more common these days) can certainly handle the load of a local web server, and 100BaseT intranets can now easily be joined to DSL connectivity (which is what I use at home), firewalls and all. The problem is that I really can't use my ADSL connection to serve many people at once. So god help me if a local web server became (*gasp*) popular. Then my upstream speed (~128kb) is serving everyone's downstream speed (~700kb for ADSL), choking my normal browsing experience. That would be bad. I don't have the steady cash right now to pay for one of the faster DSL speeds available these days, so I have to limit my exposure. Furthermore, if I really did have a faster DSL connection, I'd have to re-architect my local intranet completely. My web server box would have to be connected to a switch, not a hub, and the DSL would have to go through a 10/100 hub connected to a dual-homed gateway machine. Then I would keep the web server on a separate subnet whose traffic is isolated from all my other machines. In the data center of the public web server (where I intend to publish my blog), they have fast switches and routers that can handle the load, and no locally generated traffic to compete with. So for those of us seeking a low-cost solution to Hosting-at-Home, the solution is sadly complex. If I got IDSL, my upstream speed would be better, but then my downstream would suffer. Sigh.

Posted by David Cymbala at February 7, 2003 03:22 PM