Archive for April, 2004

Googlebot and RSS

Thursday, April 22nd, 2004

Dave Winer is upset because he thinks Google’s Googlebot web crawler has started looking for Atom and RSS 1.0 files, while excluding RSS 2.0. However, a quick look at my logs reveals that Googlebot is crawling my RSS 2 feed just fine:

64.68.82.143 - - [22/Apr/2004:02:15:25 -0400] "GET /xml/rss2.xml HTTP/1.0"
200 12891 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

I don’t see any requests for /atom.xml in my logs. There have been a few requests for /index.rdf, but that’s not suspicious since that file did exist on my server until a couple of months ago, and was linked to from my MovableType-generated home page until I edited the template.
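For anyone who wants to run the same check against their own logs, here is a rough Python sketch. It assumes the Apache combined log format seen in the line above; the `LOG_LINE` pattern and the `googlebot_hits` helper are names made up for this illustration, not part of any tool mentioned in the post.

```python
import re
from collections import Counter

# Matches the combined log format of the excerpt above (an assumption;
# adjust the pattern if your server logs a different format).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requests per path whose User-Agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "googlebot" in m.group("agent").lower():
            hits[m.group("path")] += 1
    return hits

# The only sample line is the one quoted in the post.
sample = [
    '64.68.82.143 - - [22/Apr/2004:02:15:25 -0400] "GET /xml/rss2.xml HTTP/1.0" '
    '200 12891 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]
hits = googlebot_hits(sample)
print(hits["/xml/rss2.xml"], hits["/atom.xml"])  # prints: 1 0
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines.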

It looks to me like Googlebot is just doing what a web crawler should do: crawling all files linked to from the main page. If Dave’s anonymous correspondent is seeing hits on /index.rdf and /atom.xml, it probably means that his pages contain links to those files. Googlebot isn’t going to guess that a file called /index.xml exists – if you want Googlebot to crawl it, link to it!

One thing I don’t understand is that Dave’s correspondent says “It’s the first time I’ve seen googlebots looking for these files”. Possible explanations:

  • Googlebot did look for them before, but he never noticed until today.
  • He recently added links to these files.

Update, 1:00 PM

From the comments below, it seems clear that Googlebot is indeed asking some sites for /index.rdf and /atom.xml, even though it hasn’t seen any links to those files, and even when the site itself links to an /index.xml file. Interesting.

Out of curiosity, I ran a few queries to try and figure out how many feeds with common names Google has indexed:

Filename    Query                Hits
index.rdf   filetype:rdf index   188,000
rss.xml     filetype:xml rss     323,000
atom.xml    filetype:xml atom    11,100

Hiding File Extensions in Movable Type

Thursday, April 15th, 2004

Brad Choate has a good post on how he uses Movable Type to generate documents with a file extension, but then removes that file extension from his URLs. I’ve tried to accomplish the same thing on fettig.net. Most of what I do is the same as Brad, but there are a few differences in my technique, which I’ll describe here.

I use the following mod_rewrite rule to send a client-side redirect response if a visitor tries to access a URI with a .html, .php, or .cgi extension:

# file.* -> file
RewriteCond %{THE_REQUEST} GET\ /.*\.(html|cgi|php)\  [NC]
RewriteRule ^(.*)(\.[A-Za-z0-9]*)$ http://%{HTTP_HOST}/$1 [L,R]

This doesn’t address the issue of the Content-Location header that Brad is worried about, but it does make sure visitors see an extension-less URI in their location bars, which they will then use for linking or bookmarking purposes.
For example, try clicking on this link:
http://www.fettig.net/2004/03/switching_to_emacs.html.
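
To make the effect of the two patterns concrete, here is a small Python approximation of the rule. It is only a sketch: `redirect_target` is a hypothetical helper, and it assumes the rule lives in a per-directory (.htaccess) context, where mod_rewrite matches the path without its leading slash.

```python
import re

# Python stand-ins for the two patterns in the rule above.
COND = re.compile(r"GET /.*\.(html|cgi|php) ", re.IGNORECASE)  # RewriteCond ... [NC]
RULE = re.compile(r"^(.*)(\.[A-Za-z0-9]*)$")                    # RewriteRule pattern

def redirect_target(request_line, host, path):
    """Return the URL the rule would redirect to, or None if it does not apply.

    `path` is the URL path without its leading slash (assumed .htaccess context).
    """
    if not COND.search(request_line):
        return None                       # the condition failed: no redirect
    m = RULE.match(path)
    if not m:
        return None
    return f"http://{host}/{m.group(1)}"  # $1 is everything before the last dot

print(redirect_target(
    "GET /2004/03/switching_to_emacs.html HTTP/1.1",
    "www.fettig.net",
    "2004/03/switching_to_emacs.html",
))  # http://www.fettig.net/2004/03/switching_to_emacs
print(redirect_target("GET /xml/rss2.xml HTTP/1.0", "www.fettig.net", "xml/rss2.xml"))  # None
```

The second call shows why the RewriteCond matters: without it, the RewriteRule pattern alone would also strip the extension from /xml/rss2.xml.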

Like Brad, I’ve also had to go through all my MT templates to strip the extension. The only difference is that I encapsulated the regular expression logic in a little plugin called “strip_extension.pl”. Here’s the code:

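use strict;
use MT::Template::Context;

# Register a global filter so any tag can take strip_ext="1".
MT::Template::Context->add_global_filter(strip_ext => sub {
        my $s = shift;    # the tag's rendered output, e.g. an entry permalink
        # Drop a trailing ".ext" (letters and digits only) from the end of the string.
        $s =~ s/\.[A-Za-z0-9]+\Z//g;
        return $s;
});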

Then in my templates, I replace all instances of <$MTEntryLink$> with <$MTEntryLink strip_ext="1"$>.
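
If you want to see what the substitution does to a few sample URLs without rebuilding a blog, the same regular expression can be tried in Python. This is just an illustration of the pattern, not part of the plugin; the `strip_ext` function below is a stand-in I named after the filter.

```python
import re

# Same pattern as the Perl filter. Python's "$" (without re.MULTILINE) matches
# at the end of the string or just before a final newline, like Perl's \Z.
EXTENSION = re.compile(r"\.[A-Za-z0-9]+$")

def strip_ext(url):
    """Remove a trailing ".ext" from url, if there is one."""
    return EXTENSION.sub("", url)

print(strip_ext("http://www.fettig.net/2004/03/switching_to_emacs.html"))
# http://www.fettig.net/2004/03/switching_to_emacs
print(strip_ext("http://www.fettig.net/2004/03/"))
# http://www.fettig.net/2004/03/   (nothing to strip after the slash)
print(strip_ext("http://www.fettig.net"))
# http://www.fettig   (a bare hostname loses its ".net")
```

The last case is worth keeping in mind: the filter is safe on entry permalinks, but it would mangle a URL that ends in a bare hostname with no trailing slash.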

Using your OS X fonts in The Gimp

Saturday, April 3rd, 2004

If you’ve installed TrueType fonts under your user account in OS X, you can make them available to Gimp.app (or, probably, any other gtk2 application) by linking your ~/Library/Fonts folder to ~/.fonts, which is where Gimp’s font library looks for user-specific fonts. In other words, just type “ln -s ~/Library/Fonts ~/.fonts” in a terminal window, and the next time you run Gimp all your user fonts will be available.

This is, as far as I can tell, an exclusive tip for Fettig.net readers! I just figured it out, and haven’t seen it anywhere else online.

Update, 4/22

Thanks to Alf Eaton at HubLog for correcting an error: I mistakenly wrote ~/System/Fonts instead of ~/Library/Fonts. I’ve corrected that now. Alf also discovered that you can add both system and user-level fonts to the Gimp by using sub-directories of ~/.fonts. For example:

mkdir ~/.fonts
ln -s /Library/Fonts ~/.fonts/sys
ln -s ~/Library/Fonts ~/.fonts/user
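
If you would rather script this so it can be re-run safely, the three commands could be sketched in Python roughly as follows. `link_font_dirs` is a hypothetical helper written for this note; it takes the home directory as a parameter so it can be tried on a scratch directory first.

```python
from pathlib import Path

def link_font_dirs(home, system_fonts=Path("/Library/Fonts")):
    """Create .fonts/sys and .fonts/user symlinks under `home`.

    Returns the links that were newly created; existing entries are left alone.
    """
    home = Path(home)
    fonts_dir = home / ".fonts"
    if fonts_dir.is_symlink():
        # Left over from the single-symlink tip: linking inside it would
        # create entries inside ~/Library/Fonts itself, so stop here.
        raise RuntimeError(f"{fonts_dir} is a symlink; remove it first")
    fonts_dir.mkdir(exist_ok=True)            # mkdir ~/.fonts
    wanted = {
        "sys": Path(system_fonts),            # ln -s /Library/Fonts ~/.fonts/sys
        "user": home / "Library" / "Fonts",   # ln -s ~/Library/Fonts ~/.fonts/user
    }
    created = []
    for name, target in wanted.items():
        link = fonts_dir / name
        if link.exists() or link.is_symlink():
            continue                          # don't clobber what is already there
        link.symlink_to(target, target_is_directory=True)
        created.append(link)
    return created
```

Calling `link_font_dirs(Path.home())` would then mirror the commands above. Note the guard at the top: if you already followed the original tip and made ~/.fonts itself a symlink, that link has to go before the sub-directory layout can be used.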

Pinstripe for Thunderbird, Quicksilver

Thursday, April 1st, 2004

Thunderbird is an excellent mail client, but since I started running it under OS X it’s felt a little out of place – the flat-looking icons, the almost-but-not-quite native controls. It just felt awkward and uncomfortable, so much so that I even thought about using Apple’s Mail.app until I realized what a pain it would be to transfer all my mail and settings. But now, with the just-released Pinstripe theme, Thunderbird has a whole new look, and it’s absolutely gorgeous:

In real life it’s even nicer. If you’re using OS X, grab the latest build and give it a try – it’s truly amazing.

Also, while I’m on the subject of OS X apps: Quicksilver is just as good as everyone says it is. For the uninitiated, here’s how it works. I think “I want to run Firefox” – but it’s not in the Dock! So I hit Control-space, and Quicksilver appears. I type ‘F-I’. The Firefox logo appears. I hit enter. Firefox is running. Total time: 2 seconds. No more searching through the Applications folder. Hooray!