Pete Freitag

Sitemap hint in robots.txt


Just a quick tip for those of you who are building XML sitemaps for your web sites: you can now add a line to your robots.txt file that points to your sitemap file. It would look like this:

Sitemap: http://www.example.com/sitemap.xml

This will allow your sitemap to be picked up by several search engines automatically. I first noticed this about a month ago; I'm not sure how long this feature has been around.
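If you want to check that the line in your own robots.txt reads the way you expect, here is a minimal Python sketch that pulls out the Sitemap entries. The function name find_sitemaps and the sample file contents are made up for illustration. The sketch assumes field names are case-insensitive, that anything after a # is a comment, and that the value is everything after the first colon. Real crawlers may parse the file differently.

```python
from typing import List


def find_sitemaps(robots_txt: str) -> List[str]:
    """Return the URLs named on Sitemap: lines, in file order.

    Hypothetical helper for illustration; not taken from any crawler.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Assumption: '#' starts a comment, so drop it and trim whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split on the first colon only, so the "http:" in the URL survives.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Sample robots.txt contents, invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml
"""

print(find_sitemaps(SAMPLE))
```

Because the check looks at each line on its own, it does not matter where in the file the Sitemap line sits.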



Sitemap hint in robots.txt was first published on June 13, 2007.


Comments

What about compatibility? IIRC, unless the RFC has changed, only User-Agent: and Disallow: are expected. And although Allow: is supposedly supported by Googlebot, even Google's robots.txt validator tells me not to use it (?!).
Where should Sitemap: be? At the beginning? End? Anywhere?
I figure it's best to put it in a meta tag of the root index page, honestly.
by Keilaron on 06/21/2007 at 12:30:16 PM UTC
Keilaron, the robots.txt RFC allows for "extensions":

extension = token : *space value [comment] CRLF
by Pete Freitag on 06/21/2007 at 2:30:19 PM UTC
Indeed, I stand corrected - and in fact, I see that Allow is even in the RFC as well. How odd - Just about every reference I've seen out there only mentions Disallow. Thanks for the info!
by Keilaron on 06/21/2007 at 3:36:39 PM UTC
You're welcome. I had to look it up myself, so I learned something too!
by Pete Freitag on 06/21/2007 at 5:51:27 PM UTC
Some bots don't recognize this yet, so it's safer to put it at the bottom of your robots.txt file, as shown at http://www.askapache.com/seo/updated-robotstxt-for-wordpress.html
by Mr. Apache on 08/10/2007 at 10:25:09 AM UTC
It's been around for many years now... know more at gianiji.com.
by T Singh on 06/09/2008 at 11:28:01 PM UTC