The Proper Content Type for XML Feeds

RSS feeds have a content type problem. Most people end up serving them with the content type text/xml, but this practice is frowned upon for several reasons. The main objection to text/xml is that it's very vague: content types such as application/rss+xml, application/rdf+xml, and application/atom+xml describe the content of your feed much better than text/xml does. We should be using these types for our feeds.

The problem with the more descriptive content types, however, is that Firefox and IE prompt you to download the XML file instead of displaying it in the browser as they would a text/xml document.

[Screenshots: download dialog in IE; download dialog in Firefox]

So what I have decided to do is serve the feeds as text/xml if the user agent contains Mozilla. IE, Firefox, and Safari 1.x will get my feed as text/xml; other clients will get the proper application/rss+xml MIME type. Here's my code for this:

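<!--- Browsers (IE, Firefox, Safari 1.x) include "Mozilla" in their user agent --->
<cfif cgi.http_user_agent contains "Mozilla">
	<!--- text/xml is displayed in the browser instead of triggering a download --->
	<cfheader name="Content-Type" value="text/xml">
<cfelse>
	<!--- everything else gets the proper feed MIME type --->
	<cfheader name="Content-Type" value="application/rss+xml">
</cfif>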

I realize that this is not a perfect solution: it may cause browser plugins to do some extra work to determine whether the document is an RSS, RDF, or Atom feed. Additionally, any aggregators that include Mozilla in their user agent will get text/xml. But I'm not going to risk losing potential subscribers over this issue, as some bloggers have reported happening after they switched.

So I will serve a variable content type at least until bug 256379 is fixed in a production release of Firefox (or until IE beats them to it, I guess :). You can vote for that bug in Bugzilla if you find the save dialog annoying when you click on RSS feeds.

I also hope that IE7 will handle the RSS-related content types as it would a text/xml doc by default. Scoble, can you make sure IE7 deals with this? (Apparently Robert Scoble will read your post if you put his name in it...)

Tim Bray has pointed out why it's important for people to get their act together:

  1. To manage the traffic load we're going to have to do some caching. Fortunately, RSS contains some publication and expiry-date data to help intermediate software do this, but to do this it has to recognize the data as RSS and read this stuff. This isn't going to happen until RSS gets served with the proper Media-type.
  2. When someone writes RSS-reader code to live in the Web Browser, it's going to need a consistent Media-type to be able to recognize RSS.

Yet Another Community System cites some of the problems with text/xml, such as the character set issues:

The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII or ISO-8859-1 for all MIME types prefixed by text, depending of the Request for Comment you are considering. Of course, having two different specifications is confusing to the software industry. But also, no one of these two charsets can support complex foreign charsets as those used in Asia. On the other hand, implementors and users of XML parsers tend to assume that the default charset is provided by the XML encoding declaration or BOM.
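
One way to sidestep the default-charset ambiguity described above is to state the charset explicitly in the header, whichever type ends up being served. The following is only a rough sketch: it reuses the user-agent check from earlier in this post, the variable name feedType is made up for illustration, and UTF-8 is an assumption (use whatever encoding the feed is actually written in).

```cfml
<!--- Sketch only: feedType is a hypothetical variable name, and UTF-8 is an assumed encoding --->
<cfif cgi.http_user_agent contains "Mozilla">
	<cfset feedType = "text/xml">
<cfelse>
	<cfset feedType = "application/rss+xml">
</cfif>
<!--- Send the chosen type with an explicit charset parameter --->
<cfheader name="Content-Type" value="#feedType#; charset=utf-8">
```

The header only declares the encoding. The feed output itself, and the encoding attribute in its XML declaration, still have to match it.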


Comments

On 06/13/2005 at 5:46:00 PM EDT Adam Ness wrote:
1
I disagree... There's a XML schema and/or DTD associated with the document. An RSS-capable browser should be capable of recognizing RSS, RDF or ATOM data based on that. Likewise, the XML encoding marker or BOM is sufficient for determining whether RSS is properly formatted or not.

Why should it be necessary to specify new, incompatible MIME types to solve problems that have already been solved within the XML files?

On 06/13/2005 at 6:06:47 PM EDT Pete Freitag wrote:
2
Adam, where I see the benefit of a new media type is for HTTP middleware such as caches or proxies. If they can quickly determine the type of document from its headers without actually inspecting or parsing it, this is good for performance. So having MIME types for RSS, RDF, and Atom would improve performance of the applications that use them.

I agree, however, that we shouldn't be declaring new MIME types for every possible variant of XML. But RSS, RDF, and Atom are quite popular, and I think that justifies having a new MIME type.

On 06/13/2005 at 6:11:54 PM EDT Adam Ness wrote:
3
HTTP already has fairly extensive and powerful cache management headers in existence. I don't think that having a different set of cache rules for syndication content is necessarily a good idea, but I can see that there might be some desire for the flexibility in rare cases.

On 06/13/2005 at 6:26:28 PM EDT Pete Freitag wrote:
4
RSS does allow for caching features that are not available in HTTP caching, such as SkipDays or SkipHours. If you're only updating a feed during business hours, then it's handy to specify that it's not going to change at night or on the weekend.

Caching is very important for RSS, because feeds require a lot of bandwidth. My RSS feed gets more than 8 times as many hits as my home page. RSS use is only going to grow in the coming years.

Adam, what disadvantages are there to adding these MIME types?

On 06/13/2005 at 7:26:51 PM EDT Adam Ness wrote:
5
SkipDays and SkipHours can easily be represented by sending the appropriate "Expires" header on your RSS. e.g. If you don't update on the weekend, just set your Expires header to Monday Morning.

Additionally, the SkipDays and SkipHours information is embedded in the RSS file itself, which means your cacheing server must be parsing the XML data. If it is, then it can recognize the DTD or Schema information just as easily as it can recognize the SkipDays/SkipHours information.

The disadvantages to registering additional MIME types are multiple: Crowding of the MIME namespace, competing standards (Is it application/rss+xml or x-application/rss or text/rss, or text/rss-xml?) Duplication of existing functionality, Incompatibility with browsers such as IE and Mozilla which don't know the MIME type that you're talking about, inability to pass the data through certain types of filtering firewalls/proxies, incompatibility with some fully standards compliant xmlhttp libraries which expect results in text/xml, and a variety of others. I don't see the benefits outweighing the negatives.

On 06/13/2005 at 7:39:46 PM EDT Pete Freitag wrote:
6
Good points Adam, I suppose you could indeed implement SkipHours and SkipDays like functionality with HTTP caching. It would be a bit more difficult however.

Also the caching server would only have to parse the XML if it matched the RSS mime type, that's why I state that as an advantage. With just a text/xml mime type you would have to check every XML document served.

On 06/14/2005 at 4:55:44 PM EDT Roger Benningfield wrote:
7
Adam: The ship has already sailed. application/atom+xml is being built into Apache, application/rss+xml is already a "running code" standard, and that's that.

Secondly, you won't find much of anyone associating DTDs or schemas with syndication files. There will be a non-normative RelaxNG schema for Atom, but everyone will ignore it.

On 06/28/2005 at 1:21:50 AM EDT Randy Charles Morin wrote:
8
IMHO, browsers should not display angled brackets. When you click on an application/rss+xml file, the reference should be passed to your RSS reader and you should be prompted to subscribe. It's called 'Universal Subscription Mechanism'.

On 01/24/2006 at 9:00:23 PM EST Bricolage wrote:
9
Politics aside. After reading the spec and the W3C recommendation on Content-Type declaration in RSS (and the like), I was struggling to figure out what was going on (with Firefox's behaviour). Now I've implemented this hack in my Zope server and so far so good. Thanks.

On 02/14/2006 at 1:44:35 PM EST clint wrote:
10
I would be interested to see an example of how the DTML/TAL looks for a Zope version of this.

On 03/18/2006 at 8:10:05 PM EST Jean Moniatte wrote:
11
From a ColdFusion perspective, is there any difference between cfcontent type="text/xml" and cfheader name="Content-Type" value="text/xml"

Thanks.

On 07/17/2006 at 8:48:16 AM EDT webber wrote:
12
Works with Opera! Why did you ignore the only fully standards compliant browser there currently is?

On 06/04/2007 at 5:39:35 AM EDT Peter Laman wrote:
13
I'm new to RSS. I don't find in your article where I should specify the content-type in the rss file??

On 06/21/2007 at 2:36:17 PM EDT Keilaron wrote:
14
This may sound stupid, but as someone who hasn't kept up much I've been a little curious:

HTML - text/html
XML - text/xml
XHTML? application/html+xml

Why application? RSS and Atom do it too, as you've written in your post. But I haven't seen Why explained on any site...

On 09/07/2007 at 7:41:50 PM EDT joeblow wrote:
15
Thanks for this . . .

I just spent an hour trying to track down why I was getting "XML file does not appear to have any style information" in FF rather than "Subscribe to this feed"

Content-Type: application/rss+xml does the trick

Content-Type: text/xml does not

On 10/29/2007 at 2:42:39 AM EDT Rajapandian wrote:
16
Hai, my rss contains error i not able to find where is the error anyone know please find .

<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0"> <channel> <title>Radiant : upto-250000</title> <link>http://www.property.com</link> <description>Latest Properties upto-250000</description> <language>en-gb</language>

<copyright>Copyright 2007, Properties Ltd.</copyright> <pubDate>Mon, 39 Oct 2007 7:39:07</pubDate> <lastBuildDate>Mon, 39 Oct 2007 7:39:07</lastBuildDate> <image> <title>Radiant</title> <url>http://www.property.com/mortgage/images/logo.jpg</url> <link>http://www.property.com</link>

</image> <item> <title>&#163;25000 Lorem ipsum, Lorem ipsum</title> <link>http://www.property.com/mortgage/index.php?option=rssdet&amp;id=2&amp;type=1&amp;range=1</link> <description><![CDATA[ <img src="http://www.property.com/mortgage/rss/img1/2.jpg" width="102" height="78" /> Lorem ipsum <br><br> ]]></description> <pubDate>Mon, 39 Oct 2007 7:39:07</pubDate>

</item> </channel> </rss>

On 10/30/2007 at 5:01:28 PM EDT Keilaron wrote:
17
Wow... I actually started investigating this, then realised it was spam!

On 12/18/2007 at 3:36:47 PM EST Adun wrote:
18
hello, recently I'm learning about ajax + asp.net 2.0, and now, I want to make a ajax rss reader, and problems appear when I try to get rss data from rss feed url with javascript, like http://rss.sina.com.cn/finance/gjcj.xml my code to do that is like

rssReqXml = getXmlDoc();
rssReqXml.onreadystatechange=viewContent;
rssReqXml.open("GET", escape(url),true);
rssReqXml.send( null);

and I always get some response message saying "bad request", it doesn't work even when I set the contenttype to text/xml, so, what do think might be the problem, and how to make work, thank you

On 01/09/2008 at 12:52:00 PM EST Cyan wrote:
19
Isn't text/xml deprecated in favour of application/xml?

On 02/14/2008 at 8:50:53 AM EST Kenneth Rainey wrote:
20
Peter--

Thanks for posting this. After adding this to the top of my cfm file, my feed validates with http://feedvalidator.org.

On 03/04/2008 at 11:13:18 AM EST Aron wrote:
21
It's possible to use application/atom+xml or application/rss+xml in any case, independent of the user agent. Just set an additional header: Content-disposition: inline; filename="feed.xml"

On 06/02/2008 at 10:43:45 PM EDT chipp wrote:
22
Hi, I feel like I should share my experience.

I'm using Content-type: application/xml for my RDF Site Summary (RSS 1.0) and it works as expected on IE 7, Mozilla Firefox 2, Opera 9, Safari 3 (for Windows), GreatNews 1.0, iGoogle and Feed on Feeds 0.5 (online aggregator).

Greetings

On 05/21/2009 at 9:31:43 PM EDT Milan wrote:
24
Hi, can anyone guide me? I have created an RSS feed which displays perfectly with the text/xml content type, but with application/rss+xml or application/atom+xml it does not display anything in my browser. What could be the problem? I am not able to directly click and subscribe to the created RSS feed using Outlook.

On 02/24/2010 at 11:49:42 PM EST maddddddddddd wrote:
25
isn't it "Content-type" and not "content-type" or "Content-Type" as shown above?

On 07/02/2010 at 2:27:41 AM EDT Keilaron wrote:
26
Content-Type is usually what's written, but in reality HTTP headers are case-insensitive.
