8 Ways to Save Bandwidth on your RSS Feed
One of the things you will notice after you publish an RSS feed is that it consumes a lot of bandwidth. For example, on Spendfish.com 18% of the requests are for RSS feeds. This is no wonder, since feed readers may download your feed several times a day even if nothing has changed.
I've put together a list of ways you can save bandwidth and reduce the number of requests to your RSS feed (which also saves server processing).
1 - Use the ttl tag
The ttl tag goes directly inside the channel tag in your RSS feed. It stands for time to live, and should hold the number of minutes an RSS reader should wait before requesting your feed again. Most blog software defaults this setting to 60, which means that your feed will be downloaded every hour by clients that obey this setting. We can increase this to 3 hours by adding the following:
<ttl>180</ttl>
2 - Add a skipDays tag
The skipDays tag also goes inside the channel tag and should contain several child day tags. As you might guess, this tells RSS readers to skip downloading your feed on the specified days. For example, if you don't publish content on the weekends:
<skipDays> <day>Saturday</day> <day>Sunday</day> </skipDays>
3 - Add a skipHours tag
Same idea as the skipDays tag, but it lets you specify which hours of the day your feed should not be downloaded. The hours are specified as 0-23 in GMT. For example, since I'm in NY and typically don't post to my blog very early in the morning, I could add:
<skipHours> <hour>7</hour> <hour>8</hour> </skipHours>
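To make the effect of these three tags concrete, here is a minimal Python sketch of how a well-behaved reader might decide whether to request a feed. The function name should_fetch and its parameters are hypothetical, and real readers differ in how strictly they honor these hints.

```python
from datetime import datetime, timedelta, timezone

# Day names as they appear inside <day> elements; index matches datetime.weekday().
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def should_fetch(now, last_fetch, ttl_minutes=60, skip_days=(), skip_hours=()):
    """Return True if a polite reader would request the feed at `now`.

    `now` and `last_fetch` are timezone-aware datetimes. skip_days holds day
    names and skip_hours holds GMT hours (0-23), as in the feed's tags.
    """
    now = now.astimezone(timezone.utc)
    if DAY_NAMES[now.weekday()] in skip_days:
        return False  # the feed asked not to be polled on this day
    if now.hour in skip_hours:
        return False  # the feed asked not to be polled during this hour
    # ttl: wait at least this many minutes between requests
    return now - last_fetch >= timedelta(minutes=ttl_minutes)
```

With a ttl of 180, a reader that last fetched at noon GMT would wait until 3pm before asking again, and would stay away entirely on any day or hour the feed lists.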
4 - Support If-Modified-Since HTTP headers
Many RSS readers and clients send an If-Modified-Since header in their request to your RSS feed. This is one of the ways clients make what's called a conditional HTTP GET. You can return a 304 Not Modified HTTP response code (and omit the response body) if the RSS feed has not changed since the date specified in the If-Modified-Since header. If the content has changed, you simply return the normal 200 status code.
The header sent by the client might look something like this:
If-Modified-Since: Tue, 10 Jul 2007 21:19:55 GMT
Most clients will pass in the value you specify in the Last-Modified header, so you should make sure that header is being populated. More info here. ColdFusion If-Modified-Since example here.
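As a rough illustration of the server side, here is a Python sketch that decides between 304 and 200. The function conditional_get and its (status, headers, body) return shape are assumptions made for this example, not part of any framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(request_headers, last_modified, body):
    """Return (status, headers, body) for a feed last changed at `last_modified`.

    `last_modified` must be a timezone-aware UTC datetime.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: fall back to a full response
        # Only compare when the client's date carried a usable timezone.
        if since_dt is not None and since_dt.tzinfo is not None:
            if last_modified <= since_dt:
                return 304, headers, b""  # nothing new, send headers only
    return 200, headers, body
```

Sending Last-Modified on both the 200 and the 304 responses gives the client a value to echo back on its next request.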
5 - Support If-None-Match HTTP headers
The ETag header is an HTTP response header that you can send back in your RSS feed response. It stands for entity tag and should be a unique value representing the content. You could use an MD5 hash of your RSS feed content, or simply the date and time of the last change. Clients will send this value back in an If-None-Match header. If that header contains your current ETag, you can return a 304 status code.
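Here is a small Python sketch of the MD5 approach, assuming the whole feed is available as bytes. The names make_etag, etag_matches, and respond_with_etag are hypothetical helpers written for this example.

```python
import hashlib

def make_etag(feed_bytes):
    # Entity tags are quoted strings. MD5 is used here only as a content
    # fingerprint, not for security.
    return '"' + hashlib.md5(feed_bytes).hexdigest() + '"'

def etag_matches(if_none_match, etag):
    """True when the If-None-Match value lists `etag` (or is the wildcard)."""
    if not if_none_match:
        return False
    candidates = [c.strip() for c in if_none_match.split(",")]
    # Clients may send weak validators (W/"..."); strip the prefix to compare.
    candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
    return "*" in candidates or etag in candidates

def respond_with_etag(request_headers, feed_bytes):
    """Return (status, headers, body), using 304 when the client is current."""
    etag = make_etag(feed_bytes)
    if etag_matches(request_headers.get("If-None-Match"), etag):
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, feed_bytes
```

Hashing the content means the ETag changes whenever any byte of the feed changes, so a feed that embeds a build timestamp would never produce a 304. A last-change date avoids that, at the cost of tracking the date yourself.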
6 - Don't send the response body on HTTP HEAD requests
Here's what HTTP 1.1 has to say about the HEAD method:
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response.
The HEAD method should only return HTTP headers, and no response body (so you don't need to return your entire RSS feed for these types of requests). You will find that several aggregators and readers make HEAD requests for your feed (including MXNA), so it's a simple way to save some bandwidth.
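A minimal Python sketch of the idea follows. The function feed_response is hypothetical, and the application/rss+xml content type is an assumption, so use whatever type your feed already sends.

```python
def feed_response(method, feed_bytes):
    """Return (status, headers, body); HEAD gets the same headers, no body."""
    headers = {
        "Content-Type": "application/rss+xml",
        # Report the size a GET would return, even when the body is omitted.
        "Content-Length": str(len(feed_bytes)),
    }
    body = b"" if method.upper() == "HEAD" else feed_bytes
    return 200, headers, body
```

Many web servers and frameworks strip the body on HEAD automatically, but if the feed is still generated first you save bandwidth without saving any processing.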
7 - Limit the number of items you publish
This one is kind of obvious, but it is a very easy way to save bandwidth on your RSS feed. If you are publishing 15 items or articles in your feed, lowering that to 10 can save a good amount of bandwidth.
8 - Publish Partial Content feeds
Another obvious way to save bandwidth: don't publish full articles in your feed. By publishing just the first few sentences of each item you can also save a good amount of bandwidth. Your readers may not be too happy about this one, however.
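Both of the last two tips can be applied just before the feed is serialized. The Python sketch below assumes items are dicts with a description key, sorted newest first. The names excerpt and trim_feed are hypothetical, and the sentence splitting is deliberately naive.

```python
import re

def excerpt(text, max_sentences=2):
    """Keep the first few sentences, marking the cut when text was dropped."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    short = " ".join(sentences[:max_sentences])
    return short if len(sentences) <= max_sentences else short + " [...]"

def trim_feed(items, max_items=10, max_sentences=2):
    """Return at most `max_items` items, each with a shortened description."""
    return [
        dict(item, description=excerpt(item["description"], max_sentences))
        for item in items[:max_items]
    ]
```

The originals are left untouched because dict(item, ...) builds a copy, so the full text is still available for the article pages.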
Do you have any other techniques that work? I considered adding Cache-Control, but I wonder if that would actually take away the savings you would get from a conditional GET?
I have to thank Charlie Arehart for giving me the idea for this blog entry. After my Working with RSS in ColdFusion presentation at cfunited he suggested that I write a blog entry on ways to reduce the bandwidth consumption of your RSS feed. After doing the research for this entry I realize that this topic could be the subject of an entire presentation!
8 Ways to Save Bandwidth on your RSS Feed was first published on July 12, 2007.
And yes, a talk on the topic would be a good idea. Perhaps no single CF user group would care to hear it, but the world of CF (and indeed all) bloggers would benefit. How about recording one with Connect (or Captivate/Camtasia/CamStudio) and then sharing it with the world? :-)
Support for RFC3229+feed instance manipulation can shave tons of bandwidth with really huge feeds... Planet-type aggregated feeds, primarily.
For those who don't know, RFC3229 provides a way to deliver deltas via HTTP. In the case of Atom/RSS, it means the client sends along an "A-IM: feed" header with its GET request. The server sees this, then checks If-None-Match and uses it to derive a subset of entries that are then returned to the client.
The idea is that the client only receives new entries that it hasn't seen before. Issues to bear in mind:
(1) Increased CPU utilization... instead of sending out one (presumably cached) feed to all clients, you're cooking up individual feeds (and database hits) for each reader.
(2) If your feed items contain lots of constantly changing meta info (comment counts and so on), then you're either going to have to keep resending updated entries (thus voiding the benefit of RFC3229), or drop the metadata.
(3) There's a temptation to just send feed deltas to everyone, requested or not... bad idea. There are still plenty of widget-style aggregators out there that expect to receive a full feed on every request.
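The flow described above can be sketched in Python roughly as follows. This assumes the ETag is simply the quoted id of the newest entry and that entries are kept newest first. The function delta_response is hypothetical, and a real implementation would also need to serialize the entries and handle clients whose ETag has aged out of the stored list.

```python
def delta_response(request_headers, entries):
    """Return (status, headers, entries_to_send).

    `entries` is a newest-first list of dicts, each with a unique "id".
    """
    current_etag = '"%s"' % entries[0]["id"] if entries else '""'
    seen = request_headers.get("If-None-Match")
    if seen == current_etag:
        return 304, {"ETag": current_etag}, []  # client is already up to date

    # Only send a delta when the client asked for the "feed" manipulation.
    a_im = request_headers.get("A-IM", "")
    wants_delta = "feed" in [token.strip() for token in a_im.split(",")]
    if wants_delta and seen:
        ids = ['"%s"' % entry["id"] for entry in entries]
        if seen in ids:
            newer = entries[:ids.index(seen)]  # everything above the last seen entry
            # 226 IM Used tells the client this is a delta, not the full feed.
            return 226, {"ETag": current_etag, "IM": "feed"}, newer

    # No A-IM header, or an ETag we cannot place: send the full feed.
    return 200, {"ETag": current_etag}, entries
```

Clients that never send A-IM always get the full feed here, which is the behavior point (3) asks for.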
What do you think ?
The last "skipDays" should be "skipHours"