A hot topic at the moment is the bandwidth usage by news aggregators. According to my own site statistics, aggregator hits on my various RSS feeds now outweigh browser hits on my HTML pages by a factor of 2 to 1. My bandwidth usage tops 250MB/day on a slow day. My free bandwidth limit is 3GB/month. You do the math.
I have already trimmed my HTML templates as much as I can without sacrificing structural quality. I have reduced my main RSS feed to 5 items. I specify in my main RSS feed that it should only be read every 3 hours, and secondary feeds only once a day. My server is configured to return both ETag and Last-Modified headers for all my RSS feeds. The missing link is to get news aggregators to actually support these options.
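To make the server half of that concrete, here is a minimal sketch of the decision a server makes once it sends ETag and Last-Modified. It is not any particular server's implementation. The names `make_etag` and `respond` are hypothetical, the ETag is assumed to be a hash of the feed body, and the comparison is deliberately simplified (one tag, no weak validators, no `*`).

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    # Assumption: derive the ETag from a hash of the feed body.
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(request_headers: dict, body: bytes, modified: datetime):
    """Return (status, headers, payload) for a feed request.

    `modified` must be a timezone-aware UTC datetime.
    """
    etag = make_etag(body)
    validators = {
        "ETag": etag,
        "Last-Modified": format_datetime(modified, usegmt=True),
    }

    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    unchanged = False
    if if_none_match is not None:
        # Simplification: exact match against a single tag.
        unchanged = if_none_match.strip() == etag
    elif if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            # HTTP dates have one-second resolution.
            unchanged = modified.replace(microsecond=0) <= since
        except (TypeError, ValueError):
            # Unparseable or naive date: fall back to a full response.
            unchanged = False

    if unchanged:
        return 304, validators, b""  # headers only, no feed body
    return 200, validators, body
```

An aggregator that echoes the validators back gets the empty 304 reply. One that ignores them gets the whole feed every time, which is the missing link described above.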
Most of the RSS subscribers are whacking me every hour, which is actually costing me cash money in excess bandwidth charges.
You would save on bandwidth if RSS aggregators made conditional rather than unconditional GET requests.
What do we really want, instead? Smart aggregators. What a smart aggregator would do isn’t quite clear to me, though.
It is worth noting that there is an existing solution that significantly reduces bandwidth without affecting latency or content.
HTTP allows you to say to a server in a single query: “If this document has changed since I last looked at it, give me the new version. If it hasn’t, just tell me it hasn’t changed and give me nothing.” This mechanism is called “Conditional GET”, and it would turn 90% of those significant 24,000-byte queries into really trivial 200-byte queries.
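For the aggregator side, a rough sketch of a conditional GET using only the Python standard library might look like the following. The function names `conditional_headers` and `fetch_feed` are hypothetical, and the sketch assumes the aggregator has stored the ETag and Last-Modified values from its previous fetch of the same feed.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers that turn a plain GET into a conditional GET."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: keep the cached copy and the old validators.
            return None, etag, last_modified
        raise
```

On the first fetch the aggregator passes no validators and saves the two header values it gets back. On every later poll it passes them in, and a `None` body means there is nothing new to download or parse.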
For those of you under the impression that all that is is all that has ever been, here are some enlightening links from the height of the dot-com boom:
But Push from the big guys is just the beginning. Microsoft, PointCast, America Online and more than 20 other companies recently announced support for the Channel Definition Format (CDF), which has the potential to turn every Web site into a Push publisher. The CDF would be an agreed-upon format on Web pages which would actually create an automated Pull from clients.
After the application has been installed, PointCast has another impact. Users configure the content they’d like to see, and the PointCast client just starts downloading it from the PointCast server on the Internet. Network managers noticed this, too. Many immediately had corporate Internet service provider (ISP) links pegged at 100 percent saturation.
Another serious problem with using push clients is that they require a great deal of bandwidth because they are continuously sending large volumes of information to each user’s desktop. In an office where too many PCs are online and using continuous information delivery services, other users sending email or downloading files may experience slower response due to network traffic jams. The makers of push clients have been alerted to this problem and future versions of software promise to reduce their impact on the system by staggering information delivery.
Everything old is new again.
§
© 2001–present Mark Pilgrim