dive into mark


Tuesday, October 8, 2002

In praise of evolvable formats

Clay Shirky: In Praise of Evolvable Systems. This entire article could be rewritten to explain RSS. In fact, let’s do that.

If it were April Fool’s Day, the Net’s only official holiday, and you wanted to design a novelty format to slip by the W3C as a joke, it might look something like RSS 0.9x/2.0:

RSS 0.9x and 2.0 are the Whoopee Cushion and Joy Buzzer of syndication formats. For anyone who has tried to accomplish anything serious with metadata, it’s pretty obvious that of the various implementations of a worldwide syndication format, we have the worst one possible.

Except, of course, for all the others.

The problem with that list of RSS deficiencies is that it is also a list of necessities — RSS has flourished in a way that no other syndication format has, not despite many of these qualities but because of them. The very weaknesses that make RSS so infuriating to serious practitioners also make it possible in the first place.

Furthermore, its almost babyish XML syntax, so far from any serious computational framework (Where are the namespaces? Where is the Document Type Definition? Why is the aggregators’ enforcement of conformity so lax?), made it possible for anyone wanting an RSS feed to write one. The effects of this ease of implementation only become clear when you compare it to the attempts over the years to do RSS right, most notably RSS 1.0 in the year 2000. RSS 1.0 had three main benefits:

  1. Backward compatible with RSS 0.90, which was never widely deployed, and which fell into obscurity as soon as (the much simpler) RSS 0.91 was introduced.
  2. Based on RDF (specifically a serialization called RDF/XML), a spec which, at the time and to this day, continues to change or threaten to change. Two years later, there are no major languages or development platforms that ship with parsers to consume RDF, although many (Perl, Python, .NET) have third-party RDF parsers in various states of development and conformance. (The release version is generally out of date; CVS access is recommended. You get the idea.) Meanwhile, RDF/XML production tools are so inconsistent that even RDF experts recommend not using RDF tools to produce an RSS 1.0 feed if you want it to actually be read by any major RSS aggregator. Despite the two-year-old promise of better tools, it is now the year 2002, and I built my RSS 1.0 feed — in the most sophisticated personal publishing system in the world — by manually typing a mishmash of template tags and angle brackets into a TEXTAREA of an HTML form.
  3. Extensible through namespaces, which, as mentioned above, have been haphazardly and poorly incorporated into RSS 2.0, where they appear to be flourishing.

Evolvable formats — those that proceed by being adapted and extended in a thousand small ways — have three main characteristics that are germane to their eventual victories over strong, centrally designed formats.

  1. Only solutions that produce partial results with imperfect tools can succeed. My RSS feed is an XML document produced by a template that I built in a TEXTAREA, and consumed by hundreds of parsers around the world that know nothing of XML and hack apart my feed with regular expressions. The world is littered with formats that would have worked if only everyone had better tools. If everyone in the world had a perfect RDF parser at their disposal, it would be trivial to produce and consume all the world’s metadata in RDF. Without such perfect tools, both production and consumption instantly become nightmares. There is no middle ground.
  2. What is, is wrong. Because evolvable formats have always been adapted to earlier conditions and are always being further adapted to present conditions, they are always behind the times. RSS was being stretched with long descriptions, optional titles, and entity-encoded HTML even before such practices were codified in the spec, and long before all consumers could handle them. No evolving format is ever perfectly in sync with the challenges it faces.
  3. Finally, Orgel’s Rule, named for the evolutionary biologist Leslie Orgel — Evolution is cleverer than you are. As with the list of RSS’s obvious deficiencies above, it is easy to point out what is wrong with any evolvable system at any point in its life. No one seeing RSS 1.0 and RSS 0.91 side-by-side could doubt that RSS 1.0 had the superior technology, that it did things right. However, the ability to understand what is missing at any given moment does not mean that one person or a small central group can design a better system in the long haul.
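
To make the first point concrete, here is a minimal sketch in Python of both halves of that arrangement: a feed produced by filling in a text template, and a consumer that knows nothing of XML and pulls items out with regular expressions. Everything named here (`build_feed`, `parse_feed`, the templates, the example.org URLs) is hypothetical and invented for illustration. It is not the template from my publishing system, and it is not the code of any real aggregator. The feed is only roughly RSS 2.0-shaped and has not been validated against any spec.

```python
import html
import re

# Hypothetical templates: the feed is assembled as plain text, the way a
# template typed into a TEXTAREA would do it, without any XML library.
ITEM_TEMPLATE = """  <item>
    <title>{title}</title>
    <link>{link}</link>
    <description>{description}</description>
  </item>"""

FEED_TEMPLATE = """<?xml version="1.0"?>
<rss version="2.0">
<channel>
  <title>{title}</title>
  <link>{link}</link>
  <description>{description}</description>
{items}
</channel>
</rss>"""


def build_feed(title, link, description, posts):
    """Fill in the templates. html.escape turns any HTML in a field into
    entity-encoded text (&lt;p&gt; and so on)."""
    items = "\n".join(
        ITEM_TEMPLATE.format(
            title=html.escape(post["title"]),
            link=html.escape(post["link"]),
            description=html.escape(post["description"]),
        )
        for post in posts
    )
    return FEED_TEMPLATE.format(
        title=html.escape(title),
        link=html.escape(link),
        description=html.escape(description),
        items=items,
    )


# The consumer side: no XML parser, just regular expressions.
ITEM_RE = re.compile(r"<item>(.*?)</item>", re.DOTALL)


def field(chunk, name):
    """Return the decoded text of the first <name> element in chunk,
    or None if the element is missing (titles are sometimes omitted)."""
    match = re.search(r"<{0}>(.*?)</{0}>".format(name), chunk, re.DOTALL)
    return html.unescape(match.group(1).strip()) if match else None


def parse_feed(text):
    """Hack the feed apart into a list of dicts, one per <item>."""
    return [
        {name: field(chunk, name) for name in ("title", "link", "description")}
        for chunk in ITEM_RE.findall(text)
    ]
```

Calling `parse_feed(build_feed(...))` hands back the same titles, links, and descriptions that went in, and an item with no title simply yields `None` for that field. The sketch breaks on anything it was not written for: attributes on the tags, CDATA sections, namespaced elements. That fragility is the point. It produces partial results with imperfect tools, and it took a few minutes to write.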

Designed formats start out strong and improve logarithmically. Evolvable formats start out weak and improve exponentially. RSS 2.0 is not the perfect syndication format, just the best one that’s also currently practical. Infrastructure built on evolvable formats will always be partially incomplete, partially wrong and ultimately better designed than its competition.

© 2001-8 Mark Pilgrim