In private email, in response to my response to Matt Bridges, Dave asks: “What format do you want it in?”

There is no perfect format; it depends on your goals. RSS is good for syndication; OPML for outliners. When I write my book, I write in DocBook XML, as do the technical writers at Sun, KDE, FreeBSD, and the Linux Documentation Project. Each for its purpose. RSS is transient (new content republished in the same place) and is only read by automated aggregators. DocBook is elaborate, detailed, complicated, designed for people for whom (and projects for which) such details matter. DocBook is used as input to a common set of scripts and stylesheets that have been developed and honed over years; it is the uber-format, the single source from which all other formats flow. HTML, PDF, WinHelp, plain text, Word, PostScript: all generated from my (anyone’s) DocBook document.

Weblogs are different. Weblogs are made to be linked to. Every paragraph, every thought, every notion has a single, published, permanent URL in the weblog’s archives. That same thought may be published in other formats as well, and that’s fine; each format for its purpose. But those alternate formats are not what people link to, so they’re not what people read today; they’re not what search engines see people linking to, so they’re not what people will find and read tomorrow. What people link to and read is whatever is at that single, published, permanent URL. Therefore, that content (not the content hidden away in a backend database or CMS or XML document, but the permanent, published, linked content) should be as accessible as possible, to as many types of people and programs as possible, for as long as possible.

My solution? Well-structured semantic XHTML. It’s a floor wax and a dessert topping. It’s XML (easily machine-readable, opens the door to XSLT — I’ve done this and will write more about it later). It’s also HTML, or close enough that it displays in every browser in the world (even Netscape 4). And it’s well-structured (degrades by itself, all the way down to Lynx) and semantic (my tags really mean things: h1, h2, h3, p, blockquote, em, code — Google cares). Add CSS, and modern browsers render it beautifully (important today, less so tomorrow, irrelevant in the long run). It’s not a perfect solution (too much variation, not enough detail for the kind of semantic processing that DocBook allows) but given the requirement of a single, public, published, permanent URL for everyone and everything, I think it’s the best compromise.
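
To make "easily machine-readable" concrete, here is a minimal sketch in Python, not the XSLT mentioned above. It assumes the page is well-formed XHTML in the standard XHTML namespace and uses no undefined entities (a strict XML parser rejects those). The sample entry and the helper names `outline` and `count_tags` are invented for illustration; they are not taken from any real weblog or tool.

```python
import xml.etree.ElementTree as ET

# Standard XHTML namespace; elements parsed from XHTML carry it in their tag.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical weblog entry, invented purely for this illustration.
SAMPLE_ENTRY = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample entry</title></head>
  <body>
    <h1>Sample entry</h1>
    <p>Weblogs are made to be <em>linked to</em>.</p>
    <h2>Formats</h2>
    <blockquote><p>Each format for its purpose.</p></blockquote>
    <p>Use <code>h1</code>, <code>h2</code> and <code>p</code>.</p>
  </body>
</html>"""


def outline(xhtml_text):
    """Return (tag, text) pairs for every h1/h2/h3, in document order."""
    root = ET.fromstring(xhtml_text)
    headings = []
    for element in root.iter():
        # Tags look like "{namespace}h1"; keep only the local name.
        local_name = element.tag.split("}", 1)[-1]
        if local_name in ("h1", "h2", "h3"):
            text = "".join(element.itertext()).strip()
            headings.append((local_name, text))
    return headings


def count_tags(xhtml_text, local_name):
    """Count elements with the given local name in the XHTML namespace."""
    root = ET.fromstring(xhtml_text)
    return len(root.findall(".//{%s}%s" % (XHTML_NS, local_name)))


if __name__ == "__main__":
    for tag, text in outline(SAMPLE_ENTRY):
        print(tag, text)
    print("code elements:", count_tags(SAMPLE_ENTRY, "code"))
```

Because the tags mean something, a generic XML parser can recover the outline of the page without any HTML-specific guesswork; the same markup still displays in a browser unchanged.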


© 2001-8 Mark Pilgrim