HTML 5, XHTML 2, and the Future of the Web is making the rounds. It’s an excellent summary for people who haven’t been paying attention. Along the same lines, here’s a presentation I gave in December 2005 to a bunch of Firefox developers at Mozilla Corporation headquarters: White lights lead to red lights. The “future of the web” part comes in about halfway through.
(Yes, I am aware these slides violate all the rules of Presentation Zen. I wrote them on the plane. Hence the title.)
The circumstance surrounding this speaking engagement is itself a funny story, one which I may share someday. For now, let’s just say that I’ve been keeping this presentation under my hat for a long time, and now I don’t have to.
I shall now proceed to play Super Paper Mario all weekend.
§
After reading that article (no, I haven’t been paying attention) I find myself looking at my XHTML tags (served as text/html of course, why would I risk complete parsing malfunction?) and thinking, “Why did I do this?”
I think the answer lies somewhere between:
“It’s newer, therefore it has to be an improvement.”
and
“XML will obviously save the web.”
After reading over the working draft for HTML5, albeit not from cover to cover, but almost, I was still unclear why it makes sense to move away from a traditional SGML derivative. But now that it’s sunk in a bit, that HTML5 vs. XHTML2.0 article really hits home.
Of course it makes sense to say, “Hey, everybody’s doing this. Why do we have to pretend… that they’re not?”
— Ryan
Where can I find the “How to write a novel in 30 days” and “How to write a resume”? I tried googling for them with “mark pilgrim” attached, but to no avail.
— Manuel
My “why move away from SGML?” reason is the way that every time I have to explain to someone that their Mozilla bug is invalid because HTML is actually an SGML application, with bizarre underlying rules they’ve never seen, I finish up by saying “if you want to see the actual spec that I’ve been told says that, you can buy a copy for 230 Swiss francs.”
I find myself looking at my XHTML tags […] and thinking, “Why did I do this?”
There are two kinds of fool. One says, “This is old, and therefore good.” And one says “This is new, and therefore better.” —John Brunner, in The Shockwave Rider
Your ideas about expertise and teaching are very interesting, and reflect my own beliefs. At work, I’ve often had to help my co-workers with technology that I only learned last week. Heck, I am planning to teach a Ruby on Rails seminar at work in the summer, and I don’t even *know* RoR yet.
I don’t know if I can afford to be an expert in things anymore. Consider the web app stack: HTML, CSS, Javascript on the front end, PHP/JSP/.net/Ruby/Python/Perl in the middle, and some type of database in the back. You can spend your time mastering everything, or you can just get something done already.
My mantra is starting to become, “If I’m writing code, something has probably gone wrong.” This isn’t totally true, but it’s true-ish and certainly gets a conversation going. Integration is becoming a much more important skill than knowing the entire PHP API or becoming a CSS expert.
— Trent
Sorry, folks. If HTML5 is not well-formed XML, I don’t see how it could catch on. The entire argument is invalid. IE doesn’t support XHTML, so let’s tweak the web to suit IE. Yeah, right. Rather, I’d tweak the web so that it uses semantic markup and consists of well-formed XML so that:
1. It’d be less of a pain in the ass to process HTML in computers (and there’s no denying that semantic, well formed XML would be easier to create and process)
2. Make VALIDATION much easier. I don’t see a problem if your page blows up if you forget to close a tag. Just close the god damn tag and move on with your life. You’ll later be thankful for it.
I also find it ironic that the article about HTML5 is presented in XHTML 1.0
DMB, from the article:
Very little content on the web is valid HTML 4.01; most of it is invalid and ill-formed, but browsers still have to parse it
[...]
A document with an XML well formedness error will only display details of the error, but no content. On pages where some of the content is out of the control of XML tools with well-designed handling of different character encodings—where users may comment or post, or where content may come from the outside in the form of trackbacks, ad services, or widgets, for example—there’s always a risk of a well-formedness error. Tag-soup parsing browsers will do their best to display a page, in spite of any errors, but when XML parsing any error, no matter how small, may render your page completely useless.
And you say:
I don’t see a problem if your page blows up if you forget to close a tag. Just close the god damn tag and move on with your life. You’ll later be thankful for it.
You may not “see a problem” with it, but imagine how many people would care if the page they served looked like this:
XML Parsing Error:
Programmers look at the web as a programming platform, and as such are quick to point out that in a programming language a syntax error needs to be corrected or all is lost. But not everyone adding content to the web is a programmer, and not everyone adding content to the web should have to be.
Incidentally that page is being served as XHTML1.0, but with a MIME type of text/html. It doesn’t pass validation. Presumably had it been served as application/xhtml+xml none of us would even be reading it. Is that OK?
— Ryan
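The difference in error handling argued over in this exchange can be sketched with Python's standard library. This is only an illustration under some assumptions: `xml.etree.ElementTree` stands in for a strict XML parser and `html.parser.HTMLParser` for a forgiving one, neither is what a browser actually runs, and the sample markup, `TextCollector`, `strict_parse`, and `lenient_parse` are names made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical comment markup with a missing </b> end tag.
BROKEN = "<p>Hello <b>world</p>"


class TextCollector(HTMLParser):
    """Forgiving parser: collects whatever text it finds."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strict_parse(markup):
    """Return the document text, or None if the markup is not well-formed."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # One well-formedness error and the reader gets no content at all.
        return None
    return "".join(root.itertext())


def lenient_parse(markup):
    """Return whatever text the forgiving parser can recover."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


strict_result = strict_parse(BROKEN)
lenient_result = lenient_parse(BROKEN)
print(strict_result)
print(lenient_result)
```

The strict path gives up on the whole document because of one missing end tag, while the forgiving path still recovers the text a reader came for.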
I don’t care if it passes validation as long as it’s a well formed, semantic piece of XML. If your page blows up, it’s your damn fault, just like if your program/cgi script blows up because you forgot to handle an error condition. If anything, the Web should be gravitating towards more structure, not less!
Thanks for the link to ‘Presentation Zen’. In return, here is a link to ‘Beyond Bullets’, which you may find useful (the best stuff IMO is in the 2005 archives).
And FYA, you may enjoy this presentation: ‘Taming Yesterday’s Nightmares for a Better Tomorrow’.
DMB: I dare you to create a site that provides user-generated content that maintains XML well-formedness. I tried to do this and my users got so irritated at their inability to style their content simply enough (“OMG!! y am i getting so many errors?”) that I ended up dropping the idea and serving it as text/html again.
The serialization format of the document doesn’t make one iota of difference to either the coder or the end-user. It only matters to the user agent, and as long as the parsing rules are sane, it still doesn’t matter that much.
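One way a site might try to keep user-generated content well-formed is to check each submission before publishing it and fall back to plain text when the check fails. The sketch below is an assumption about how that could look, not a description of what any commenter here actually built; `publishable_fragment` is a hypothetical helper, and it checks well-formedness only, not whether the markup is safe or valid.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def publishable_fragment(user_markup):
    """Return markup that can be embedded in a well-formed XML page."""
    try:
        # Wrap the submission in a dummy element so that fragments with
        # several top-level nodes, or with bare text, can still be parsed.
        ET.fromstring("<div>" + user_markup + "</div>")
    except ET.ParseError:
        # Not well-formed: publish the submission as literal text instead.
        return escape(user_markup)
    return user_markup


print(publishable_fragment("<b>bold</b> text"))
print(publishable_fragment("<b>oops"))
print(publishable_fragment("fish & chips"))
```

The fallback keeps the page parseable, but it also shows the irritation described above: a user who forgets one end tag sees raw angle brackets instead of styled text.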
Charles: That doesn’t make HTML5 any less insane. Just because someone can’t learn to close a freaking tag doesn’t mean the idea of strict XML well formedness is wrong or worthless. Serialization format makes a heck of a lot of difference if you want to do anything with your content after it’s created. Say, if three years from now someone comes up with a search engine that could understand your page better, it’d help if this search engine didn’t choke on the mess an unscrupulous designer pooped out. A good safeguard against this is well formedness and strict enforcement of it to the point of not rendering the pages at all if they’re broken. If you receive a broken Word document, you don’t complain when Word refuses to open it.
HTML5 just happens to have an HTML syntax that auto-corrects. HTML5 can also be written in XML. Both generate almost identical DOM trees which is where all the structure, semantics and such are that you care about. They’re not in the syntax.
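The claim that the two syntaxes end up as the same tree can be illustrated with a toy. The sketch below is emphatically not the HTML5 parsing algorithm: `ToyTreeBuilder` and `tree_from_html` are hypothetical names, and the only "auto-correction" implemented is closing elements that were left open. It uses Python's standard `html.parser` and `xml.etree.ElementTree` modules and the same made-up sample markup in both syntaxes.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class ToyTreeBuilder(HTMLParser):
    """Builds an element tree and closes any elements left open."""

    def __init__(self):
        super().__init__()
        self.builder = ET.TreeBuilder()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.builder.start(tag, dict(attrs))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: ignore it
        # Close everything opened since the matching start tag.
        while True:
            closed = self.open_tags.pop()
            self.builder.end(closed)
            if closed == tag:
                break

    def handle_data(self, data):
        self.builder.data(data)

    def finish(self):
        self.close()  # flush any text the parser is still buffering
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def tree_from_html(markup):
    parser = ToyTreeBuilder()
    parser.feed(markup)
    return parser.finish()


# The same content, once as sloppy HTML and once as well-formed XML.
html_tree = tree_from_html("<p>Hello <b>world</p>")
xml_tree = ET.fromstring("<p>Hello <b>world</b></p>")

same = ET.tostring(html_tree, encoding="unicode") == ET.tostring(
    xml_tree, encoding="unicode"
)
print(same)
```

Once both inputs are trees, nothing downstream can tell which syntax they arrived in, which is the sense in which the structure lives in the tree rather than in the serialization.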
I’m no HTML/XML expert (at least not in this crowd…) but isn’t the problem that while HTML5 (or 4…) offloads the very hard work of parsing tag soup onto 5 or 10 teams of very serious web browser developers, XHTML 1 (or 2…) offloads the moderately hard work of well-formedness onto millions upon millions of webmasters/myspacers/bloggers/hand-coding hobbyists?
The problem for the people who would benefit most from XML is that the HTML beneficiaries outnumber you by a factor of some millions to one, and they’re the XML-loving people’s CLIENT BASE.
So they probably won’t stand for it.
§
© 2001–9 Mark Pilgrim