The Feed Validator has been updated to version 1.11. In a futile attempt to placate the conspiracy theorists, we are now publishing complete changelogs. The most important bug we fixed in this release was one that caused the validator not to notice when selected required elements in Atom feeds were missing.

Also, my ultra-liberal feed parser has been updated to version 2.5. Major new feature in this release: better handling of various HTTP statuses, and exposing the HTTP status code and real URL (after any redirection) of feeds retrieved over HTTP. The parser can now pass all of the aggregator client HTTP tests.
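
For readers wondering what "exposing the HTTP status code and real URL" amounts to, here is a minimal sketch using only Python's standard library. It is not the parser's own code or API: `fetch_feed` and its `opener` parameter are hypothetical names chosen for illustration. Note that `urllib` follows redirects on its own, so the status reported here is that of the final response.

```python
import urllib.error
import urllib.request


def fetch_feed(url, opener=urllib.request.urlopen):
    """Fetch a feed; report the HTTP status and the URL that was actually served.

    `opener` is a hypothetical injection point so the function can be
    exercised without touching the network.
    """
    try:
        with opener(url) as response:
            return {
                "status": response.status,  # status of the final response
                "url": response.url,        # differs from `url` after a redirect
                "data": response.read(),
            }
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report them the same way.
        return {"status": err.code, "url": err.filename or url, "data": b""}
```

An aggregator could compare the returned `url` with the one it requested to notice that a feed has moved, and treat a status such as 410 as a signal to stop polling.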

Also, with the consent of all contributors, I have changed the license of the ultra-liberal feed parser from the GNU GPL to the Python license. The Python license is GPL-compatible, so if you were using and distributing the parser as part of a GPL program, you can continue to do so. However, you may now also use it in other programs, including closed-source programs.

Update: fixed another few bugs with the feed parser, including an important one with decoding inline XML, which can be used in Atom feeds, and some RSS feeds.

§

Eleven comments here (latest comments)

  1. You might want to check your revision history; it’s a bit messed up because of HTML.

    — sparticus #

  2. Yeah, the first rev had invalid HTML (missing end quote on an attribute). That’s what I fixed.

    — Mark #

  3. Some time ago we sent you some patches for a previous version of the liberal parser.

    This included code for getting the encoding of the source document, which makes it easier to handle feeds that use Unicode characters.

    It also included a fix for a bug in the ‘inchannel’ logic. When an ‘image’ or ‘textInput’ element is present, the parser returns the title, description and link of these elements instead of the title and link of the channel.

    Running the parser on http://stuff.vandervossen.net/test/tweakers2.rss will give an example of this behaviour.

    We would be quite willing to update these fixes to the new version of the parser. The last two times we sent you patches we received no response. Can you let me know whether you are interested?

    — Thijs van der Vossen #

  4. Cool. Good to know Validates. ;-)

    — Paul Michael Smith #

  5. bah typo. Should be – Good to know my feed validates.

    — Paul Michael Smith #

  6. Thijs: I have a record of several emails on or around April 27 asking about textInput, but I have no record of any discussion of character encoding, nor any patch. Please resend and I will incorporate it.

    — Mark #

  7. Radio Free Blogistan (trackback)

  8. Sorry, forgot to mention that the patches were sent by one of my employees, manfred or m.stienstra [at] fngtps [dot] com. But never mind, we’ll add them to V2.5 sometime tomorrow.

    — Thijs van der Vossen #

  9. To my understanding, one of the goals for the new syndication format is to get away from all the sloppy, invalid and therefore hard-to-parse RSS feeds we have now.

    Adding support for this new format to a _liberal_ parser seems a bit weird to me.

    Shouldn’t parsers for the new format be required to just give up when the feed turns out to be invalid, like XML processors are required to?

    — Thijs van der Vossen #

  10. I don’t give a damn what XML processors are supposed to do. All parsers should be as liberal as possible. There are no exceptions to Postel’s law.

    Similarly, all publishers and validators should be as strict as possible. I’m actively working on both ends.

    That said, I have been toying with a 3.0 rewrite that would use a real XML parser if available and possible, then silently set a flag and fall back on the existing code if XML parsing fails.

    — Mark #

  11. Ok. Fair enough.

    We’re doing just that for http://www.syndicatie.nl and it works like a charm. A little over 90% of the feeds we track can be parsed using a proper XML parser.

    — Thijs van der Vossen #
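
The strict-first approach discussed in comments 10 and 11 (try a real XML parser, then set a flag and fall back to liberal parsing) might look roughly like the sketch below. This is an illustration only, not the code of the feed parser or of syndicatie.nl: `parse_feed_title` and the `well_formed` flag are hypothetical names, and a regular expression stands in for the much more capable liberal parser.

```python
import re
import xml.etree.ElementTree as ET


def parse_feed_title(data):
    """Return (title, well_formed) for a feed given as a string.

    Tries a strict XML parse first; if the document is not well-formed,
    falls back to a loose regex scan and reports well_formed=False.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # Liberal fallback: take the first thing that looks like a title.
        match = re.search(r"<title[^>]*>(.*?)</title>", data,
                          re.IGNORECASE | re.DOTALL)
        return (match.group(1).strip() if match else None), False

    # Strict path: first <title> in document order, namespaced or not.
    for elem in root.iter():
        if elem.tag == "title" or elem.tag.endswith("}title"):
            return (elem.text or "").strip(), True
    return None, True
```

The caller always gets a result when one can be recovered, and the flag records whether the feed had to be rescued, which is what lets a publisher or validator still be told that something is wrong.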

§

© 2001–present Mark Pilgrim