I am working on a major upgrade to my feed parser, and now is as good a time as any for a public beta release.
Download Universal Feed Parser 3.0 beta 15 (2004-02-11).
Changes from 2.x:
If XML parsing fails, it will set the bozo bit in the result, and store the first XML parsing error in bozo_exception. Then it will automatically fall back to the 2.x-style parser based on regular expressions. Some people seem to be laboring under the misapprehensions that (a) well-formedness is an indication of data quality, and (b) the client is the correct place to enforce data quality. It has been my experience that well-formedness is not a strong predictor of data quality; most of the feeds that fail to validate are well-formed crap, and most of the rest would be valid if not for a single transient well-formedness error. But whatever, if you’re the sort of person who insists on punishing your own users for the mistakes of others, you may now do so by checking the bozo bit.
Run feedparsertest.py to test the feed parser on your system. It’s been tested under Windows, Mac OS X, and Debian Linux, on Python 2.1, 2.2, and 2.3. Previous versions were not well-tested on Python 2.1, much to the dismay of those running Debian stable. This version was tested very well, a process which shook out a surprising number of obscure bugs that probably never affected you.
title, tagline, summary, info, and copyright. Also support for base64-encoded binary data.
Renamed to Universal Feed Parser, to emphasize the parser’s content normalization features and de-emphasize its “parse at all costs” nature.
It would be nice if 3.0-final had better documentation, especially on the content normalization features, so you could easily see which data from which feed types and versions ends up where.
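For the curious, here is a rough sketch of the bozo-bit idea described above, using only the Python standard library. It is not the feed parser's own code. The function name parse_titles and the "titles" key are hypothetical, made up for illustration; only the names bozo and bozo_exception come from the notes above. It tries a real XML parser first, and if the feed is not well-formed it records the error and falls back to a regular expression.

```python
import re
import xml.sax


class TitleHandler(xml.sax.ContentHandler):
    """Collect the text of every <title> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.buffer = []
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.buffer = []

    def characters(self, content):
        if self.in_title:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.in_title = False
            self.titles.append("".join(self.buffer).strip())


def parse_titles(data):
    """Hypothetical helper: strict XML parse first, regex fallback second."""
    result = {"titles": [], "bozo": 0}
    handler = TitleHandler()
    try:
        xml.sax.parseString(data.encode("utf-8"), handler)
        result["titles"] = handler.titles
    except xml.sax.SAXParseException as exc:
        # Not well-formed: set the bozo bit, keep the first parsing error,
        # and fall back to a forgiving regular-expression scan.
        result["bozo"] = 1
        result["bozo_exception"] = exc
        found = re.findall(r"<title>(.*?)</title>", data, re.S)
        result["titles"] = [t.strip() for t in found]
    return result


good = parse_titles("<rss><channel><title>Good feed</title></channel></rss>")
bad = parse_titles("<rss><channel><title>Bad feed</title></rss>")
print(good["bozo"], good["titles"])
print(bad["bozo"], bad["titles"])
```

A client that wants to be strict can refuse any result where bozo is set; a client that would rather show its users something can ignore the bit and use whatever the fallback recovered.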
§
© 2001–9 Mark Pilgrim