3.0 beta 22 of my Universal Feed Parser is out. This release fixes all known bugs, and I hope it will be the last beta before 3.0 final. After all, this is getting a bit ridiculous.

This release makes a significant change: if XML parsing fails due to character encoding problems, the parser will attempt to auto-determine the character encoding and re-parse with a real XML parser. This is noted in the results as results['bozo'] = 1 and results['bozo_exception'] = feedparser.CharacterEncodingOverride. results['encoding'] will contain the encoding that was actually used to parse the feed (not the originally declared encoding).
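
As a rough illustration of how calling code might check for this, here is a minimal sketch. It works on a plain dictionary with the keys named above, so it runs without the parser installed. The helper name encoding_was_overridden and the stand-in exception class are hypothetical, made up for this example; with the real library you would presumably pass feedparser.CharacterEncodingOverride as the second argument instead.

```python
class FakeEncodingOverride(Exception):
    """Hypothetical stand-in for feedparser.CharacterEncodingOverride."""


def encoding_was_overridden(results, override_type=FakeEncodingOverride):
    """Return True if the results say the feed was re-parsed with a guessed encoding."""
    if not results.get('bozo'):
        return False
    exc = results.get('bozo_exception')
    # Accept either the exception class itself or an instance of it.
    if isinstance(exc, type):
        return issubclass(exc, override_type)
    return isinstance(exc, override_type)


# Hand-built dictionary shaped like the results described above (not real parser output).
sample = {
    'bozo': 1,
    'bozo_exception': FakeEncodingOverride,
    'encoding': 'iso-8859-1',
}

if encoding_was_overridden(sample):
    print('re-parsed as', sample['encoding'])
```

The check accepts both a class and an instance because the text above does not say which form the parser stores.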

This release makes another significant change: Unicode support for ill-formed feeds. All individual data values will be returned as Unicode strings if they can be converted using the document’s character encoding. I had a flash of insight and suddenly the entirety of Python’s Unicode support became clear to me. I coded madly for several hours until it faded. It’s entirely possible that that’s just the LSD talking, but thanks to the magic of open source, everyone can now share in my good trip.
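
The "if they can be converted" rule can be sketched roughly as follows. This is not the parser's actual code, only an illustration of the idea in Python 3, where str is the Unicode string type; the function name to_unicode is hypothetical.

```python
def to_unicode(value, encoding):
    """Decode a byte string with the given encoding, or return it unchanged on failure."""
    if isinstance(value, str):
        # Already a Unicode string; nothing to do.
        return value
    try:
        return value.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Bad bytes for this encoding, or an encoding name Python does not know.
        return value


print(repr(to_unicode(b'caf\xe9', 'iso-8859-1')))
print(repr(to_unicode(b'caf\xe9', 'utf-8')))
```

The first call decodes cleanly; the second leaves the bytes untouched because they are not valid UTF-8.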

This release also makes significant changes to internal classes. If you were subclassing or accessing these classes, your code will likely break. If you were just using the public parse() function, you will not notice any change.

My change reporting has been lax throughout the 3.0 beta process, so I went back and recreated the history from file timestamps, comments, and judicious use of diff. Full user documentation is coming next.

3.0b3 – 1/23/2004 – MAP
3.0b4 – 1/26/2004 – MAP
3.0b5 – 1/26/2004 – MAP
3.0b6 – 1/27/2004 – MAP
3.0b7 – 1/28/2004 – MAP
3.0b8 – 1/28/2004 – MAP
3.0b9 – 1/29/2004 – MAP
3.0b10 – 1/31/2004 – MAP
3.0b11 – 2/2/2004 – MAP
3.0b12 – 2/6/2004 – MAP
3.0b13 – 2/8/2004 – MAP
3.0b14 – 2/8/2004 – MAP
3.0b15 – 2/11/2004 – MAP
3.0b16 – 2/12/2004 – MAP
3.0b17 – 2/13/2004 – MAP
3.0b18 – 2/17/2004 – MAP
3.0b19 – 3/15/2004 – MAP
3.0b20 – 4/7/2004 – MAP
3.0b21 – 4/14/2004 – MAP
3.0b22 – 4/19/2004 – MAP

© 2001–9 Mark Pilgrim