Dive Into Python 3 was commissioned in January 2009 by Apress, who published the original Dive Into Python in 2004. Upon agreeing to contract terms, I registered a ten-year lease on diveintopython3.org and immediately published a draft table of contents.
The original DiP was written in DocBook XML. As I’ve mentioned before, I chose DocBook XML because I wanted to learn XML and XSL, and DocBook seemed to be Just The Thing for technical documentation. There was also a bit of self-grandeur involved. I was writing a book For The Ages, so it was important that it be in a Format Of Forever. And in the short term, I could transform The Format Of Forever into useful (but lowly) Output Formats, so I could do unimportant things like publish it online.
For The Ages turned out to be about 10 years. The Format Of Forever is still going strong, but Python itself changed so quickly that it didn’t matter.
Oh, and there was one other little thing that happened between 2000 and 2009: search stopped sucking and took over the web. Kids today may not remember, but it used to be hard to find stuff on the web. Once you found it, you wanted to download it so you could read it offline.
Remember being “offline”?
Anyway, I now realize that there were some hidden assumptions behind my design decisions in 2000. Some of those assumptions turned out to be wrong, or at least not-completely-right. Sure, a lot of people downloaded DiP, but it still pales in comparison to the number of visitors I got from search traffic. In 2000, I fretted about my “home page” and my “navigation aids.” Nobody cares about any of that anymore, and I have nine years of access logs to prove it.
So, I am writing DiP3 in pure HTML and, modulo some lossless minimizations, publishing exactly what I write. This makes the proofreading feedback cycle faster — instead of “building” the HTML output, I just hit Ctrl-R. I expected it to make some things more complicated, but they turn out not to matter very much.
Some examples:
Furthermore, I am no longer under the illusion that this book will be useful forever. Python will either continue to evolve or it will die; either way, static documentation has a shelf life. Today’s cutting edge code is tomorrow’s mainstream code is next year’s legacy code. DiP’s shelf life was about 10 years. I am supremely confident that the HTML I’m writing today will still be readable 10 years from now, and after that it won’t matter because I’ll have to rewrite the whole damn book anyway.
See you in 2020 for Dive Into Python 4!
§
For curiosity’s sake: how well does HTML as The Format work for the paper version?
How does publishing into PDF / print work if your source is HTML? Do those PDF2HTML things actually work? I thought all prints used LaTeX, which isn’t so bad from one of my encounters, unless you like diagrams and tables.
AFAIK, publishing to paper is about the same as it was in 2004. I take my HTML and import it into MS Word, where the copyeditor complains that my Word document doesn’t use the proper styles. I backport all the changes I care about to my original HTML (previously DocBook XML) on my own time. In the final proofing phases, everything is done in PDF with Adobe’s editing tools. Again, any changes I care to backport are backported.
In this regard, DocBook vs. HTML makes no difference whatsoever. Nobody else in the “biz” uses either.
— Mark
> Nobody else in the “biz” uses either.
It depends upon your publisher. At the very least, Manning Publishing allows authors to use DocBook (which is what I’m using for my current book).
With Prince XML and a bit of additional CSS the HTML could easily work as The Format for the paper version too.
Check out Pandoc, which converts surprisingly well between LaTeX (print), HTML (web), Markdown (source).
Heh. When I did my book (Practical Common Lisp) for Apress, I used a home-brew Markdown-like scheme which I then used to generate PDFs for my own pen-and-paper editing, HTML to put on the web, and RTF which I submitted to Apress. I got a comment about how my Word documents were the cleanest they had ever seen. Which, of course, was because all my styles had been unerringly applied by software, rather than a human being. I’m curious how you edit your HTML–some WYSIWYG editor or a text editor? I think I tried using raw HTML for a while but editing it in Emacs was too annoying.
Mark, looking forward to this one. Fwiw, buying “free” books tends to be worth every penny.
“The Format Of The Now.”
Missed that the first time. Suggestion: The Format Of The Long Now.
Mmm, I’m not sure “Nobody else in the “biz” uses either.” is a very accurate statement. Also announced today: http://bitbucket.org/bos/hgbook/src/ Which is all in DocBook and is used to generate http://hgbook.red-bean.com/read/ and allows for public comments. The DocBook is also used to generate the PDF that will go directly to the printer. Some publishers may still be using Word for everything and doing layout by hand. Not all of them are.
> I’m curious how you edit your HTML
On Windows, Emacs + http://www.martyn.se/code/emacs/darkroom-mode/
On Linux, Emacs-GTK + Alt-F11 (fullscreen mode)
— Mark
That’s a creative use of XML Entities. Extra points if you can come up with an actual use case for XML External Entities…. good luck.
God this just sounds horrid to me. The Format of Now is “wiki text”, and LaTeX. HTML *is* an output format. The thing about wiki-style text markups (asciidoc, markdown, dokuwiki, wikimedia, etc, etc, holy crap there are a lot of them), is that they allow you to just write *text*. LaTeX is the king of this because if you just use it as simple text, you get a nice looking print document, but if you need something more complicated it can do it. Honestly though, if I’m just writing text, it’s easier to use markdown and convert to LaTeX or HTML depending on needs. If it’s print-centric, use LaTeX from the get-go. The more transparent the markup language, the better. HTML is terrible for this with all its brackets and tags.
I prefer Markdown to HTML for editing these days; it doesn’t get in the way (much) when I need to stick random HTML in there, and it’s more readable in source format. Jottit is a convenient way to see your Markdown changes rendered to HTML as you edit.
Flying Saucer is a fairly competent HTML-to-PDF renderer that is shaping up to be a good free alternative to Prince. It’s not 100% there yet, and e.g. loading non-default fonts is a bit of a pain, but it’s not bad. Beats OpenOffice Writer, which isn’t terrible either.
HTML5 – the format of the soon.
That’s some great-looking HTML indeed (both the markup and the page design).
I have met the nice people from http://en.flossmanuals.net/ and they arrived at the same concept, and I agree: use HTML for the book. They also have a modified wiki engine, but with the content in HTML format. What was amazing to me was that they could generate a book in 2 days (in a sprint), and the output is what you see there. I have already recommended it to the Firebird project.
— mariuz
Pandoc’s extended/enhanced/whatever Markdown format is the best I’ve found after long years of trying different doc markup formats. After discovering it, and finding out how reliable pandoc is, I now use it for pretty much everything. Does very nice conversion to HTML and LaTeX (for pdf).
Any reason why you didn’t want to do it like the Django book did?
http://www.djangobook.com/en/2.0/
I kind of think their feedback / annotation system was a pretty good way to make the book better with each iteration…
— bex
Can someone explain why plain text or HTML-formatted text would ever be converted to PDF?
The whole point of PDF was to allow the conversion of less portable documents to more portable documents.
But that doesn’t mean that PDF is the MOST portable. It’s not. Plain ASCII text is simpler and more portable than PDF, and yet I am constantly barraged by PDF files that contain nothing other than a page or two of plain ASCII text. Further, in the above discussion, converting HTML to PDF is discussed (!). Again, PDF is very portable, but HTML is more portable.
Please do convert your MS Word and graphical layouts and (insert less portable, proprietary format here) files to PDF. Please stop converting plain text and HTML to PDF.
Nobody anywhere has an operating environment that can read PDF, but cannot read plain text or HTML.
§
© 2001–9 Mark Pilgrim