Cool Tools: pdf2html by Derek Willis.
From the post:
A PDF does one thing very well: it presents an accurate image that can be viewed on just about any device. Unfortunately, PDFs also cause grief for anyone who wants to use the data they contain. Governments, in particular, have a habit of releasing PDFs when the information would be more useful and accessible as a spreadsheet. The tools for extracting text from PDFs can be flaky, but Lu Wang’s pdf2htmlEX project solves this problem. Pdf2htmlEX takes PDFs and converts them into HTML5 documents while preserving the layout and appearance of the original.
This looks very cool!
Of course, moving from HTML5 to data you can actually use is left as an exercise for the reader.
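For a head start on that exercise, here is a minimal sketch of pulling the visible text back out of an HTML5 file using only Python's standard library. It assumes nothing about how pdf2htmlEX structures its output; the `TextCollector` class, the `extract_text` function, and the sample markup are my own illustration, not part of the tool.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect visible text fragments, skipping <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.fragments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering an element whose contents are not visible text.
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.fragments.append(text)


def extract_text(html):
    """Return the visible text fragments of an HTML string, in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.fragments


if __name__ == "__main__":
    # Hypothetical markup, made up for illustration only.
    sample = (
        "<html><head><style>.t{color:red}</style></head><body>"
        '<div class="t">Name</div>'
        '<div class="t">Total <span>42</span></div>'
        "<script>var x = 1;</script></body></html>"
    )
    print(extract_text(sample))
```

This only gets you a flat list of text fragments. Rebuilding rows and columns from the positioned elements in a converted page is the harder part, and it will depend on the document.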