Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 10, 2013

Cool Tools: pdf2html

Filed under: HTML5,PDF — Patrick Durusau @ 5:37 pm

Cool Tools: pdf2html by Derek Willis.

From the post:

A PDF does one thing very well: it presents an accurate image that can be viewed on just about any device. Unfortunately, PDFs also cause grief for anyone who wants to use the data they contain. Governments, in particular, have a habit of releasing PDFs when the information would be more useful and accessible as a spreadsheet. The tools for extracting text from PDFs can be flaky, but Lu Wang’s pdf2htmlEX project solves this problem. Pdf2htmlEX takes PDFs and converts them into HTML5 documents while preserving the layout and appearance of the original.

This looks very cool!

Of course, moving from HTML5 to usable data is left as an exercise for the reader. 😉
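For anyone who wants a head start on that exercise, here is a minimal sketch in Python, using only the standard library's html.parser. It assumes nothing about how pdf2htmlEX structures its output beyond it being ordinary HTML: it collects the visible text and skips anything inside script or style elements. The names TextExtractor and extract_text are mine, for illustration only, and not part of pdf2htmlEX.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> content.

    Hypothetical helper for illustration; not part of pdf2htmlEX.
    """

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []       # visible text, in document order
        self._skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text fragments found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.chunks


if __name__ == "__main__":
    import sys

    # Usage: python extract_text.py converted.html
    with open(sys.argv[1], encoding="utf-8") as fh:
        for chunk in extract_text(fh.read()):
            print(chunk)
```

This only recovers the text. Turning that text back into rows and columns, which is what you want from a government PDF that should have been a spreadsheet, depends on the layout of the particular document and remains the hard part.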
