Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

August 31, 2013

Working with PDFs…

Filed under: Linux OS,PDF — Patrick Durusau @ 3:43 pm

Working with PDFs Using Command Line Tools in Linux by William J. Turkel.

From the post:

We have already seen that the default assumption in Linux and UNIX is that everything is a file, ideally one that consists of human- and machine-readable text. As a result, we have a very wide variety of powerful tools for manipulating and analyzing text files. So it makes sense to try to convert our sources into text files whenever possible. In the previous post we used optical character recognition (OCR) to convert pictures of text into text files. Here we will use command line tools to extract text, images, page images and full pages from Adobe Acrobat PDF files.

A great post if you are working with PDF files.
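
As a rough illustration of the kind of extraction the quoted passage describes, here is a minimal sketch. It is not taken from Turkel's post. It assumes the poppler-utils package (pdftotext, pdfimages, pdftoppm) is installed, and the function name `extract_pdf` and the output file layout are hypothetical choices made for this example.

```shell
#!/bin/sh
# Sketch only: pull text, embedded images, and page images out of a PDF.
# Assumption: poppler-utils provides pdftotext, pdfimages, and pdftoppm.
# The function name and output names are hypothetical, not from the post.

extract_pdf() {
    pdf="$1"
    out="${2:-extracted}"

    # Refuse to continue if the input file does not exist.
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi

    # Check that each required tool is on the PATH.
    for tool in pdftotext pdfimages pdftoppm; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 2
        fi
    done

    mkdir -p "$out" || return 3

    # Plain text, preserving the physical page layout where possible.
    pdftotext -layout "$pdf" "$out/text.txt"

    # Images embedded in the PDF, written as PNG files named img-NNN.png.
    pdfimages -png "$pdf" "$out/img"

    # Every page rendered to a PNG page image at 150 dpi.
    pdftoppm -png -r 150 "$pdf" "$out/page"
}
```

After sourcing the file, a call such as `extract_pdf paper.pdf out` (with `paper.pdf` standing in for your own file) would leave the text, embedded images, and page images side by side in `out`, ready for the usual text tools.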
