Working with PDFs…

Working with PDFs Using Command Line Tools in Linux by William J. Turkel.

From the post:

We have already seen that the default assumption in Linux and UNIX is that everything is a file, ideally one that consists of human- and machine-readable text. As a result, we have a very wide variety of powerful tools for manipulating and analyzing text files. So it makes sense to try to convert our sources into text files whenever possible. In the previous post we used optical character recognition (OCR) to convert pictures of text into text files. Here we will use command line tools to extract text, images, page images and full pages from Adobe Acrobat PDF files.

A great post if you are working with PDF files.

Comments are closed.