This is a tip sent by WebUpd8 reader Stone Cut, on extracting images and text from PDF files. It’s different from his previous tip and useful for other cases.
First, install the necessary utilities.
On Ubuntu or Debian:
sudo apt-get install poppler-utils
On Fedora:
sudo yum install poppler-utils
For other Linux distributions, search for poppler-utils in your package manager.
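This command extracts all the images from “pdffile.pdf” and puts them in the /home/<username>/pdfimages/ directory, which must already exist. pdfimages treats its last argument as a filename prefix, so with a trailing slash the files are named -000.jpg, -001.jpg and so on. The -j option writes images that are stored as JPEG in the PDF as .jpg files; other images are saved as .ppm or .pbm: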
pdfimages -j pdffile.pdf ~/pdfimages/
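The image extraction step above can be wrapped so that the output directory is created first and the files get a readable prefix. This is only a minimal sketch: extract_pdf_images is a hypothetical helper name (not part of poppler-utils), and the "image" prefix and the default ~/pdfimages directory are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: extract the images of a PDF into a directory, creating it if needed.
# extract_pdf_images is a hypothetical helper name, not a poppler-utils command.
extract_pdf_images() {
    epi_pdf=$1
    # Default output directory is an assumption; pass a second argument to override.
    epi_outdir=${2:-"$HOME/pdfimages"}

    # Refuse to continue (and create nothing) if the input PDF is missing.
    if [ ! -f "$epi_pdf" ]; then
        echo "extract_pdf_images: no such file: $epi_pdf" >&2
        return 1
    fi

    mkdir -p "$epi_outdir" || return 1

    # pdfimages uses its last argument as a filename prefix,
    # so "$epi_outdir/image" produces image-000.jpg, image-001.jpg, ...
    pdfimages -j "$epi_pdf" "$epi_outdir/image"
}
```

After sourcing this, running extract_pdf_images pdffile.pdf would write files such as ~/pdfimages/image-000.jpg, assuming pdffile.pdf contains at least one JPEG-encoded image.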
Please note that text extraction with pdftotext, which is also part of poppler-utils, only extracts real text. If your PDF contains images with text printed on them, this won’t work. For these sorts of files, please refer to my older tip: How To Extract All Text From PDFs (Including Text In Images).
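For the text side of the tip, here is a minimal sketch that assumes the pdftotext tool shipped with poppler-utils. The helper name extract_pdf_text and the default output name (the PDF's name with .txt in place of .pdf) are assumptions made for illustration, not something the original tip specifies.

```shell
#!/bin/sh
# Sketch: extract the real (non-image) text of a PDF into a plain text file.
# extract_pdf_text is a hypothetical helper name, not a poppler-utils command.
extract_pdf_text() {
    ept_pdf=$1
    # Default output: same name as the PDF with a .txt extension (an assumption).
    ept_out=${2:-"${ept_pdf%.pdf}.txt"}

    # Refuse to continue if the input PDF is missing.
    if [ ! -f "$ept_pdf" ]; then
        echo "extract_pdf_text: no such file: $ept_pdf" >&2
        return 1
    fi

    # -layout asks pdftotext to keep the original physical layout where it can.
    pdftotext -layout "$ept_pdf" "$ept_out"
}
```

After sourcing this, extract_pdf_text pdffile.pdf would write pdffile.txt next to the PDF. Scanned pages come out empty or nearly empty, because their text exists only as pixels inside images.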
This post was written by WebUpd8 reader Stone Cut (thank you very much!).