Introduction There are several ways to mine tables and other content from a pdf, using R. After a lot of trial & error, here’s how I managed to extract global exam results from an international, massive, yearly examination, the EDAIC.
This is my first use case of “pdf mining” with R, and also a fairly simple one. However, more complex and very fine examples of this can be found elsewhere, using both pdftools and tabulizer packages.
I am about to go on a short holiday, so I was tidying the code lines I had scattered around before leaving… And I found this: a minimal EPS to PDF converter, which is barely a LaTeX template.
It is intended for transforming an .EPS graph to the .PDF format. You can copy & paste this whole code into a blank text file (but with .TEX extension) and run it with a TeX editor.