Introduction

There are several ways to mine tables and other content from a PDF using R. After a lot of trial and error, here’s how I managed to extract global exam results from the EDAIC, a massive international examination held yearly. This is my first use case of “PDF mining” with R, and a fairly simple one. More complex and very fine examples can be found elsewhere, using both the pdftools and tabulizer packages.

aurora-mareviv

Anesthesiologist, MD, postdoc. Utter Rstats geek

Universidade de Santiago de Compostela

Spain