PDFDocument vs. SimplePDFViewer

pdfreader provides two different interfaces for PDFs: PDFDocument and SimplePDFViewer.

What is the difference?

PDFDocument:

  • knows nothing about interpretation of content-level PDF operators

  • knows all about PDF file and document structure (types, objects, indirect objects, references, etc.)

  • can be used to access any document object: XRef table, DocumentCatalog, page tree nodes (aka Pages), binary streams like Font, CMap, Form, Page etc.

  • can be used to access raw object content (a raw page content stream, for example)

  • has no graphical state

SimplePDFViewer:

  • uses PDFDocument as its document navigation engine

  • can render document content, decoding it properly and interpreting PDF operators

  • has graphical state
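
To make "interpreting PDF operators" and "graphical state" a bit more concrete, here is a toy sketch in plain Python. It is not pdfreader code and does not use the pdfreader API: RAW_STREAM, TOKEN and interpret are hypothetical names made up for this illustration, and only a handful of text operators (BT, ET, Tf, Td, Tj) are handled. The raw bytes are what a structure-level interface would hand you; the interpreter loop is roughly the kind of work a viewer adds on top.

```python
import re

# A tiny hand-written content stream (hypothetical sample data).
# BT/ET open and close a text object, Tf selects font and size,
# Td moves to the start of the next line, Tj shows a string.
RAW_STREAM = b"BT /F1 12 Tf 72 700 Td (Hello) Tj 0 -14 Td (World) Tj ET"

# Either a literal string in parentheses or a run of non-space characters.
TOKEN = re.compile(rb"\((?:[^()\\]|\\.)*\)|[^\s()]+")


def interpret(stream):
    """Walk the operators while tracking a minimal graphical state."""
    state = {"font": None, "size": None, "x": 0.0, "y": 0.0}
    operands, shown = [], []
    for tok in TOKEN.findall(stream):
        if tok == b"BT":
            # A new text object starts at the origin.
            state["x"] = state["y"] = 0.0
        elif tok == b"Tf":
            state["font"] = operands[0].decode()
            state["size"] = float(operands[1])
        elif tok == b"Td":
            # Offsets are relative to the start of the current line.
            state["x"] += float(operands[0])
            state["y"] += float(operands[1])
        elif tok == b"Tj":
            text = operands[0][1:-1].decode("latin-1")
            shown.append((text, state["font"], state["size"],
                          state["x"], state["y"]))
        elif tok != b"ET":
            # Not an operator we know: keep it as an operand.
            operands.append(tok)
            continue
        operands.clear()
    return shown


print(RAW_STREAM)             # raw view: just bytes, no meaning attached
print(interpret(RAW_STREAM))  # interpreted view: text plus the state it was shown in
```

Without the state carried from one operator to the next, the second string would have neither a font nor a position; that is the part a structure-only interface deliberately leaves out.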

Use PDFDocument to navigate a document and access raw data.

Use SimplePDFViewer to extract content you see in your favorite viewer (Adobe Acrobat Reader, hehe :-).

Let’s see several use cases.