PLAYA 0.1.2: Initial release

@dhdaines released this 20 Nov 05:20

Here's a first release, in case you want to use this. Reasons you might do so include:

  • Faster than pdfminer.six (by about 20%)
  • Much friendlier APIs than PDFPageAggregator, PDFResourceManager, PDFPage, etc.
  • Many outstanding pdfminer.six bugs have been fixed

Why would you not want to use this?

  • PyPI package name is not actually playa because somebody else took that name 13 years ago.
  • May be more or less tolerant of broken PDFs than pdfminer.six, and has no "strict mode" to be absolutely intolerant.
  • Doesn't let you extract image data. This is not always useful anyway, since PDFs tend to use compositing; if you want to extract images reliably, use a real PDF renderer like pypdfium2.
  • Is not (or ain't) a layout analyzer, so no LAParams, TextBox, and so on.
  • API subject to change and refinement.
  • Does not have abstractions. You do not have the flexibility to subclass everything and build a PDF renderer on top of PLAYA.
  • Probably contains bugs.
  • Definitely lacks documentation.