Releases · dhdaines/playa

PLAYA-PDF 0.2.9: Final (really) 0.2 release

12 Feb 13:40

dhdaines

What's Changed

fix: Support the all-important empty name object
feat!: Break the CLI again (ZeroVer YOLO) to better support page ranges
feat: Support some limited and lossy text extraction in the CLI
feat: Add necessary .doc property to page list
fix: Correct type annotations for page list

Full Changelog: v0.2.8...v0.2.9

PLAYA-PDF 0.2.8: Much Improved Parallelism (M. I. P.)

22 Jan 13:45

dhdaines

What's Changed

fix: accept None for max_workers by @dhdaines in #49
Avoid using (non-serializable) weak references in worker processes by @dhdaines in #50
Enable resolution of indirect object references when using worker processes by @dhdaines in #51
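
The weak-reference item above comes down to a standard-library constraint: arguments sent to worker processes are pickled, and weak references cannot be pickled. The sketch below only demonstrates that constraint. `Document`, `can_pickle`, and the integer key are hypothetical names for illustration and are not taken from PLAYA's code.

```python
import pickle
import weakref


class Document:
    """Hypothetical stand-in for a parsed document (not PLAYA's class)."""


def can_pickle(obj: object) -> bool:
    """Return True if obj survives pickle.dumps, as worker arguments must."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


doc = Document()
doc_ref = weakref.ref(doc)  # weak reference back to the owning document
doc_key = id(doc)           # plain integer key (an assumed alternative)

print("weak reference picklable:", can_pickle(doc_ref))
print("integer key picklable:", can_pickle(doc_key))
```

One plausible design, stated here as an assumption, is to pass a picklable key to each worker and look the document up on the worker side instead of carrying a weak reference across the process boundary.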

Full Changelog: v0.2.7...v0.2.8

Contributors

dhdaines

PLAYA-PDF 0.2.7: Definitive 0.2.x release

07 Jan 17:21

dhdaines

What's Changed

Remove most uses of typing.cast by @dhdaines in #37
Optimize text placement (some dare call it "rendering") by @dhdaines in #38
Fix font size and rotated/skewed bounding boxes by @dhdaines in #39
fix: deprecate layout in CLI right away and do other useful stuff by @dhdaines in #40
Correctly implement ToUnicode according to the PDF standard and not that bogus technical note (that the PDF standard refers to...) by @dhdaines in #41
feat: support slices and tuples in page list by @dhdaines in #42
Optimize text extraction a bit more by @dhdaines in #43
Make text less Lazy 😥 by @dhdaines in #47
Treat marked content sections (more) correctly
fix: recognize junk before header and compensate (fixes: #46) by @dhdaines in #48
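
For the last item, a minimal sketch of how junk before the header could be detected. It assumes the common reader convention of searching the first 1024 bytes for the `%PDF-` marker. `find_header_offset` and its `window` parameter are hypothetical names, and how PLAYA actually compensates is not stated in these notes.

```python
def find_header_offset(data: bytes, window: int = 1024) -> int:
    """Return the byte offset of the '%PDF-' marker within the first `window` bytes.

    Raises ValueError when no marker is found in that window.
    """
    pos = data.find(b"%PDF-", 0, window)
    if pos == -1:
        raise ValueError("no PDF header found in the first %d bytes" % window)
    return pos


# Six bytes of junk (four letters plus CRLF) precede the header here.
sample = b"junk\r\n%PDF-1.7\n1 0 obj\n"
offset = find_header_offset(sample)
print(offset)
```

A reader that finds a nonzero offset would typically treat the file offsets recorded in the cross-reference table as relative to the header, which is an assumption about what "compensate" means here.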

Full Changelog: v0.2.6...v0.2.7

Contributors

dhdaines

PLAYA-PDF 0.2.6: New year, new acronym

30 Dec 18:30

dhdaines

What's Changed

ci: test on windows and mac by @dhdaines in #33
Support parallel operations over pages by @dhdaines in #36
Partially correct the handling of some types of CMaps (not fully correct though)

Full Changelog: v0.2.5...v0.2.6

Contributors

dhdaines

PLAYA-PDF 0.2.5: Bug fixes and improvements

15 Dec 18:08

dhdaines

What's Changed

Fix various bugs in the lazy API
- Add specialized __len__ methods to ContentObject classes
- Clarify iteration over ContentObject
Fix installation of playa-pdf[crypto]
Fix attribute classes in structure tree elements
Deprecate "user" device space to avoid confusion with user space
Parse embedded CMaps (mostly)
Update pdfplumber support
Add parser for object streams and iterator over all indirect objects in a document

Full Changelog: v0.2.4...v0.2.5

v0.2.4

03 Dec 04:08

dhdaines

What's Changed

Add (and fix) 3rd party test suites, primarily pdf.js by @dhdaines in #26
Try much harder to read even very broken PDFs
Try somewhat harder to not produce empty TextObject (still a work in progress)

Full Changelog: v0.2.3...v0.2.4

Contributors

dhdaines

PLAYA-PDF 0.2.3: Release early and often (before vacation)

28 Nov 22:09

dhdaines

What's Changed

Require a newline before EI to fix various inline images by @dhdaines in #25
Refactoring the CMap parser missed a very important corner case (which somehow mypy did not flag?)
structtree property did not actually exist on Document and Page (oops!)
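
The inline image fix above addresses a real ambiguity: binary image data can itself contain the bytes `EI`, so a bare search for the operator can end the image too early. The sketch below shows one way to require a line break before `EI`. The regular expression and `split_inline_image` are assumptions for illustration, not PLAYA's parser.

```python
import re

# 'EI' counts as the end-of-image operator only when a CR or LF precedes it
# and whitespace (or the end of the buffer) follows it.
END_IMAGE = re.compile(rb"[\r\n]EI(?=\s|$)")


def split_inline_image(buf: bytes) -> tuple[bytes, bytes]:
    """Split the bytes following an ID operator into (image data, remainder)."""
    match = END_IMAGE.search(buf)
    if match is None:
        raise ValueError("no EI operator found")
    return buf[:match.start()], buf[match.end():]


# The first 'EI' is part of the image data because no line break precedes it.
image, rest = split_inline_image(b"\x00EI\x01\x02\nEI Q")
print(image, rest)
```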

Full Changelog: v0.2.2...v0.2.3

Contributors

dhdaines

PLAYA-PDF 0.2.2: Make it go fast again

28 Nov 03:59

dhdaines

What's Changed

Resolve filters before checking if it isn't a list by @dhdaines in #22
Verify that we don't have pdfminer.six#1059 (and warn about it) by @dhdaines in #23
Optimize cmaps by @dhdaines in #24

Full Changelog: v0.2.1...v0.2.2

Contributors

dhdaines

PLAYA-PDF 0.2.1: Fix some bugs

27 Nov 04:01

dhdaines

What's Changed

Fix the RLE implementation by @dhdaines in #19 (originally pdfminer/pdfminer.six#1055 by @helpmefindaname)
Report the actual device space bounding box for rotated text by @dhdaines in #20
Prevent endless looping on bogus stream length and other EOFs by @dhdaines in #21
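
For context on the RLE fix, here is a minimal sketch of the RunLengthDecode filter as the PDF specification describes it: a length byte of 0 to 127 copies the next length + 1 bytes literally, 129 to 255 repeats the next byte 257 - length times, and 128 marks end of data. `rle_decode` is a hypothetical name and this is not the code from #19. The bounds checks also show one way a decoder can stop cleanly on truncated input instead of looping.

```python
def rle_decode(data: bytes) -> bytes:
    """Decode a PDF RunLengthDecode stream, stopping at EOD or end of input."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        length = data[i]
        i += 1
        if length == 128:
            # End-of-data marker.
            break
        if length < 128:
            # Literal run: copy the next length + 1 bytes as they are.
            out += data[i:i + length + 1]
            i += length + 1
        else:
            # Repeat run: the next byte is repeated 257 - length times.
            if i >= n:
                break  # truncated stream, nothing left to repeat
            out += bytes([data[i]]) * (257 - length)
            i += 1
    return bytes(out)


# Literal run of 3 bytes, then a repeat run of 3 bytes, then EOD.
print(rle_decode(bytes([2, 65, 66, 67, 254, 68, 128])))
```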

Full Changelog: v0.2...v0.2.1

Contributors

dhdaines and helpmefindaname

PLAYA-PDF 0.2: Break all the APIs

26 Nov 03:36

dhdaines

What's Changed

Support TIFF predictor on image streams by @dhdaines in #18 (originally from pdfminer/pdfminer.six#1058 by @helpmefindaname)
Support different "device spaces" (screen, page, and default user space)
Expose form XObjects on Page to allow getting only their contents
Expose form XObject IDs in LayoutDict
Make TextState conform to PDF spec (leading and line matrix) and document it
Expose more of TextState in LayoutDict (render mode in particular)
Do not try to map characters with no ToUnicode and no Encoding
Properly support Pattern color space (uncolored tiling patterns) the way pdfplumber expects it to work
Support marked content points as ContentObjects
Document ContentObjects
Make a proper schema for LayoutDict, document it, and communicate it to Polars
Separate color values and patterns in LayoutDict
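
To illustrate the TIFF predictor item at the top of this list, here is a minimal sketch of undoing Predictor 2 (horizontal differencing) on decoded stream data. It assumes 8 bits per component, so each sample is one byte, and the function and parameter names are hypothetical rather than taken from #18.

```python
def undo_tiff_predictor(data: bytes, columns: int, colors: int = 1) -> bytes:
    """Reverse TIFF Predictor 2 for 8-bit samples.

    Each byte was stored as the difference from the same color component of
    the pixel to its left, so decoding is a running sum along each row.
    """
    row_bytes = columns * colors
    out = bytearray(data)
    for start in range(0, len(out), row_bytes):
        end = min(start + row_bytes, len(out))
        # The first pixel of each row is stored unchanged.
        for i in range(start + colors, end):
            out[i] = (out[i] + out[i - colors]) & 0xFF
    return bytes(out)


# Two rows of three single-component pixels; 255 acts as -1 modulo 256.
decoded = undo_tiff_predictor(bytes([10, 1, 1, 20, 255, 2]), columns=3)
print(list(decoded))
```

Other bit depths need samples unpacked before summing, which this sketch deliberately leaves out.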

Full Changelog: v0.1.2...v0.2

Contributors

dhdaines and helpmefindaname
