Error "Stream has ended unexpectedly" on getDocumentInfo with certain pdf file(s) #1288
Comments
@exiledkingcc / @MartinThoma The object's get_data() returns an empty string, without any error or warning, because the decompression fails. Your opinions and help are welcome.
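To illustrate the failure mode described in this comment, here is a minimal sketch of how a decompression error can turn into a silent empty result. The function names `strict_inflate` and `lenient_inflate` are hypothetical and are not PyPDF2 APIs; the sketch only assumes that the stream is Flate (zlib) compressed and that the bytes handed to the decoder are garbage, as they would be after a failed decryption.

```python
import zlib


def strict_inflate(data: bytes) -> bytes:
    """Decompress and let zlib.error propagate to the caller."""
    return zlib.decompress(data)


def lenient_inflate(data: bytes) -> bytes:
    """Decompress, but swallow zlib errors (hypothetical lenient decoder).

    The caller cannot tell a genuinely empty stream from a failed one,
    which is the silent behaviour described in the comment above.
    """
    try:
        return zlib.decompress(data)
    except zlib.error:
        return b""


# Bytes that are not a valid zlib stream, e.g. wrongly decrypted data.
garbage = b"\xff\xff\xff\xff not a zlib stream"

print(lenient_inflate(garbage))  # empty bytes, no error or warning
try:
    strict_inflate(garbage)
except zlib.error as exc:
    print("strict decoder reports:", exc)
```

A strict decoder surfaces the problem immediately, while a lenient one hides it until something downstream, such as metadata parsing, trips over the empty data.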
I have made a pull request for this.
The specification says:

"To understand the algorithm below, it is necessary to treat the O and U strings in the Encrypt dictionary as made up of three sections. The first 32 bytes are a hash value (explained below). The next 8 bytes are called the Validation Salt. The final 8 bytes are called the Key Salt."

So /U and /O should each be 48 bytes long, but in the PDF file that triggers #1288, /O is 127 bytes long. The extra bytes are all zeros.

Fixes #1288
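The three-section layout quoted from the specification can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `PasswordEntry` and `split_password_entry` are hypothetical names, not PyPDF2 APIs, and dropping everything after the first 48 bytes is one plausible way to tolerate the zero padding, not necessarily what the pull request does.

```python
from typing import NamedTuple

ENTRY_LEN = 48  # 32-byte hash + 8-byte Validation Salt + 8-byte Key Salt


class PasswordEntry(NamedTuple):
    hash_value: bytes
    validation_salt: bytes
    key_salt: bytes


def split_password_entry(raw: bytes) -> PasswordEntry:
    """Split an /O or /U string into its three sections.

    Assumption: bytes beyond the first 48 are padding and can be ignored.
    """
    if len(raw) < ENTRY_LEN:
        raise ValueError(f"expected at least {ENTRY_LEN} bytes, got {len(raw)}")
    entry = raw[:ENTRY_LEN]  # drop any trailing padding
    return PasswordEntry(entry[:32], entry[32:40], entry[40:48])


# A made-up 127-byte /O value: 48 meaningful bytes followed by zeros.
padded = bytes(range(48)) + b"\x00" * 79
sections = split_password_entry(padded)
print(len(sections.hash_value), len(sections.validation_salt), len(sections.key_salt))
```

With the padding removed, the salts are read from offsets 32 to 40 and 40 to 48 regardless of how long the stored string is.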
The fix is in. Thank you, everybody 🙏
I'm the guy from here. I followed the call and am still having issues with an encrypted PDF. I'm trying to extract metadata from this file. The advantage over PyPDF3 is that PyPDF2 can extract the cover from the problematic files without any problem.
The file can be opened with a "normal" PDF reader application, and at least some of the metadata is visible there.
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
Linux-5.15.0-46-generic-x86_64-with-glibc2.35
$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.10.3

PyCryptodome 3.15.0 is also installed.
Code + PDF
(With the PDF mentioned below downloaded and renamed to encrypt.pdf.)
Share here the PDF file(s) that cause the issue. The smaller they are, the better. Let us know if we may add them to our tests!
https://cloud.3dissue.net/24308/24333/24567/65779/Position_4.21-211104-DE-web-20211203082446.pdf
I'm not the owner or creator of the PDF, so I recommend not using it in automated tests.
Traceback
This is the complete Traceback I see: