Handle encrypted PDFs other than "algorithm 1 or 2" #19

OzzieIsaacs · 2022-04-09T10:31:57Z

I'm using version 1.06 of PyPDF3.
Currently PyPDF3 can only handle PDFs encrypted with "algorithm 1 or 2". I would love to see the other algorithms supported as well. For my use case, decrypting the document info header would be sufficient.

An encrypted file can be downloaded from here:
https://cloud.3dissue.net/24308/24333/24567/65779/Position_4.21-211104-DE-web-20211203082446.pdf

The following code sample demonstrates the problem (with the above-mentioned PDF downloaded and renamed to encrypt.pdf):

from PyPDF3 import PdfFileReader
with open('encrypt.pdf', 'rb') as f:
    pdf_file = PdfFileReader(f)
    doc_info = pdf_file.getDocumentInfo()

This code throws the error:
PyPDF3.utils.PdfReadError: file has not been decrypted

Adding an additional decrypt statement like this:

from PyPDF3 import PdfFileReader
with open('encrypt.pdf', 'rb') as f:
    pdf_file = PdfFileReader(f)
    if pdf_file.isEncrypted:
        pdf_file.decrypt('')
    doc_info = pdf_file.getDocumentInfo()

leads to:
NotImplementedError: only algorithm code 1 and 2 are supported. This PDF uses code 5

For your reference the content of the relevant variables in this case:

I found "qpdf", which is able to handle the encryption of these files. The decryption algorithm of qpdf can be found in the file https://raw.githubusercontent.com/qpdf/qpdf/main/libqpdf/QPDF_encryption.cc
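As a stopgap until the library supports these algorithms, qpdf could be called from Python to write a decrypted copy that PyPDF3 can then read. The following is only a sketch: it assumes the `qpdf` executable is installed and on the PATH, and the helper names `build_qpdf_decrypt_command` and `decrypt_with_qpdf` as well as the output file name are made up for illustration.

```python
import subprocess


def build_qpdf_decrypt_command(src, dst, password=""):
    """Build the argument list for a `qpdf --decrypt` call (hypothetical helper)."""
    cmd = ["qpdf"]
    if password:
        # Only pass a password when one is given; the sample file uses an
        # empty user password.
        cmd.append("--password=" + password)
    cmd += ["--decrypt", src, dst]
    return cmd


def decrypt_with_qpdf(src, dst, password=""):
    """Write a decrypted copy of src to dst; assumes qpdf is installed."""
    # check=True raises CalledProcessError if qpdf exits with a non-zero status.
    subprocess.run(build_qpdf_decrypt_command(src, dst, password), check=True)
```

Calling `decrypt_with_qpdf('encrypt.pdf', 'decrypted.pdf')` and then opening decrypted.pdf with PdfFileReader as in the first sample should avoid the decrypt step entirely, at the cost of an external dependency and a temporary file.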

It would be great if somebody could pick up from here and implement the decryption in PyPDF3.


MartinThoma · 2022-06-30T22:59:31Z

@OzzieIsaacs PyPDF2 recently added support for modern decryption ;-)
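For anyone landing here, the equivalent of the snippet above with PyPDF2 might look roughly like the sketch below. It assumes a recent PyPDF2 release that provides `PdfReader`, `is_encrypted`, `decrypt` and `metadata`, and AES-encrypted files may additionally need PyCryptodome installed. The function name `read_document_info` and the `reader_factory` parameter are hypothetical; the latter only exists so the logic can be exercised without a real PDF.

```python
def read_document_info(path, password="", reader_factory=None):
    """Return the document info of a PDF as a plain dict of strings."""
    if reader_factory is None:
        # Imported lazily; assumes PyPDF2 is installed.
        from PyPDF2 import PdfReader
        reader_factory = PdfReader

    with open(path, "rb") as f:
        reader = reader_factory(f)
        if reader.is_encrypted:
            # An empty string is the user password of the sample file.
            reader.decrypt(password)
        meta = reader.metadata
        if meta is None:
            return {}
        # Copy the values while the file is still open.
        return {key: str(meta[key]) for key in meta}
```

With the sample file this would be called as `read_document_info('encrypt.pdf')`.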

OzzieIsaacs mentioned this issue Aug 27, 2022

Error "Stream has ended unexpectedly" on getDocumentInfo with certain pdf file(s) py-pdf/pypdf#1288

Closed
