Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: encode_pdfdocencoding() always returns bytes #2440

Merged

Conversation

sbourlon
Copy link
Contributor

@sbourlon sbourlon commented Feb 5, 2024

Issue: #2434

In the function encode_pdfdocencoding, cast its return value from bytearray to bytes to match its function signature.

This casting is necessary because bytearray is duck type compatible with bytes in mypy, however this library expects only bytes in its Encryption class.

@sbourlon
Copy link
Contributor Author

sbourlon commented Feb 5, 2024

I don't have any Windows machine to troubleshoot the test 🤷

@stefan6419846
Copy link
Collaborator

Thanks for the PR. Are you able to add a corresponding test here as well?

@sbourlon
Copy link
Contributor Author

sbourlon commented Feb 6, 2024

I can add a test function that verifies that encode_pdfdocencoding() always returns bytes.

Would it be enough?

@stefan6419846
Copy link
Collaborator

Yes, this should be sufficient in my opinion.

@sbourlon sbourlon force-pushed the fix-base-func-encode_pdfdocencoding branch from 095ac87 to e94f369 Compare February 6, 2024 19:05
Copy link

codecov bot commented Feb 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (3fb63f7) 94.43% compared to head (8299131) 94.43%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2440   +/-   ##
=======================================
  Coverage   94.43%   94.43%           
=======================================
  Files          49       49           
  Lines        8008     8008           
  Branches     1616     1616           
=======================================
  Hits         7562     7562           
  Misses        276      276           
  Partials      170      170           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

tests/test_generic.py Outdated Show resolved Hide resolved
@sbourlon sbourlon changed the title BUG: encode_pdfdocencoding() always returns bytes [WIP] BUG: encode_pdfdocencoding() always returns bytes Feb 7, 2024
@sbourlon sbourlon changed the title [WIP] BUG: encode_pdfdocencoding() always returns bytes BUG: encode_pdfdocencoding() always returns bytes Feb 7, 2024
@sbourlon sbourlon force-pushed the fix-base-func-encode_pdfdocencoding branch from 404c0f6 to e3f1680 Compare February 7, 2024 08:53
In the function encode_pdfdocencoding, cast its return value
from bytearray to bytes to match its function signature.

This casting is necessary because bytearray is duck type
compatible with bytes in mypy, however this library expects only
bytes in its Encryption class.
@sbourlon sbourlon force-pushed the fix-base-func-encode_pdfdocencoding branch from e3f1680 to 25d5c93 Compare February 7, 2024 08:57
@sbourlon
Copy link
Contributor Author

sbourlon commented Feb 7, 2024

I think we are good now.

BTW, I noticed that the tests test_iss1723 and test_iss1767 have the same test file name but with different URLs. Can this create a conflict? See https://github.com/py-pdf/pypdf/blob/main/tests/test_writer.py#L1405-L1423

This was referenced Feb 7, 2024
@stefan6419846
Copy link
Collaborator

The test file name in test_iss1767 seems to be wrong and should be fixed. Looking at the logs, it seems like the issue arises from the fact that the filenames are identical and thus it depends on which test we execute first and thus which file gets cached.

For this reason, we probably do not need the initial download, but it should be sufficient to just use different/correct target names.

@sbourlon
Copy link
Contributor Author

sbourlon commented Feb 7, 2024

What about naming the file test_iss1767.pdf to be consistent with the naming pattern of test_iss1723?

@stefan6419846
Copy link
Collaborator

This is what I would propose as well.

@sbourlon
Copy link
Contributor Author

sbourlon commented Feb 7, 2024

Sent a separate PR to fix the naming conflict: #2445

@sbourlon
Copy link
Contributor Author

sbourlon commented Feb 7, 2024

Is this PR good for approval?

tests/test_writer.py Outdated Show resolved Hide resolved
@sbourlon sbourlon force-pushed the fix-base-func-encode_pdfdocencoding branch from 25d5c93 to 8299131 Compare February 9, 2024 19:03
Copy link
Collaborator

@stefan6419846 stefan6419846 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@stefan6419846 stefan6419846 merged commit efe7790 into py-pdf:main Feb 20, 2024
15 checks passed
@MartinThoma
Copy link
Member

Nice! Thanks for taking care of it - and nice to see that this process works technically and in practice 🎉 Made my day :-)

@sbourlon
Copy link
Contributor Author

You are very welcome. Thank you for maintaining this library.

@stefan6419846
Copy link
Collaborator

@MartinThoma It seems like your message has been a little bit misleading, but yes, the recent changes with the write permissions have worked 👍

stefan6419846 added a commit that referenced this pull request Mar 3, 2024
## What's new

Generating name objects (`NameObject`) without a leading slash
is considered deprecated now. Previously, just a plain warning
would be logged, leading to possibly invalid PDF files. According
to our deprecation policy, this will log a *DeprecationWarning*
for now.

### New Features (ENH)
- Add get_pages_from_field  (#2494) by @pubpub-zz
- Add reattach_fields function (#2480) by @pubpub-zz
- Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz

### Bug Fixes (BUG)
- Missing error on name without leading / (#2387) by @Rak424
- encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon
- BI in text content identified as image tag (#2459) by @pubpub-zz

### Robustness (ROB)
- Missing basefont entry in type 3 font (#2469) by @pubpub-zz

### Documentation (DOC)
- Improve lossless compression example (#2488) by @j-t-1
- Amend robustness documentation (#2479) by @j-t-1

### Developer Experience (DEV)
- Fix changelog for UTF-8 characters (#2462) by @stefan6419846

### Maintenance (MAINT)
- Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz
- Remove user assignment for feature requests (#2483) by @stefan6419846
- Remove reference to old 2.0.0 branch (#2482) by @stefan6419846

### Testing (TST)
- Fix benchmark failures (#2481) by @stefan6419846
- Broken test due to expired test file URL (#2468) by @pubpub-zz
- Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon

[Full Changelog](4.0.2...4.1.0)
stefan6419846 added a commit that referenced this pull request Mar 3, 2024
## What's new

Generating name objects (`NameObject`) without a leading slash
is considered deprecated now. Previously, just a plain warning
would be logged, leading to possibly invalid PDF files. According
to our deprecation policy, this will log a *DeprecationWarning*
for now.

### New Features (ENH)
- Add get_pages_from_field  (#2494) by @pubpub-zz
- Add reattach_fields function (#2480) by @pubpub-zz
- Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz

### Bug Fixes (BUG)
- Missing error on name without leading / (#2387) by @Rak424
- encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon
- BI in text content identified as image tag (#2459) by @pubpub-zz

### Robustness (ROB)
- Missing basefont entry in type 3 font (#2469) by @pubpub-zz

### Documentation (DOC)
- Improve lossless compression example (#2488) by @j-t-1
- Amend robustness documentation (#2479) by @j-t-1

### Developer Experience (DEV)
- Fix changelog for UTF-8 characters (#2462) by @stefan6419846

### Maintenance (MAINT)
- Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz
- Remove user assignment for feature requests (#2483) by @stefan6419846
- Remove reference to old 2.0.0 branch (#2482) by @stefan6419846

### Testing (TST)
- Fix benchmark failures (#2481) by @stefan6419846
- Broken test due to expired test file URL (#2468) by @pubpub-zz
- Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon

[Full Changelog](4.0.2...4.1.0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants