Drop disallowed control characters and strip blank characters #179

Merged
benoit74 merged 1 commit into main from drop_control_characters
Jul 30, 2024

Conversation

benoit74
Collaborator

@benoit74 benoit74 commented Jul 10, 2024

Fix #159
Fix openzim/warc2zim#128

  • Automatically drop all control characters in metadata except \r, \n and \t
  • Automatically strip blank characters (space, \r, \n and \t) in metadata
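
The two rules above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `clean_metadata_value` is hypothetical, and treating "control character" as Unicode general category `Cc` is an assumption.

```python
import unicodedata

# Control characters that are deliberately kept (per the PR description).
ALLOWED_CONTROLS = {"\r", "\n", "\t"}
# Blank characters stripped from both ends of the value.
BLANKS = " \r\n\t"


def clean_metadata_value(value: str) -> str:
    """Hypothetical helper: drop disallowed control characters, then strip blanks.

    Assumption: "control character" means Unicode general category "Cc".
    """
    kept = "".join(
        ch
        for ch in value
        if ch in ALLOWED_CONTROLS or unicodedata.category(ch) != "Cc"
    )
    # Strip after dropping, so blanks exposed by a removed control
    # character at either end are also removed.
    return kept.strip(BLANKS)


print(repr(clean_metadata_value(" \tHello\x00 Wor\x08ld\r\n")))
```

In this sketch, \r, \n and \t inside the value are preserved; only those at the start or end are stripped.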

@benoit74 benoit74 self-assigned this Jul 10, 2024
@benoit74 benoit74 marked this pull request as ready for review July 10, 2024 14:35
@benoit74 benoit74 requested a review from rgaudin July 10, 2024 14:35

codecov bot commented Jul 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (1eddabc) to head (dbf7718).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #179   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           32        32           
  Lines         1452      1459    +7     
  Branches       251       254    +3     
=========================================
+ Hits          1452      1459    +7     

Member

@rgaudin rgaudin left a comment

Minor details; please add an entry in the CHANGELOG as well

src/zimscraperlib/zim/creator.py (outdated, resolved)
src/zimscraperlib/zim/creator.py (outdated, resolved)
src/zimscraperlib/zim/creator.py (resolved)
@benoit74 benoit74 requested a review from rgaudin July 29, 2024 15:50
@benoit74 benoit74 merged commit 7dac807 into main Jul 30, 2024
8 checks passed
@benoit74 benoit74 deleted the drop_control_characters branch July 30, 2024 06:17
Development

Successfully merging this pull request may close these issues.

  • Metadata does not automatically drops control characters
  • ZIM entry title should not have control characters
2 participants