Upgrade dependencies especially zimscraperlib 3.x #170

Merged: 11 commits, Mar 25, 2024

@benoit74 (Collaborator) commented Mar 19, 2024

The primary goal of this PR was to upgrade to zimscraperlib 3.x to benefit from VideoLowWebm preset v2.

However, while working on it, it became clear that other things had to be fixed or were low-hanging fruit.

This PR will probably lead to a v3 of the scraper, since checking metadata by default is clearly a breaking change.

Make some progress on #145
Fix #169
Fix #152
Fix #148

Changes

  • Upgrade dependencies and migrate to hatch-openzim
  • Add basic support for disabling metadata check if needed
    • We could imagine more advanced behavior where we disable only the settings which make sense according to user input
  • Add support for long description for ted2zim
    • long description is not available for ted2zim-multi because description is not available there either; both are always computed automatically
  • Display FFMPEG logs in case of failure
  • Clean up incomplete/failed video download files
  • Add support for long description
  • Validate ZIM metadata as early as possible
  • Fix ZIM language metadata
    • it is now a list instead of a single value
    • we only fall back to "eng" when we don't know the language, not when there are many languages
    • when there are many languages, the CSV list is not yet ordered properly; we only have a hack to put "eng" first if present, since it is almost certainly the most common language in 99% of cases (see for issue removing the hack)
  • Fix computation of automatic description and long description (through zimscraperlib automated truncations)
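
For illustration, the language metadata rules listed above could be sketched roughly as follows. This is a minimal sketch, not the scraper's actual code: the function name `zim_language_metadata` is hypothetical, and it assumes the input is a list of ISO 639-3 codes.

```python
def zim_language_metadata(langs):
    """Build the comma-separated language value from a list of codes.

    Hypothetical helper for illustration only; not the ted2zim implementation.
    """
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(langs))
    # Fall back to "eng" only when the language is unknown (empty list).
    if not unique:
        return "eng"
    # Hack described above: put "eng" first when it is present.
    if "eng" in unique:
        unique.remove("eng")
        unique.insert(0, "eng")
    return ",".join(unique)
```

With this sketch, a multi-language input without English keeps its own codes instead of being replaced by "eng".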

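The "display logs on failure" and "clean up incomplete file" items can be pictured with a generic pattern like the one below. It is only a sketch under assumptions: `run_with_cleanup` is a hypothetical name, and the real scraper goes through zimscraperlib rather than calling `subprocess` directly like this.

```python
import logging
import subprocess
from pathlib import Path

logger = logging.getLogger(__name__)


def run_with_cleanup(args, output_path):
    """Run a command expected to produce output_path.

    Hypothetical helper: on failure, log the command output and delete any
    partial file so a later run does not mistake it for a complete one.
    """
    proc = subprocess.run(args, capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        # Surface the tool's own logs instead of only reporting the exit code.
        logger.error("Command failed (%s):\n%s", proc.returncode, proc.stderr)
        # Remove the incomplete/failed output file if it exists.
        Path(output_path).unlink(missing_ok=True)
        return False
    return True
```
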
@benoit74 benoit74 self-assigned this Mar 19, 2024
@benoit74 (Collaborator, Author) commented

Note: CI is failing for now; we need to wait for the merge and release of openzim/python-scraperlib#147

codecov bot commented Mar 25, 2024

Codecov Report

Attention: Patch coverage is 0%, with 46 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (8451330) to head (dde529e).

❗ Current head dde529e differs from pull request most recent head 1ab9caf. Consider uploading reports for the commit 1ab9caf to get more accurate results

Files                             Patch %   Lines
src/ted2zim/scraper.py            0.00%     32 Missing ⚠️
src/ted2zim/multi/scraper.py      0.00%     5 Missing ⚠️
src/ted2zim/processing.py         0.00%     5 Missing ⚠️
src/ted2zim/entrypoint.py         0.00%     2 Missing ⚠️
src/ted2zim/multi/entrypoint.py   0.00%     1 Missing ⚠️
src/ted2zim/utils.py              0.00%     1 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff          @@
##            main    #170   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files          7       7           
  Lines        984    1013   +29     
  Branches     229     215   -14     
=====================================
- Misses       984    1013   +29     


@rgaudin (Member) left a comment

Good; see my suggested change before merging.

Two review threads on src/ted2zim/scraper.py (outdated, resolved)
@benoit74 benoit74 merged commit d57d160 into main Mar 25, 2024
5 checks passed
@benoit74 benoit74 deleted the upgrade_deps branch March 25, 2024 14:29