Add utility to get PDF info for proper titles on PDF entries #168

Closed
benoit74 opened this issue Jun 18, 2024 · 0 comments · Fixed by #182
Labels: enhancement (New feature or request)

@benoit74 (Collaborator):

The content of PDF documents is not indexed for suggestions, even though in some ZIMs the PDFs are the "core" of the ZIM.

Having a utility in scraperlib to extract PDF info and get the document title would probably help.

See openzim/warc2zim#290 for one use-case.
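
To make the request concrete, here is a minimal sketch of what such a title lookup could look like. It is an illustration only, not the scraperlib implementation: the function name `guess_pdf_title` is hypothetical, and the approach is a naive standard-library heuristic that scans the raw bytes for a literal-string `/Title` entry. It assumes the document information dictionary is stored uncompressed and will miss hex-encoded strings, nested parentheses, and titles held in compressed object streams or XMP metadata.

```python
import re
from typing import Optional

# Matches a literal-string entry such as "/Title (My Document)".
# Backslash-escaped characters are allowed inside the parentheses;
# nested unescaped parentheses are NOT handled (a known limitation).
_TITLE_RE = re.compile(rb"/Title\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)

# Only the escapes for "(", ")" and "\" are undone; other PDF escapes
# (octal codes, \n, ...) are left untouched in this sketch.
_ESCAPE_RE = re.compile(rb"\\([()\\])")


def guess_pdf_title(data: bytes) -> Optional[str]:
    """Hypothetical helper: best-effort title from raw PDF bytes."""
    match = _TITLE_RE.search(data)
    if match is None:
        return None
    raw = _ESCAPE_RE.sub(rb"\1", match.group(1))
    if raw.startswith(b"\xfe\xff"):
        # A UTF-16BE byte order mark signals a Unicode text string.
        text = raw[2:].decode("utf-16-be", errors="replace")
    else:
        # Approximation: treat other strings as Latin-1.
        text = raw.decode("latin-1")
    return text.strip() or None


if __name__ == "__main__":
    sample = b"%PDF-1.4\n1 0 obj\n<< /Title (Annual Report \\(2024\\)) >>\nendobj\n"
    print(guess_pdf_title(sample))
```

A real utility would more likely delegate to a dedicated PDF library, which can read compressed metadata and the other string encodings, and would fall back to something like the entry path when no title is present.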

@benoit74 added the enhancement (New feature or request) label on Jun 18, 2024
@benoit74 added this to the 3.5.0 milestone on Jun 20, 2024
@benoit74 modified the milestones: 3.5.0, 4.0.0 on Jul 10, 2024
@benoit74 self-assigned this on Jul 11, 2024
@benoit74 modified the milestones: 4.0.0, 3.5.0 on Jul 15, 2024
@benoit74 modified the milestones: 3.5.0, 4.0.0 on Jul 30, 2024