Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Functions to read OPDS streams - languages and categories #967

Merged
merged 5 commits into from
Jul 26, 2023

Conversation

juuz0
Copy link
Collaborator

@juuz0 juuz0 commented Jul 11, 2023

Fixes #968
Requirement for: kiwix/kiwix-desktop#960

@juuz0 juuz0 changed the title Functions to read OPDS strams - languages and categories Functions to read OPDS streams - languages and categories Jul 11, 2023
@codecov
Copy link

codecov bot commented Jul 11, 2023

Codecov Report

Patch coverage: 41.86% and project coverage change: +0.01% 🎉

Comparison is base (903dcd4) 38.87% compared to head (385931f) 38.89%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #967      +/-   ##
==========================================
+ Coverage   38.87%   38.89%   +0.01%     
==========================================
  Files          56       58       +2     
  Lines        3974     4006      +32     
  Branches     2187     2204      +17     
==========================================
+ Hits         1545     1558      +13     
- Misses       1097     1099       +2     
- Partials     1332     1349      +17     
Files Changed Coverage Δ
src/library_dumper.cpp 61.29% <ø> (+4.14%) ⬆️
src/tools/opdsParsingTools.cpp 40.62% <40.62%> (ø)
src/tools/languageTools.cpp 45.45% <45.45%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kelson42
Copy link
Collaborator

@juuz0 Ready to review? Code coverage does not pass, new functions should be tested.

@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 11, 2023

@kelson42 added tests, CI/macOS seems to be a CI issue - failing for other PRs too

Yes, ready for review

Copy link
Member

@mgautierfr mgautierfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure those functions are put at the right place.
The Manager has been created to manage a library. The Library being a "simple" container of Books.

But here, parseLanguages and parseCategories do not manage everything. They are "just" parsing a feed.
This could be simple free function instead of method of Manager. And if we want them to be part of Manager (to make it something that "parse content"), it should be static method to allow the user to call them without creating a Manager instance first.


If fact, they are the equivalent of Library::getBooksLanguages or Library::getBooksCategories, but for a abstract "remote Library". Maybe there is a new class (potentially sharing the same common interface than Library) to create to correctly handle remote library.

src/manager.cpp Outdated Show resolved Hide resolved
@kelson42
Copy link
Collaborator

@juuz0 Can you please create the corresponding issue to explain what is the problem you face with libkiwix?

@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 12, 2023

@kelson42 #968 created one

include/library_dumper.h Outdated Show resolved Hide resolved
@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 12, 2023

If fact, they are the equivalent of Library::getBooksLanguages or Library::getBooksCategories, but for a abstract "remote Library". Maybe there is a new class (potentially sharing the same common interface than Library) to create to correctly handle remote library.

Having a remote library class felt like it should handle the download requests to /catalog/v2/languages too.
That's being done in kiwix-desktop right now (similar to how a book catalog is downloaded in kiwix desktop but parsed in libkiwix).

I put these functions in kiwix/tools.h for now - since these are tools for just parsing (no download requests etc).
But if we still want to go with the Remote Library class, let me know!

@juuz0 juuz0 force-pushed the opdsFilters branch 2 times, most recently from c4b2a6d to 09739c8 Compare July 12, 2023 14:44
@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 12, 2023

Rebased

@kelson42
Copy link
Collaborator

@juuz0 @mgautierfr Please talk together, it's not good to see this PR somehow stuck since a week.

@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 21, 2023

@mgautierfr Review needed!

Copy link
Member

@mgautierfr mgautierfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small fixes.
And tests are failing.

include/tools.h Outdated Show resolved Hide resolved
src/tools/languageTools.cpp Outdated Show resolved Hide resolved
src/tools/languageTools.cpp Outdated Show resolved Hide resolved
@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 21, 2023

And tests are failing.

Fixed. On my PC, the value was "und" for getLanguageSelfName(""). On the CI, it was ""
I made it compulsory to return "und" for empty string.

@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 21, 2023

Rebased after the recent merge

@juuz0 juuz0 mentioned this pull request Jul 23, 2023
Copy link
Member

@mgautierfr mgautierfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. On my PC, the value was "und" for getLanguageSelfName(""). On the CI, it was ""
I made it compulsory to return "und" for empty string.

I'm a bit disturbed by this.
Why getLanguageSelfName return to different things. This means that ICU return different things for icu::Local::getISOLanguages. We must understand why.
From where is "und" coming on your PC ?

src/tools/languageTools.cpp Outdated Show resolved Hide resolved
@juuz0
Copy link
Collaborator Author

juuz0 commented Jul 25, 2023

Why getLanguageSelfName return to different things. This means that ICU return different things for icu::Local::getISOLanguages. We must understand why.

@mgautierfr
I want to mention that the CI failed only for ubuntu focal earlier (which uses ICU 66).

After some investigation, I found that this happens when a icu::Locale is created using the und languageCode.
Such locale will return

  • "" for getISO3Language()
  • und for getDisplayName()

I suspect the ubuntu-focal behaviour was unwanted (see ticket) and fixed in a later ICU version (ICU 69.1. ubuntu jammy uses 70.1)
https://unicode-org.atlassian.net/browse/ICU-21406

I think we should keep the mapping so it stays consistent across all 3 ubuntu builds
Also, check #967 (comment)

@mgautierfr
Copy link
Member

I think we should keep the mapping so it stays consistent across all 3 ubuntu builds

ok. I agree. And "Undetermined" is indeed better.

These files were overlooked during a merge of another PR
Added a new function to read languages stored in an OPDS feed
Added a function to load categories stored in an OPDS stream
Added tests on a sample OPDS language and categories stream
This is a general utility which other ports can get use of.
Added tests
@mgautierfr mgautierfr merged commit 9c91fc7 into main Jul 26, 2023
13 checks passed
@mgautierfr mgautierfr deleted the opdsFilters branch July 26, 2023 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

libkiwix should be able to parse catalog/v2/languages and catalog/v2/categories
3 participants