Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internationalization of kiwix-serve's backend #679

Merged
merged 28 commits into from
Apr 14, 2022
Merged

Conversation

veloman-yunkan
Copy link
Collaborator

@veloman-yunkan veloman-yunkan commented Jan 16, 2022

Addresses the C++ part of kiwix/kiwix-tools#59

This PR introduces facilities for supporting internationalized textual output from the kiwix-serve backend (C++ code).

  • Message templates (in mustache format) are stored in JSON files under static/i18n/ (one file per language). Those JSON files are compiled into C++ code by a new script kiwix-compile-i18n.
  • Access to messages from C++ code is through the kiwix::ParameterizedMessage class. ParameterizedMessage(msgId, [msgParams]) creates a message object with the specified id and (optional) parameters. The text of the message for the desired language must be obtained via the ParameterizedMessage::getText(const std::string& language) method.
  • In this version of the PR, user language control is only via an explicit userlang query parameter. Adding support for user language selection via cookies and/or the Accept-Language header should be easy.

@veloman-yunkan veloman-yunkan force-pushed the kiwix-serve-i18n branch 2 times, most recently from cb659d9 to ffb5373 Compare January 16, 2022 19:54
@codecov
Copy link

codecov bot commented Jan 16, 2022

Codecov Report

Merging #679 (927c125) into master (e22e073) will increase coverage by 0.54%.
The diff coverage is 95.55%.

@@            Coverage Diff             @@
##           master     #679      +/-   ##
==========================================
+ Coverage   60.55%   61.09%   +0.54%     
==========================================
  Files          56       58       +2     
  Lines        3732     3789      +57     
  Branches     2060     2080      +20     
==========================================
+ Hits         2260     2315      +55     
- Misses       1471     1473       +2     
  Partials        1        1              
Impacted Files Coverage Δ
src/server/request_context.h 100.00% <ø> (ø)
src/server/response.h 100.00% <ø> (ø)
src/server/i18n.cpp 91.17% <91.17%> (ø)
src/server/internalServer.cpp 82.54% <95.23%> (+0.25%) ⬆️
src/server/i18n.h 100.00% <100.00%> (ø)
src/server/request_context.cpp 86.73% <100.00%> (+0.86%) ⬆️
src/server/response.cpp 91.22% <100.00%> (+0.23%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e22e073...927c125. Read the comment docs.

@veloman-yunkan veloman-yunkan force-pushed the kiwix-serve-i18n branch 2 times, most recently from 0a6b256 to 7f9f562 Compare January 18, 2022 18:00
@veloman-yunkan veloman-yunkan changed the title Internationalization of kiwix-serve [WIP] Internationalization of kiwix-serve Jan 19, 2022
@veloman-yunkan
Copy link
Collaborator Author

@mgautierfr This PR is becoming pretty big. I think that it makes sense for you to throw a quick glance at it lest I move too far ahead with an approach that you may criticize in the end. Please note that intermediate commits contain some temporary solutions that get replaced in the end. The current state reflects my vision of how error messages and templates should be translated in the C++ code.

@kelson42
Copy link
Collaborator

kelson42 commented Feb 3, 2022

@veloman-yunkan Thx for this PR trying to tackling the complex problem of internationalisation. Unfortunately @mgautierfr is pretty busy these weeks and it might take a two weeks before you get a feedback from him. That said, I believe 5os PR desserves an explication comment, 5is is really too big to be obvious.

@veloman-yunkan
Copy link
Collaborator Author

@kelson42

I am planning to split this PR into at least two parts

  1. Introduce unit-tests (with necessary infrastructure) that check the content of the HTML responses that are going to be internationalized. The infrastructure should support easily adding new test points for a non-English language.
  2. Add internationalization. The unit tests from step 1 will help to make sure that the default output (for English) is not affected and new test points (added relying on facilities introduced in 1.) will prove that internationalization works.

@veloman-yunkan veloman-yunkan changed the base branch from master to unittest_for_suggestions February 11, 2022 12:20
@veloman-yunkan veloman-yunkan changed the base branch from unittest_for_suggestions to unittests_for_404_html_page February 11, 2022 15:27
@mgautierfr mgautierfr force-pushed the unittests_for_404_html_page branch 2 times, most recently from 80ecc77 to 34d069e Compare February 16, 2022 13:25
Base automatically changed from unittests_for_404_html_page to master February 16, 2022 13:38
@veloman-yunkan veloman-yunkan force-pushed the kiwix-serve-i18n branch 2 times, most recently from e8d8b63 to 3d576df Compare March 11, 2022 18:56
@veloman-yunkan veloman-yunkan changed the base branch from master to safer_testing_of_html_responses March 11, 2022 19:00
Base automatically changed from safer_testing_of_html_responses to master March 18, 2022 06:03
@veloman-yunkan veloman-yunkan changed the base branch from master to testing_of_ft_search_unavailable_page March 18, 2022 12:37
The (failing) tests now demonstrate that some text in the taskbar is not
translated. Will fix in the next commit.
The new test fails since the "Go to the main page" button is not yet
internationalized.
The new test fails since the "Go to random page" button is not yet
internationalized.
At this point a potential issue has been revealed. Now we produce
the final HTML via 2-level template expansion

1. Render parameterized messages
2. Render the HTML template

In which templates we should use double mustache "{{}}" (HTML-escaping)
tags and where we may use triple mustache "{{{}}}" (non-escaping) tags?
In the absence of the "userlang" query parameter in the URL, the value
of the "Accept-Language" header is used. However, it is assumed that
"Accept-Language" specifies a single language (rather than a comma
separated list of languages possibly weighted with quality values).

Example:

Accept-Language: fr
// should work

Accept-Language: fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5
// The requested language will be considered to be
// "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5".
// The i18n code will fail to find resources for such a language
// and will use the default "en" instead.
@mgautierfr
Copy link
Member

Rebased-fixup on master.

I agree with the PR as it is now. I've few concern/improvement on it but it can be handle later, I will open issues for them.

@mgautierfr mgautierfr self-requested a review April 13, 2022 14:45
@mgautierfr mgautierfr merged commit c43c637 into master Apr 14, 2022
@mgautierfr mgautierfr deleted the kiwix-serve-i18n branch April 14, 2022 13:21
veloman-yunkan added a commit that referenced this pull request Apr 14, 2022
The story of search_results.css

static/skin/search_results.css was extracted from
static/templates/no_search_result.html before the latter was dropped.

static/templates/no_search_result.html in turn seems to be a copied and
edited version of static/templates/search_result.html.

In the context of exploratory work on the internationalization of
kiwix-serve (PR #679) I noticed duplication of inline CSS across those
two templates and intended to eliminated it. That goal was not fully
accomplished (static/templates/search_result.html remained untouched)
because by that time PR #679 grew too big and the efforts were diverted
into splitting it into smaller ones. Thus search_results.css slipped
into one of those small PRs, without making much sense because nothing
really justifies preserving custom CSS in the "Fulltext search unavailable"
error page.

At the same time, it served as the only case where a link to a cacheable
resource is generated in C++ code (rather than found in a template).
This poses certain problems to the handling of cache-ids. A workaround
is to expel the URL into a template so that it is processed by
`kiwix-resources`. This commit merely demonstrates that solution. But
whether it should be preserved (or rather the "Fulltext search
unavailable" page should be deprived of CSS) is questionable.
@Abijeet
Copy link

Abijeet commented Apr 21, 2022

I realise the srtings file format seems not really standart. @Nikerabbit Can TW bot indgest it?

Yes I think we should be able to ingest it. I've created this task to track the work.

Can you confirm that the translatewiki bot will have access to commit to this repository?

@kelson42
Copy link
Collaborator

@Abijeet Thank you!

mgautierfr pushed a commit that referenced this pull request Apr 26, 2022
The story of search_results.css

static/skin/search_results.css was extracted from
static/templates/no_search_result.html before the latter was dropped.

static/templates/no_search_result.html in turn seems to be a copied and
edited version of static/templates/search_result.html.

In the context of exploratory work on the internationalization of
kiwix-serve (PR #679) I noticed duplication of inline CSS across those
two templates and intended to eliminated it. That goal was not fully
accomplished (static/templates/search_result.html remained untouched)
because by that time PR #679 grew too big and the efforts were diverted
into splitting it into smaller ones. Thus search_results.css slipped
into one of those small PRs, without making much sense because nothing
really justifies preserving custom CSS in the "Fulltext search unavailable"
error page.

At the same time, it served as the only case where a link to a cacheable
resource is generated in C++ code (rather than found in a template).
This poses certain problems to the handling of cache-ids. A workaround
is to expel the URL into a template so that it is processed by
`kiwix-resources`. This commit merely demonstrates that solution. But
whether it should be preserved (or rather the "Fulltext search
unavailable" page should be deprived of CSS) is questionable.
veloman-yunkan added a commit that referenced this pull request May 2, 2022
The story of search_results.css

static/skin/search_results.css was extracted from
static/templates/no_search_result.html before the latter was dropped.

static/templates/no_search_result.html in turn seems to be a copied and
edited version of static/templates/search_result.html.

In the context of exploratory work on the internationalization of
kiwix-serve (PR #679) I noticed duplication of inline CSS across those
two templates and intended to eliminated it. That goal was not fully
accomplished (static/templates/search_result.html remained untouched)
because by that time PR #679 grew too big and the efforts were diverted
into splitting it into smaller ones. Thus search_results.css slipped
into one of those small PRs, without making much sense because nothing
really justifies preserving custom CSS in the "Fulltext search unavailable"
error page.

At the same time, it served as the only case where a link to a cacheable
resource is generated in C++ code (rather than found in a template).
This poses certain problems to the handling of cache-ids. A workaround
is to expel the URL into a template so that it is processed by
`kiwix-resources`. This commit merely demonstrates that solution. But
whether it should be preserved (or rather the "Fulltext search
unavailable" page should be deprived of CSS) is questionable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants