Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests are failing on AlpineLinux #792

Closed
kmmndr opened this issue Jun 28, 2022 · 15 comments
Closed

Tests are failing on AlpineLinux #792

kmmndr opened this issue Jun 28, 2022 · 15 comments
Assignees
Milestone

Comments

@kmmndr
Copy link

kmmndr commented Jun 28, 2022

I'm trying to upgrade to 11.0.0 on AlpineLinux (Musl-libc) but there is one test failing:

...
[==========] 52 tests from 3 test suites ran. (2159 ms total)
[  PASSED  ] 51 tests.       
[  FAILED  ] 1 test, listed below:
[  FAILED  ] LibraryServerTest.catalog_v2_languages
                                                                                       
 1 FAILED TEST                                                                         
stderr:                                                                                
===== Unhandled error : Entry redirect_loop.html is a redirect entry.

Full error log : testlog.txt

This issue wasn't on last release 10.1.1

How to reproduce

Create the following Dockerfile

FROM alpine:edge

ENV pkgver 11.0.0

WORKDIR /buildir
RUN wget https://github.com/kiwix/libkiwix/archive/$pkgver.tar.gz \
 && tar xvfz $pkgver.tar.gz

WORKDIR /buildir/libkiwix-$pkgver

RUN apk add --no-cache build-base

RUN apk add --no-cache \
   curl-dev \
   gtest-dev \
   libmicrohttpd-dev \
   libzim-dev \
   meson \
   mustache \
   pugixml-dev \
   python3 \
   samurai \
   xapian-core-dev


# build
RUN meson . build --prefix /usr \
 && ninja -C build

Run docker build -t kiwix-lib-test . && docker run --rm -it kiwix-lib-test ninja -C build meson-test

@kelson42
Copy link
Collaborator

Again a problem linked to musl? See openzim/libzim#671

@kelson42 kelson42 added this to the 11.1.0 milestone Jun 28, 2022
@veloman-yunkan veloman-yunkan self-assigned this Jun 29, 2022
@veloman-yunkan
Copy link
Collaborator

The test LibraryServerTest.catalog_v2_languages fails with the following diff suggesting that under Alpine the ICU data misses language self-names:

@@ -25,5 +25,5 @@
   </entry>
   <entry>
-    <title>French</title>
+    <title>fran\xC3\xA7" "ais</title>
     <dc:language>fra</dc:language>
     <thr:count>1</thr:count>
@@ -35,5 +35,5 @@
   </entry>
   <entry>
-    <title>Russian</title>
+    <title>\xD1\x80\xD1\x83\xD1\x81\xD1\x81\xD0\xBA\xD0\xB8\xD0\xB9</title>
     <dc:language>rus</dc:language>
     <thr:count>1</thr:count>
@@ -44,3 +44,4 @@
     <id>12345678-90ab-cdef-1234-567890abcdef</id>
   </entry>
-</feed>\n
+</feed>
+"

@kelson42
Copy link
Collaborator

@veloman-yunkan Can we make a proper/dedicated test to secure ICU data are OK?

@veloman-yunkan
Copy link
Collaborator

The setup process in the Docker file installs as an implicit dependency icu-data-en which is a

Stripped down ICU data with only en_US/GB locale and no legacy charset converters

Full ICU data can be installed with

apk add icu-data-full

whereupon the tests pass

@veloman-yunkan
Copy link
Collaborator

@veloman-yunkan Can we make a proper/dedicated test to secure ICU data are OK?

Yes though with some approximation (for example it will only check that the self-name of fra is français rather than French).

@kelson42
Copy link
Collaborator

@veloman-yunkan There is no API in libicu to directly know with which dataset it is loaded? My goal to close this ticket is that we have a test which provides a clearer error message.

@veloman-yunkan
Copy link
Collaborator

veloman-yunkan commented Jun 29, 2022

@kelson42 I am not an expert in ICU but I think that the library is dataset agnostic. I think it can be configured to load multiple data files from different locations. What matters is the content of the loaded data.

@kmmndr
Copy link
Author

kmmndr commented Jun 29, 2022

Thanks for your help, installing icu-data-full package solved my problem 👍

@veloman-yunkan
Copy link
Collaborator

@kelson42 Can we close this issue? Or you would like to have the proposed unit test added?

@kelson42
Copy link
Collaborator

kelson42 commented Jul 3, 2022

@veloman-yunkan I would indeed like to have a test clearly stating that the libicu data is missing (if this is the case).

@veloman-yunkan
Copy link
Collaborator

@veloman-yunkan I would indeed like to have a test clearly stating that the libicu data is missing (if this is the case).

@kelson42 ICU documentation at the following links corroborates my earlier assessment that satisfying your requirement in a general way would be quite complex.

Below information is intended to document approaches that I tried (since I feel now that I will end up with a check that focuses only on the specific manifestation of missing ICU data that caused the unit test failure in this ticket).

The following simple heuristics failed:

TEST(stringTools, ICU)
{
  int32_t icuLocaleCount = 0;
  icu::Locale::getAvailableLocales(icuLocaleCount);
  // NOTE: 797 is the number of locales included in the icu-data-full (71.1-r2)
  // NOTE: package under alpine
  ASSERT_GE(icuLocaleCount, 797);
}

Above code worked on apline (using the system provided icu package) but failed in my build environment (native_static build environment created by kiwix_build) - for some reason the icuLocaleCount was 0.

TEST(stringTools, ICU) 
{ 
  int32_t icuISOLanguageCount = 0; 
  for (const auto* p = icu::Locale::getISOLanguages(); *p != NULL; ++p) { 
    ++icuISOLanguageCount; 
  } 
 
  // NOTE: 593 is the number of languages included in icu4c v58.2 (which is 
  // NOTE: the current version of ICU used by kiwix-build) 
  ASSERT_GE(icuISOLanguageCount, 593); 
}

Above version passes both under alpine and my build environment. The problem was that the unit test also passed under alpine when the icu-data-full package was not installed (i.e. locale data was present only for en_GB/US).

@mgautierfr
Copy link
Member

Please note that we patch ICU to remove some data we don't want kiwix/kiwix-build@fbb84e9 and associated pr kiwix/kiwix-build#44.
However this change is pretty old. I remember I based myself on some documentation I found but sadly, I've not noted which one.
The same way, we probably discussed in live about the PR and never write down the discussion 😞

@kelson42
Copy link
Collaborator

kelson42 commented Jul 6, 2022

Libicu data can be quite big. If you have them in a multiarch package, the problem is even worse. Consider then a low-end mobile device/bandwith... like Android... then the problem is obvious. This is why it has been requested to limit the data to the strict minimun necessary. I think, we had an empirical approach to decide what to keep.

@kelson42
Copy link
Collaborator

kelson42 commented Jul 20, 2022

@veloman-yunkan OK, so you suggest we close this ticket and don't do any further improvement?

@veloman-yunkan
Copy link
Collaborator

@kelson42 Yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants