glibc-using-nix-installed or guix-installed hledger won't handle non-ascii data, even with LANG set #1033

Closed
simonmichael opened this issue May 31, 2019 · 15 comments
Labels
A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. docs Documentation-related. i18n Internationalisation/localisation-related. platform:nix

Comments

@simonmichael
Owner

simonmichael commented May 31, 2019

Reported by @vifon: hledger installed via the nix command on the download page fails if the journal contains non-ascii characters. Reproduced on Gentoo and Ubuntu GNU/Linux; does not happen on mac.

Eg:

$ nix-env -i -f https://github.com/NixOS/nixpkgs/archive/fe41fd.tar.gz -A hledger; curl -s https://raw.githubusercontent.com/simonmichael/hledger/master/examples/unicode.journal | ~/.nix-profile/bin/hledger -f- print
...
hledger: <stdin>: hGetContents: invalid argument (invalid byte sequence)
$ echo $LANG
en_US.utf8
$ locale -a | grep US
en_US.iso885915
en_US.utf8
$ curl -s https://raw.githubusercontent.com/simonmichael/hledger/master/examples/unicode.journal | ~/.local/bin/hledger -f- print
(works)
simonmichael added the A-BUG, platform:nix and docs labels May 31, 2019
simonmichael changed the title from "nix-installed hledger won't handle non-ascii data, even with locale set" to "on Linux, nix-installed hledger won't handle non-ascii data, even with locale set" May 31, 2019
simonmichael added a commit that referenced this issue May 31, 2019
@simonmichael
Owner Author

@peti, any idea what might be the cause ?

@peti
Contributor

peti commented Jun 1, 2019

The Nix glibc does not know any locales (other than C). Locale data resides in a separate package, glibcLocales, and glibc can only find that package through an environment variable

LOCALE_ARCHIVE="$(nix-build --no-out-link "<nixpkgs>" -A glibcLocales)/lib/locale/locale-archive"

that needs to be set at run-time. Alternatively, that variable can also point to the locale archive of the native host OS:

LOCALE_ARCHIVE="/usr/lib/locale/locale-archive"

This separation exists so that Nix users can choose a custom set of locales they'd like to install without affecting glibc. If glibc had a reference to the locale data, then any change to the locale data would trigger a complete re-build of the entire system.
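
The lookup described above can be sketched as a small shell helper. This is only a minimal sketch, not something from this thread: the function name first_locale_archive is hypothetical, and the host path in the usage comment is the one quoted above, which may differ on other distributions.

```shell
# Hypothetical helper: print the first candidate path that is an existing,
# readable regular file; return non-zero if none of them is.
first_locale_archive() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example use: keep an already-set LOCALE_ARCHIVE if it is valid, otherwise
# fall back to the host's archive (path assumed, as quoted above):
#   if archive="$(first_locale_archive "$LOCALE_ARCHIVE" /usr/lib/locale/locale-archive)"; then
#       export LOCALE_ARCHIVE="$archive"
#   fi
```

An empty or unset LOCALE_ARCHIVE is simply skipped, so the variable is only exported when one of the candidates really exists.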

@simonmichael
Owner Author

Wow, good info, thanks. What do you think is the best approach for reliable install commands that take this into account ? This worked on Linux:

$ export LOCALE_ARCHIVE="$(nix-build --no-out-link '<nixpkgs>' -A glibcLocales)/lib/locale/locale-archive"
$ nix-env -i -f https://github.com/NixOS/nixpkgs/archive/fe41fd.tar.gz -A hledger hledger-web hledger-ui

but not on Mac:

$ nix-build --no-out-link '<nixpkgs>' -A glibcLocales
error: expression does not evaluate to a derivation (or a set or list of those)

@simonmichael
Owner Author

Also, I'm curious why it doesn't require this extra step on Mac.

@peti
Contributor

peti commented Jun 1, 2019

Nix doesn't use glibc on Darwin. I believe Darwin builds use the native libc, so the whole "re-building the world" problem doesn't exist on that OS.
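
For install instructions that have to work on both platforms, one option is to run the glibcLocales step only on Linux. A rough sketch follows; the function name needs_locale_archive is hypothetical, and treating every non-Linux kernel like Darwin is an assumption, not something established in this thread.

```shell
# Hypothetical helper: succeed only for kernel names where, per the comments
# above, Nix-built programs use Nix's own glibc and so need LOCALE_ARCHIVE.
# Assumption: anything other than Linux is treated like Darwin (no extra step).
needs_locale_archive() {
    case "$1" in
        Linux) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example use in an install script (uname -s prints "Linux" or "Darwin"):
#   if needs_locale_archive "$(uname -s)"; then
#       export LOCALE_ARCHIVE="$(nix-build --no-out-link '<nixpkgs>' -A glibcLocales)/lib/locale/locale-archive"
#   fi
```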

@samm81

samm81 commented Apr 10, 2020

just wanted to chime in and say this fixed my issue on Ubuntu 18.04! thank you 😁

@tbm
Contributor

tbm commented Jul 26, 2020

I just ran into this issue. It would be good if the file line (and character position) would be displayed so at least users can check what hledger doesn't like about the file.
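
Until hledger reports a position itself, the lines that contain non-ascii bytes can be listed from the shell. This is a workaround sketch, not a hledger feature: the function name non_ascii_lines is hypothetical, it reports line numbers only (no character position), and it also flags ascii control characters other than whitespace.

```shell
# Hypothetical helper: print "line-number:line" for every line of the file
# that contains a byte outside printable ascii and whitespace.
# LC_ALL=C makes grep match byte by byte, so it needs no UTF-8 locale itself;
# -a treats the file as text even if grep would consider it binary.
non_ascii_lines() {
    LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}

# Example use:
#   non_ascii_lines my.journal
```

Lines found this way are the ones a GHC-compiled program can fail on when no utf8-aware locale is in effect.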

@SqrtMinusOne

Same on Guix, although it's easily resolved by:

guix install glibc-locales

And adding

export GUIX_LOCPATH=$HOME/.guix-profile/lib/locale

somewhere to .profile
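
A slightly more defensive variant for .profile sets the variable only when the locale directory is actually there. This is a sketch under assumptions: the function name guix_locpath is hypothetical, and the default profile location is the one used in the export line above.

```shell
# Hypothetical helper: print the locale directory inside a Guix profile,
# but only if it exists (i.e. glibc-locales was installed into that profile).
# The profile defaults to ~/.guix-profile, as in the export line above.
guix_locpath() {
    profile="${1:-$HOME/.guix-profile}"
    if [ -d "$profile/lib/locale" ]; then
        printf '%s\n' "$profile/lib/locale"
    else
        return 1
    fi
}

# Example use in ~/.profile:
#   if dir="$(guix_locpath)"; then export GUIX_LOCPATH="$dir"; fi
```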

@Xitian9
Collaborator

Xitian9 commented Mar 9, 2022

This is really a nix issue, rather than a hledger issue. Perhaps we can close this?

@tbm
Contributor

tbm commented Mar 9, 2022

This is not a Nix issue. I experienced it on Debian. I think it happens when the locales are not installed.

While it needs to be fixed by the user, I suggested something that hledger can do:

It would be good if the file line (and character position) would be displayed so at least users can check what hledger doesn't like about the file

Also something about installing locales could be put into the hledger docs.

@simonmichael
Owner Author

simonmichael commented Mar 9, 2022

There are multiple issues (or a multi-headed issue), including:

  1. GHC-compiled programs handling non-ascii data require a utf8-aware locale to be configured, or will die with various unhelpful messages. (https://gitlab.haskell.org/ghc/ghc/-/issues/17755)
  2. Nix-installed programs on platforms using glibc require locale to be configured in a special nix way (not $LANG). (comment)
  3. GUIX-installed programs have a similar issue and special guix way of setting locale (comment)
  4. hledger install docs (https://hledger.org/install.html#utf-8-locale-required, https://hledger.org/install.html#check-your-locale) don't mention the special nix way
  5. hledger doesn't include special handling to try to improve on the default behaviour (possibly related: Data.Text.IO should not be used #1619 ?)
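
For point 1, a quick way to see which encoding the C library has actually selected is locale charmap. The sketch below wraps that check; the function name is_utf8_charmap is hypothetical. One caveat, given points 2 and 3: the locale program reports what its own C library sees, so on a system with Nix or Guix it may say UTF-8 while a Nix-built or Guix-built binary still falls back to the C locale.

```shell
# Hypothetical helper: succeed if the given charmap name denotes UTF-8.
# Accepts spellings such as "UTF-8", "utf8" or "utf-8".
is_utf8_charmap() {
    normalized="$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')"
    [ "$normalized" = "utf8" ]
}

# Example use (glibc prints ANSI_X3.4-1968 when it has fallen back to C):
#   is_utf8_charmap "$(locale charmap)" || echo "warning: not a UTF-8 locale" >&2
```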

simonmichael added the i18n label Mar 9, 2022
simonmichael changed the title from "on Linux, nix-installed hledger won't handle non-ascii data, even with locale set" to "on Linux, nix-installed hledger won't handle non-ascii data, even with LANG set" Mar 9, 2022
simonmichael changed the title from "on Linux, nix-installed hledger won't handle non-ascii data, even with LANG set" to "glibc-using-nix-installed or guix-installed hledger won't handle non-ascii data, even with LANG set" Mar 9, 2022
Xitian9 added a commit to Xitian9/hledger that referenced this issue Mar 10, 2022
Xitian9 added a commit to Xitian9/hledger that referenced this issue Mar 12, 2022
Xitian9 added a commit to Xitian9/hledger that referenced this issue Mar 26, 2022
Xitian9 added a commit to Xitian9/hledger that referenced this issue Apr 28, 2022
Xitian9 added a commit to Xitian9/hledger that referenced this issue May 10, 2022
…ith utf8 encoding. (simonmichael#1619)

May also fix simonmichael#1154, simonmichael#1033, simonmichael#708, simonmichael#536, simonmichael#73: testing is needed.

This should hopefully avoid encoding issues, but since it fundamentally
alters how encoding is dealt with it may lead to unexpected outcomes.
Widespread testing on a number of different platforms would be useful.
Xitian9 added a commit to Xitian9/hledger that referenced this issue May 22, 2022
…ith utf8 encoding. (simonmichael#1619)

May also fix simonmichael#1154, simonmichael#1033, simonmichael#708, simonmichael#536, simonmichael#73: testing is needed.

This aims to solve all problems where misconfigured locales lead to
parsers failing on utf8-encoded data. This should hopefully avoid
encoding issues, but since it fundamentally alters how encoding is dealt
with it may lead to unexpected outcomes. Widespread testing on a number
of different platforms would be useful.
@simonmichael
Owner Author

Same on Guix, although it's easily resolved by:

guix install glibc-locales

And adding

export GUIX_LOCPATH=$HOME/.guix-profile/lib/locale

somewhere to .profile

Hello @SqrtMinusOne .. do you know of a one-line GUIX install command suitable for our Install page ? Would guix install glibc-locales hledger (plus the GUIX_LOCPATH tip) be enough ?

@SqrtMinusOne

@simonmichael Well, Guix is meant to be configured declaratively, so there isn't a one liner that installs the package the Guix way :-)

But yeah, in principle

guix install glibc-locales hledger && echo "export GUIX_LOCPATH=$HOME/.guix-profile/lib/locale" >> ~/.profile

will do the job. Users will have to adapt it to their setup.
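
One thing to adapt: the one-liner appends a new export line to ~/.profile every time it is run, and the double quotes expand $HOME at the moment echo runs. A sketch of an append that is safe to repeat is below; the function name append_once is hypothetical.

```shell
# Hypothetical helper: append a line to a file unless an identical line
# is already present, so repeated runs leave a single copy.
append_once() {
    line="$1"
    file="$2"
    # -x: match whole lines only, -F: treat the line as a fixed string.
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}

# Example use; single quotes keep $HOME unexpanded until ~/.profile is read:
#   append_once 'export GUIX_LOCPATH=$HOME/.guix-profile/lib/locale' ~/.profile
```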

@simonmichael
Owner Author

Thanks, perhaps I'll leave GUIX off the page in that case.

@simonmichael
Owner Author

See also #2089.
