Skip to content
This repository has been archived by the owner on Oct 10, 2019. It is now read-only.

STFL Parser error near '' with musl-libc #364

Closed
4 tasks
Minoru opened this issue Sep 28, 2016 · 8 comments
Closed
4 tasks

STFL Parser error near '' with musl-libc #364

Minoru opened this issue Sep 28, 2016 · 8 comments

Comments

@Minoru
Copy link
Collaborator

Minoru commented Sep 28, 2016

Janko on IRC reported this.

When converting forms to STFL's internal widechar representation, we specify //TRANSLIT to ensure that as many users as possible see something meaningful in place of characters that their locale encoding can't represent. However, musl doesn't implement transliteration, causing Newsbeuter to crash with STFL Parser error near ''. (The call to stfl_ipool_towc returns empty string.)

Three solutions were proposed:

  1. Remove //TRANSLIT altogether. The world moved on, new systems use UTF-8 (but we didn't find any statistics on that).
  2. Use #ifdefs to figure out what libc we're dealing with, decide to include //TRANSLIT based on that. There's a slight problem there: musl doesn't define any macro. We'll have to do it autoconf-style: compile a testing binary, run it, look at the exit code and figure out things from there.
  3. Do runtime check to see if //TRANSLIT results in stfl_ipool_towc returning empty string. If so, don't use //TRANSLIT anymore.

Since the problem depends on libc, i.e. something that doesn't change at runtime, the third option is out.

Out of the first two, the first is the best maintainability-wise, but will break things for people who run Newsbeuter on older systems with non-Unicode locales. Thus it looks like we'll have to go with option 2.

Implementation might proceed as follows:

  • write a small C++ program that will check if //TRANSLIT is necessary (with glibc it's used to avoid errors on unmappable chars). The result should be returned via exit code.
  • edit config.sh to compile and run the test program. Depending on the result, some flag shall be put into config.mk (see how DEFINES is used in check_ssl_implementation()).
  • edit all the places where //TRANSLIT is used so that //TRANSLIT is excluded if there's no support from libraries.
  • try and test it on other platforms, including Linux with glibc and some BSD systems (just to make sure nothing was broken).
@richfelker
Copy link

Why not at runtime, see if iconv_open fails when //TRANSLIT is used and then retry without it? Also, always omit //TRANSLIT when dest charset is UTF-8 since it's meaningless in that case (UTF-8 can represent everything).

@richfelker
Copy link

The reason musl does not support //TRANSLIT, btw, is that its contrary to the requirements of the standard. POSIX specifies that an argument to iconv_open containing / is to be treated as a pathname to a charmap file. Implementations are not required to support charmap files (musl doesn't) but in this case the open needs to fail to indicate that they aren't supported.

glibc is also wrong to produce an error when //TRANSLIT is omitted. POSIX requires iconv to perform an implementation-defined conversion of characters that are not present in the dest charset. Replacing them with ?'s or transliterating them would be conforming options.

@Minoru
Copy link
Collaborator Author

Minoru commented Sep 28, 2016

Why not at runtime

Because the behaviour doesn't depend on information that's only available in runtime. Implementing this as a runtime would be just bad taste, IMHO.

Thanks for the notes on how different libc implementations should and do behave—it's the first time I deal with this stuff. I'll edit the issue to say "check if //TRANSLIT is necessary" instead of "if current libc and iconv provide //TRANSLIT support".

@richfelker
Copy link

My view is that runtime checks/fallbacks are not bad/dirty/wrong for behavior that can't be statically determined without execution. The only way to do build-time checks for them would be to hard-code assumptions about specific targets (bad/dirty/wrong) or execute a test at build time (precluding cross compiling). The runtime fallback is simple and inexpensive.

@Minoru
Copy link
Collaborator Author

Minoru commented Sep 28, 2016

Janko also adds: "if musl or something else decides to implement TRANSLIT, you wouldn't have to recompile newsbeuter", which is a very good argument.

@richfelker
Copy link

Yes, it is. And if that happens, if you did a test at build-time with a new version of libc and found support for //TRANSLIT, the resulting binary would run (because there's no missing symbol) but fail to work right with older libc. This is another reason why, in general, it's best to do runtime checks for things you can't test just by compile/link tests. Another big class to which that applies is checking for behavior that depends on the kernel; someone who compiled a program on a newer kernel might still end up running it on an older one.

@Minoru
Copy link
Collaborator Author

Minoru commented Sep 28, 2016

Okay, you persuaded me. I'm going to rewrite the plan to implement runtime check instead.

@Minoru
Copy link
Collaborator Author

Minoru commented Sep 28, 2016

No, I'd rather keep the original text as context for the discussion. Let's have new plan here instead.

Implementation might proceed as follows:

  • we seem to just append //TRANSLIT to things, so let's define a function const char* f(const char* std) that will do just that;
  • use f everywhere where we mention //TRANSLIT at the moment. Check if anything broke;
  • make f check if our iconv supports //TRANSLIT and only append the string if it does;
  • make result of the check a static variable since it can't change during runtime for sure.

@Minoru Minoru self-assigned this Sep 28, 2016
kobezda added a commit to kobezda/newsbeuter that referenced this issue Sep 29, 2016
Some iconv() implementations don't support transliteration.
When iconv with //TRANSLIT is first attempted and fails,
try again without it, and remember the result.

Fixes akrennmair#364
@Minoru Minoru removed their assignment Sep 29, 2016
Minoru added a commit that referenced this issue Oct 2, 2016
Check for iconv() TRANSLIT support

Fixes #364, closes #174.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants