-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Default to a UTF-8 locale in Java stub template #15159
Conversation
@comius I opted to go with the more hermetic variant of always setting |
The test failures indicate that there are all sorts of compatibility issues with this approach:
|
87c8e06
to
848a6d8
Compare
Given the large number of test failures and the fact that the test encyclopedia explicitly documents that all locale variables are unset, I gave up on the original approach of setting |
d84452d
to
d1d1725
Compare
cc @cushon If something along this way was possible for java_test rule, I would prefer this instead of making the stub template more complex. Testing such change internally would probably be more involved, but I don't mind doing it if it's the right thing. It seems to me the right thing to do would be to have it on all test rules, because I don't see Java anything special in the bug report (test in other languages could use UTF-8 filenames - and this is also another reason why not to put it into Java stub), however this would be more involved and would probably require migration by parts.
Does setting LANG instead of LC_TYPE help here?
Is there a way to fix this? How does the output look like? Should just the test be fixed? |
It seems that @comius Assuming this would require a migration anyway, would you prefer to set this on all actions or just on all test actions? |
d1d1725
to
cdedef8
Compare
All test actions is where such filenames may occur. All actions is too wide. This might conflict some other requirements of specific actions, so let's avoid it. |
@comius This seems pretty subtle based on the CI results: The Ubuntu 18.04 runner doesn't have the |
Hello @fmeum, Thank you for sharing your contributions in PR. Can you please fix the above buildkite presubmit build failures ? |
@comius Given the problem that neither |
Could we use platforms for this, to differentiate between those target environments and then statically pick the locale to use here? Also, for my understanding, is this going to make the default behaviour the JDK will have after JEP 400: UTF-8 by Default? |
That may be possible. It isn't going to be easy though if the available locales already differ between major distributions - should those be reflected in platform constraints? What I'm thinking: Maybe a prominent hint in the docs instructing users to set either
I wasn't aware of this effort, but based on a quick glance I would say yes. It even mentions |
My sense is that it would be nice if platforms could easily express this, but that it probably isn't a straightforward approach. I don't have as crisp an answer to 'what belongs in platform constraints' as I'd like. Maybe someone else reading this does?
What does the change to do this in the stub template look like? All else being equal it'd be nice to keep more logic out of the stub template, but there's precedent for putting work-arounds in the stub template (e.g. |
It would essentially amount to querying |
Thanks, that sounds reasonable to me. |
cdedef8
to
355b69d
Compare
On non-macOS Unix, without any locale variable set, the JVM defaults to using ASCII rather than UTF-8 as the encoding for file system paths (i.e., the value of the `sun.jnu.encoding` property).
355b69d
to
6cbd964
Compare
@cushon I implemented this approach. |
This makes sense to me, although I also want @comius's input. What do you think about also putting this change behind an |
I will wait for @comius input on the general approach, but agree that this should be moved behind a flag. |
This implementation looks like the most reasonable default behaviour.
I'd avoid additional flag if possible, because we already have many and I wouldn't expect a significant breakages to happen. In this case Bazel LTS policy doesn't require us to do the flag dance - introducing it for one release, flipping it and removing it. I'd also assume the stub template is something, that is probably already overriden/replaced by large users (like Google does) |
@bazel-io flag |
On non-macOS Unix, without any locale variable set, the OpenJDK defaults
to using ASCII rather than UTF-8 as the encoding for file system paths
(i.e., the value of the
sun.jnu.encoding
property).Fixes #15106