feat: write word to Program dictionary's stdin in UTF-8 instead of local 8 bit #1743
A one-line change. No impact on Unix; it only affects present-day Windows, and only in rare situations.
The root problem is that Python, since 3.6 (released in 2016), assumes stdin on Windows is UTF-8 [1][2].
It is nearly impossible for a typical user to diagnose this issue, and they are unlikely to find out what their local code page is or how to deal with it.
A programming language has to take extra care to make the needed Windows APIs available, but the major languages simply don't bother, including Python [1], Rust [3], Go, Java 17+ [4], …
At a high level, both Unix's locale-dependent encodings and Windows's code pages are 💩💩💩💩💩 that sane programmers generally avoid. In fact, Windows 11 defaults the code page to UTF-8 [5].
GD's original code assumes that programs on Windows use the Windows code page 💩 to process data, but that is no longer true.
Since Python assumes stdin is UTF-8, I don't see why we shouldn't write to stdin in UTF-8. This eliminates the rare Unicode error on Windows for Python.
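To illustrate the mismatch, here is a minimal sketch, not GD's actual code. It assumes a hypothetical headword and uses cp1252 as a stand-in for "the local 8-bit code page"; the real code page varies per machine. It shows that bytes written in a local code page can fail to decode on the receiving side when that side expects UTF-8, while UTF-8 bytes round-trip.

```python
from typing import Optional

# Hypothetical headword containing a non-ASCII character.
word = "café"

# Assumption: cp1252 stands in for the user's local 8-bit code page.
local_bytes = word.encode("cp1252")
utf8_bytes = word.encode("utf-8")


def decode_as_utf8(data: bytes) -> Optional[str]:
    """Decode the way a UTF-8-expecting program would; None on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


# The two encodings produce different byte sequences for the same word.
print(local_bytes != utf8_bytes)
# UTF-8 bytes decode back to the original word.
print(decode_as_utf8(utf8_bytes) == word)
# Local code page bytes are not valid UTF-8 here, so decoding fails.
print(decode_as_utf8(local_bytes) is None)
```

For pure-ASCII words the two encodings are identical, which is consistent with the error only showing up in rare situations.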
If an encoding error does occur on Windows for a program dictionary, the user can now deterministically and clearly identify the root cause and the direction for fixing it.
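On the script side, one possible direction is to decode stdin explicitly as UTF-8 rather than relying on whatever the interpreter picks. The following is a rough sketch under assumptions: `read_word` is a hypothetical helper, and it assumes the word arrives as the entire stdin content, possibly followed by a newline.

```python
import io


def read_word(stream) -> str:
    """Read one headword from a binary stream, decoding strictly as UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8, which makes
    an encoding mismatch visible instead of silently producing mojibake.
    """
    data = stream.read()
    return data.decode("utf-8").strip()


# Simulate UTF-8 input with an in-memory stream (hypothetical word).
sample = io.BytesIO("naïve\n".encode("utf-8"))
print(read_word(sample) == "naïve")
```

In a real program dictionary script this would be called as `read_word(sys.stdin.buffer)`, which reads the raw bytes and bypasses the text-layer encoding of `sys.stdin`.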
Footnotes
1. https://peps.python.org/pep-0528/
2. https://docs.python.org/3/using/cmdline.html#envvar-PYTHONIOENCODING
3. https://doc.rust-lang.org/std/io/fn.stdin.html#note-windows-portability-considerations
4. https://docs.oracle.com/en/java/javase/21/intl/supported-encodings.html#GUID-A17E6FED-5880-4836-8E62-18007BD58E85
5. https://stackoverflow.com/questions/70201846/windows-11-default-api-and-utf-encoding