Wrong LaTeX-Unicode mapping of \varepsilon #14751
Comments
As you can see in /base/latex_symbols.jl this mapping is autogenerated from https://www.w3.org/Math/characters/unicode.xml, and as far as I can tell the script faithfully copies the mapping from the source. There might be a bug in the w3 mappings though. cc: @stevengj |
I think this may be because This bug affects JunoLab/atom-latex-completions#3 |
Seems reasonable to add an exception here and update the table. For Also affects ipython/ipython#6380, as well as other Julia editor plug-ins. |
Our choice is not so much wrong as a reflection of historical inconsistencies over the proper mapping of epsilon. In this case it looks like we just happen to pick MathML's mappings, which are at variance with other standard definitions for STIX and XML/MathML2. Digging further:
|
I definitely agree with other people on making an exception here for a few reasons. If we compile a LaTeX document containing Personally, I am inputting Greek letters in Julia directly using the Greek keyboard layout instead of using the LaTeX command because that is faster. The "e" key in the Greek keyboard also generates This inconsistency could be a source of errors that are hard to track down, as I just experienced. I was extending someone else's code who used |
This problem will still happen even if we change the mapping (though it will be less frequent). This is the same situation as with mu vs. micro. See #5903. |
+1 |
@Godisemo: it's unclear what you're +1 ing here. |
@StefanKarpinski Yeah, I realised that now when you pointed it out. I'm +1 ing the fact that this really is an issue and that I support the proposition to change the \varepsilon expansion from ɛ (https://en.wikipedia.org/wiki/Open-mid_front_unrounded_vowel) to ε (https://en.wikipedia.org/wiki/Epsilon). |
The only complication I see is that we have to make the same change in every editor plugin that people use. Personally, I only use vim and atom, so I don't know what plugins are available for other editors. |
I think the best solution would be to first implement a custom normalization so that ɛ (U+025B latin small letter open e) and ε (U+03B5 greek small letter epsilon) are treated as equivalent in identifiers. Once that is done, we can gradually migrate editor plugins without breaking code. See JuliaStrings/utf8proc#11 (My main concern is that this opens a can of worms, since there are potentially a lot of custom normalizations we might want.) |
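To make the proposal above concrete, here is a minimal sketch of a custom normalization step, written in Python purely for illustration. It is not the utf8proc or Julia implementation. The table `CUSTOM_FOLDS` and the function `normalize_identifier` are hypothetical names, and the table holds only the one pair discussed in this issue.

```python
import unicodedata

# Hypothetical fold table: code points treated as equivalent in identifiers.
# Only the pair discussed in this issue is listed; further entries would be
# added case by case.
CUSTOM_FOLDS = {
    0x025B: 0x03B5,  # U+025B latin small letter open e -> U+03B5 greek small letter epsilon
}


def normalize_identifier(name: str) -> str:
    """Apply standard NFC normalization, then the custom folds."""
    return unicodedata.normalize("NFC", name).translate(CUSTOM_FOLDS)


# Both spellings end up as the same identifier after normalization.
print(normalize_identifier("\u025b_tol") == normalize_identifier("\u03b5_tol"))
```

With a table like this, code written with either character keeps working while editor plugins migrate to the new mapping.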
I think if we're conservative and take the custom normalizations on a case-by-case basis, it should be ok. The only major danger of each normalization is that someone might be using both letters in a pair that we start to normalize in otherwise indistinguishable ways, breaking code. However, any code that does that is either accidentally broken and would be fixed by the normalization or intentionally obfuscated, which I don't think is a major concern. So the criterion for custom normalization should be at least: would it be crazy to use these two characters in otherwise indistinguishable ways. |
Maybe we could issue a warning if both versions are detected in the same code? |
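The warning idea could be sketched as follows, again in Python and only as an illustration. `CONFUSABLE_PAIRS` and `find_mixed_confusables` are hypothetical names, and the check is a plain substring scan rather than a real tokenizer, so it would also flag characters inside strings and comments.

```python
# Hypothetical list of visually confusable pairs to check for.
CONFUSABLE_PAIRS = [
    ("\u025b", "\u03b5"),  # latin small letter open e vs. greek small letter epsilon
]


def find_mixed_confusables(source: str):
    """Return every pair for which both characters occur in the source text."""
    return [
        (a, b)
        for a, b in CONFUSABLE_PAIRS
        if a in source and b in source
    ]


# One variable defined with U+025B, then used with U+03B5.
mixed_source = "\u025b = 1e-3\nprintln(\u03b5)\n"
for a, b in find_mixed_confusables(mixed_source):
    print(f"warning: source mixes U+{ord(a):04X} and U+{ord(b):04X}")
```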
If we go the normalization route, I think we would just have a list of codepoints that we treat as (permanently) equivalent, with no warning. i.e. different ways of inputting "ε" should all be equally valid. |
I don't think treating them the same is a good long-term plan. What if we start doing this for characters that look the same but are totally different, for example Α (capital alpha) and A? What if they look different in other fonts? I just think we should use the correct characters for the specific LaTeX expansions. |
Normalization of confusable characters is pretty well established in Unicode. Python 3 does much more aggressive (NFKC) normalization than us, for example. |
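For reference, the effect of NFKC can be checked with Python's standard `unicodedata` module. The sketch below relies on the Unicode data shipped with Python; the helper `nfkc` is just a local convenience name. It shows that NFKC folds the micro sign into Greek mu, but leaves U+025B unchanged because that code point has no decomposition. NFKC alone would therefore not merge ɛ and ε.

```python
import unicodedata


def nfkc(s: str) -> str:
    """Local convenience wrapper around the standard NFKC normalization."""
    return unicodedata.normalize("NFKC", s)


micro, mu = "\u00b5", "\u03bc"          # micro sign, greek small letter mu
open_e, epsilon = "\u025b", "\u03b5"    # latin open e, greek small letter epsilon

micro_folds_to_mu = nfkc(micro) == mu
open_e_folds_to_epsilon = nfkc(open_e) == epsilon

print("micro -> mu under NFKC:", micro_folds_to_mu)
print("open e -> epsilon under NFKC:", open_e_folds_to_epsilon)
```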
It gets a bit funny, though, when you do a search and/or replace in your source file, since no editor I know of treats visually similar characters as equal. Epsilon and varepsilon, though, are treated as equal since they are the same character. |
We already do NFC normalization, so that bridge has probably already been crossed. And, as I said, Python 3 already does NFKC normalization and I don't see people complaining. |
Yeah, maybe you are right. It would definitely be convenient to treat the visually ambiguous characters as the same. If normalization is a thing then maybe the editors should change instead. |
At some point after the 0.6 release, we should push this change to the various editor plugins. |
\varepsilon is currently mapped to ɛ (U+025B latin small letter open e). This is wrong. The correct mapping is ε (U+03B5 greek small letter epsilon).
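The two characters are easy to confuse on screen, so a quick way to tell them apart is to print their code points and Unicode names. This is a small Python sketch using the standard `unicodedata` module; `describe` is a hypothetical helper name.

```python
import unicodedata

open_e = "\u025b"   # the character \varepsilon currently expands to
epsilon = "\u03b5"  # the character it should expand to


def describe(ch: str) -> str:
    """Format a single character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(open_e))
print(describe(epsilon))
print("same character:", open_e == epsilon)
```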