Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename scope punctuation.definition.numeric.base → constant.numeric.integer.base #2460

Conversation

BenjaminSchaaf
Copy link
Member

0b and related should not be punctuation. There's currently a workaround in the default colorschemes for this but this is a proper fix.

Copy link
Member

@wbond wbond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this rolling!

There are a number of places where constant.numeric.integer is being used for a number scoped with constant.numeric.float, which is incorrect. However, the more pressing issue to solve is the approach of the scope stacking and specialization. Currently:

  • All of these rules result in stacking: constant.numeric.integer.hexadecimal.cs constant.numeric.integer.base.cs. There aren't nested numbers here, so the scopes shouldn't "stack". Previously the punctuation was stacked on top of a constant.numeric since it was a different scope.
  • If we simply remove the stacking, we will end up with one of two different scope styles, neither of which are particularly useful to color schemes:
    • The base is a specialization of the main scope for the number. For example with the first pattern in C#, the 0x would be constant.numeric.integer.hexadecimal.base.cs and the rest would be constant.numeric.integer.hexadecimal.cs. This allows someone to target hexadecimal integer easily, but not be able to target all bases easily.
    • The base is not a specialization of the main number, but just of constant.numeric or constant.numeric.integer|float. This would result in the 0x getting constant.numeric.integer.base.cs and the rest getting constant.numeric.integer.hexadecimal.cs. This would make it pretty trivial to target all bases (you need two rules, one for integers and one for floats), but you can't target an entire hexadecimal number at all.

Pull request #2229 proposed adding some meta. scopes to the constant.numeric scopes. However, I feel like they would probably be better put on top, resulting in something like constant.numeric.integer.hexadecimal.cs meta.number.base and the rest getting constant.numeric.integer.hexadecimal.cs. This would make it possible to target constant.numeric meta.number.base to target bases and constant.numeric.integer.hexadecimal to target the entire scope. It is sort of unprecedented to stack meta. scopes on top, but I think it provides a way to make it practical for color scheme authors to target what they want.

Part of @deathaxe's proposal from #2229 was:

meta
  number
    sign
    base
    value
      mantissa
      exponent
    type
    unit

I think that this may be a little too complex, partially from the implementation perspective, but also so nuanced that I'm not really sure color scheme authors will really ever use most of them. I do think that meta.number.base should probably replace punctuation.definition.numeric.base and meta.number.type should probably replace storage.type.numeric.

One benefit of the proposed stacking of the two meta. scope on top is that we should be able to implement it pretty trivially with search and replaced throughout the syntaxes and tests. There shouldn't really need to be a shuffling of captures and whatnot.

In addition to feedback from @deathaxe, I'd love to get input from @FichteFoll, @jwortmann and @michaelblyons since they were involved with the discussion in #2135 related to numeric scoping. From a practical perspective, I'd love to try and keep the discussion focused on a mostly correct, but highly practical implementation. I'd prefer to not get too far into the weeds about exact semantics of number components, partially since I don't think that will have much real-world benefit, but also because we need to come up with guidelines that work across all syntaxes.

@BenjaminSchaaf
Copy link
Member Author

@wbond I figured the scope naming was going to be an issue here, so I just did a global replace to get the ball rolling here. Personally I'm in favor of the proposed meta scopes approach.

@deathaxe
Copy link
Collaborator

Adopting the meta.number approach shouldn't be too complex even though a bit more than just find and replace. I'd propably suggest to simplify those a bit. I am with @wbond mantissa and exponent to propably be a bit too much of detail. These are the ones which would make implementation a bit more complicated, from what I recall. I'd replace type and unit by a common suffix to avoid any discussions about what a certain token is. (There were discussions about i imaginary numbers suffix to be types or units.) I'd handle sign optional - if we need it at at all - not sure.

The result would be:

meta
  number
    (sign)
    base
    value
    suffix

I'd propably try to keep the constant.numeric scope right most. Color schemes should easily be able to address meta.number.base constant.numeric.

The basic idea behind meta.number is to apply the same strategy as with meta.string string.quoted and meta.string meta.interpolation to numbers. I hope it will never happen, but in case of variable interpolation the current strategy would allow something like meta.number meta.interpolation variable.other .... That's what I learned from HTML using meta.class-name after string scope - it makes it hard to clear coloring scopes while keeping metas.

@michaelblyons
Copy link
Collaborator

I'm mildly in favor of the meta scopes stacked on top but I'm willing to be convinced otherwise. If the meta scopes are inserted beneath the constant (DeathAxe style), will that mess with a view.find_by_selector('constant.numeric')? i.e. Will it return sets of numbers or sets of pieces of numbers (also same question for view.find_by_selector('constant.numeric, meta.number'))?

In no particular order:

  • I haven't ever found a specific use for the integer/float/etc. suffixes, but I don't object to keeping them. They're basically canon at this point.

  • I'd like to target i/j separately from the storage type (L/etc.).

  • I imagine it'd be nice to highlight the exponent, but I haven't tried it. 1.073861E63 could do with making the 63 stand out, even if it's just de-emphasising the E.

  • Do you want to scope "throwaway" characters? e.g. Python 10_000_000.

@deathaxe
Copy link
Collaborator

deathaxe commented Aug 24, 2020

No, it doesn't mess. constant.numeric always matches the whole number. It doesn't matter whether meta.number is before or after it. Also meta.number should have the same result as constant.numeric.

I'd like to target i/j separately from the storage type (L/etc.).

That's exactly the unit vs. type discussion. It might become tricky and complex for syntaxes which allow mixing imaginary and other type suffixes. IMHO an real and imaginary numbers are different types of numbers, while there are oppinions about those to denote the unit.

I imagine it'd be nice to highlight the exponent

Go used to highlighte the e of the exponent special. I found it useful here and there to quickly realize a number having an exponent at all. It isn't useful in syntaxes with many small tokens though. Things might quickly feel messy then. I'd just dimm it a bit.

Do you want to scope "throwaway" characters? e.g. Python 10_000_000

That's barely possible without pushing into dedicated contexts for each digit as sregex doesn't support repetitions. You never know how many throwaway characters may be present. Thats just too much of everything - implementation and highlighting.

@keith-hall
Copy link
Collaborator

Do you want to scope "throwaway" characters? e.g. Python 10_000_000

That's barely possible without pushing into dedicated contexts for each digit as sregex doesn't support repetitions. You never know how many throwaway characters may be present. Thats just too much of everything - implementation and highlighting.

I wonder if a lookahead match could check it is a valid integer (ie repetition shouldn't be a problem, nothing is being captured/scoped) , and push into a context which would match all digits and give them a constant.numeric scope and match all underscores and give those a different scope. Anything else could pop. The whole context would receive a meta.number scope. This would be similar to how the Markdown syntax identifies and scopes thematic breaks iirc.

@jwortmann
Copy link
Contributor

I would agree to basically everything @wbond wrote above, i.e. to use meta.number scopes for prefix and suffix instead of possibly specialized constant.numeric.integer.hexadecimal.base.cs scopes. For the meta scopes to be on top or before the constant.numeric scope, I believe it wouldn't make a difference regarding functionality, but I can see the point from @wbond now that on top would be easier from an implementation side. I can't imagine that there will ever be such a thing as variable interpolation within numbers (if I understood that hypothetical thought from @deathaxe correctly), so apart from the fact that it is the current norm for meta scopes, I don't see a benefit from it beeing before the constant.numeric scope.


IMHO an real and imaginary numbers are different types of numbers

I don't agree with that, because with "type" I would always mean the data type here, i.e. how the number is stored internally. This is denoted by the storage part in the current scope storage.type.numeric. Imaginary numbers in general can still be stored as integer or floating point (or rational in some languages), what we saw from the Ruby discussion, where the imaginary and type suffixes can be combined as e.g. ir. But I think this right now is the discussion @wbond wants to avoid here, because we should keep it practical both from the implementation and usage side. Therefore I think it is sufficient to combine them into one single meta scope, because I can't imagine that there is much demand from color schemes to distinguish between i and other suffixes. And in particular I would prefer meta.number.suffix as @deathaxe suggested as well, to avoid this discussion whether i/j really should belong to a type scope.


For the exponent or the e in floating point numbers: I personally wouldn't need or use a specific scope for it, but I don't have objections against it if people find it useful. Maybe it might be nice to have a suggested meta.number.value scope for the "main" part of a number just for consistency, i.e. so that the meta.number scope would span the whole number like constant.numeric does. But I'm not sure if it's really needed or useful.


Regarding 10_000_000, I don't see a need to give a special scope to the underscores, and I would imagine it to be pretty confusing if they are highlighted differently (for example you might have to take a closer look if it is really an underscore or a minus sign then). In any case, the underscores certainly should stay to be included into constant.numeric in my opinion, because they are part of the number notation, just like a prefix like 0b is. And since the implementation would apparently be unnecessarily complicated, especially for 3rd-party syntax authors, I don't think it would be a good idea to include a separate scope for this into a guideline.

This was referenced Aug 25, 2020
@wbond
Copy link
Member

wbond commented Aug 26, 2020

There are two PRs now showing the proposed scoping, thanks to @deathaxe at #2462 and #2463

@BenjaminSchaaf
Copy link
Member Author

One thing I don't think has been addressed yet is the punctuation.separator.decimal.cs. Considering it's most often part of the number token in parsers I think it makes sense to use meta.number.separator or similar instead. The current patches by deathaxe make it impossible to target the separator in a number, as they've changed from constant.numeric punctuation.separator to just punctuation.separator.

@jwortmann
Copy link
Contributor

I agree that the scope for the decimal separator dot should contain constant.numeric, because it is part of the number. I expressed my concern about its punctuation scope earlier in #2229, and I'm in favor of dropping punctuation for something like meta.number.separator. Color schemes can still target the separator dot separately then, if they want to highlight it, but it won't stand out by default due to the punctuation scope, as it is the case right now in all of the built-in color schemes but Monokai. If we decide to use meta.number.value for the main part of a number, then the only solution I see for this is to stack meta.number.separator on top of it, i.e. constant.numeric meta.number.value meta.number.separator (or meta.number.value meta.number.separator constant numeric).

@wbond
Copy link
Member

wbond commented Aug 26, 2020

I'm not sure I understand what you are referring to @BenjaminSchaaf and @jwortmann. You can definitely target the decimal point, at least that is what I am seeing in #2462. It is part of meta.number, part of constant.numeric and has punctuation.separator.

Ruby Number Scopes

Generally we don't want people using meta. scopes to highlight specific tokens, but more as a way to increase the specificity of a selector. This is actually a really strong argument for what @deathaxe did in putting the meta. scopes below the "normal" scopes. With the implementation shown above, a color scheme author could decide to not use the normal punctuation.separator by targeting constant.numeric punctuation.separator.

Another observation is that meta. scopes are general meant to be thought of as covering multiple tokens to help understand the context. In this case we are sort of expanding our usage of them to dissect a single token into components, where the average user probably doesn't want special highlighting, but certain power users may.

@jwortmann
Copy link
Contributor

You're right, I just saw the shift of constant.numeric from scope: into the capture group items, and saw some capture groups having only punctuation.separator.decimal. But actually they are always part of another capture group now, therefore including the meta.number.value constant.numeric scope.

However, I'm still unsure about its punctuation scope and personally would prefer another solution to avoid confusion with accessor dots. Another option to what I wrote above, would be to append separator to meta.number.value, resulting in meta.number.value.separator. That would require using 1 additional capture group though.

@wbond
Copy link
Member

wbond commented Aug 26, 2020

However, I'm still unsure about its punctuation scope and personally would prefer another solution to avoid confusion with accessor dots.

Those are punctuation.accessor.

@jwortmann
Copy link
Contributor

Those are punctuation.accessor

... which uses the exact same color in Mariana, Sixteen and Celeste (#2229 (comment)).

@wbond
Copy link
Member

wbond commented Aug 26, 2020

... which uses the exact same color in Mariana, Sixteen and Celeste (#2229 (comment)).

That's fine, use a different color scheme? I think we are getting far afield of the scoping of numbers by talking about color schemes.

@jwortmann
Copy link
Contributor

jwortmann commented Aug 26, 2020

I indeed use my own color scheme, I was just expressing my personal opinion about the visual style from giving the decimal dot a punctuation scope and hence highlighting it with a different color, which is something I have never seen in other editors. Fell free to ignore it, if the majority of users prefer this style.

It's the same as with the current prefix punctuation and suffix storage.type.numeric scopes, if color schemes don't address them, there would be no need for this proposal for a change to meta.numer, right?

@wbond
Copy link
Member

wbond commented Aug 26, 2020

I indeed use my own color scheme, I was just expressing my personal opinion about the visual style from giving the decimal dot a punctuation scope and hence highlighting it with a different color, which is something I have never seen in other editors. Fell free to ignore it, if the majority of users prefer this style.

I was looking at the existing Ruby syntax, and it uses that, so I don't think we are breaking new ground with #2462.

It's the same as with the current prefix punctuation and suffix storage.type.numeric scopes, if color schemes don't address them, there would be no need for this proposal for a change to meta.numer, right?

The punctuation.definition thing was new within the 4xxx builds, so I don't think anyone is using that.

The storage.type.numeric suffix is the only regression I think. And it probably should warrant some more consideration. We may need to keep it for a while for compatibility, sort of like we do with the scopes storage.type.function keyword.declaration.function for def in Python. The storage.type.function shouldn't really be there, but we didn't want to break all existing color schemes. Funny how storage.type is always the issue… 😄

@FichteFoll
Copy link
Collaborator

FichteFoll commented Aug 26, 2020

#2462 is really helpful in this dicussion.

Conceptually I believe scopes should generally modify their latest part for differences in nuances where possible, such as denoting the value or suffix part of a number token. Since the entire token is constant.numeric and that isn't being modified in the proposal, it shouldn't be the last scope. meta scopes are otherwise used to denote token collections and give them context. Thus, to me, it doesn't really matter whether the meta scope comes before or after the constant (i.e. is on top) and I wonder whether color schemes would care after it has been established once. I would then pick the one that is easier to implement on a syntax side then, which would be to put the non-changing scope name below the changing part.

If meta was to come first, I have the following proposal, which moves the notation kind to the meta scopes:

0x123.123j
^^^^^^^^^^ meta.number.hexadecimal constant.numeric
^^ constant.numeric.base
  ^^^^^^^ constant.numeric.value
         ^ constant.numeric.suffix
     ^ punctuation.separator.decimal

That would be a breaking change of course, but it allows the meta scope to be below the constant scope and still only modify the last scope within a number construct.

(I also like simply using "suffix" to catch the discussions of what a suffix would actually be. People may want to be more specific and add another sub-scope, if they really want to.)

@deathaxe
Copy link
Collaborator

Using meta.number.value.separator may have a significant impact on capture groups as we need to split larger meta.number.value capture groups into smaller pieces to avoid stacked meta.number meta.number scopes. I guess it is not far away from what it means to implement meta.number.value.mantissa and ...exponent. Furthermore punctuation feels like a more correct scope to denote a decimal separator.


I personally wouldn't be worried about maintaining storage.type scope for number suffixes too much, because only few syntaxes used to implement them - those which were modified more recently. Most of them just didn't at all - same for punctuation scopes.

Hence I'd just replace it by a general suffix scope. The more important part is to have consistent solution for all syntaxes. It shouldn't be a too big issue to suggest a small change in a user defined override of a certain color scheme.


Besides the fact it is some kind of backward compatibility issue, I honestly thought about @FichteFoll's proposal when I reworked the Ruby and JavaScript syntaxes yesterday. I can absolutely follow his arguments and kind of like it. Why?

From a color scheme perspective the detailed type of a number may not be too important. I guess only few will ever colorize number types differently. Improving readability of hexadecimal floating point numbers like 0xa2_54_23.ff25fdsp-1230D by propably dimming the base, separator (exponent p)? and the type suffix D might be much more practical and easier to address with Fichtes approach.

@jwortmann
Copy link
Contributor

I like the suggestion from @FichteFoll as well, but I guess he meant to include the number kind into the meta part of the scope too, i.e. meta.number.{integer|float|rational|imaginary}.{decimal|binary|octal|hexadecimal}?

I think this approach is easy to implement and useful for color schemes. I could prepare a PR for Matlab as illustration.

@BenjaminSchaaf
Copy link
Member Author

I've put in a PR for C# that takes the same approach.

@FichteFoll
Copy link
Collaborator

FichteFoll commented Aug 27, 2020

I like how these two turned out. The contexts and scope definitions are easy to read and reason about and contain the desired precision while still allowing color schemes to target whatever they want.

@deathaxe
Copy link
Collaborator

Applied it to Ruby as well. With a look at the other PRs it feels like the correct solution.

@jwortmann
Copy link
Contributor

But numerous just address keyword the same way as they do for storage

Yes, the storage.type.function keyword.declaration.function indeed is only useful if a color scheme targets keyword.control, keyword.operator, keyword.other separately, but not keyword or keyword.decalaration.


So, we want to move forward with making scope names better and more consistent for syntax and color scheme authors, while not drastically changing the colors for third-party syntax

Alright, so the summary is that we either need to keep storage.type.numeric as a topmost scope if we want that color not to change, or otherwise only use constant.numeric.suffix if we want a semantically consistent scopes, which still allow to highlight the suffix if the user or color scheme author adds it to the color scheme. Whether this possible change of the default color for the suffix is a drastical change or not, would be on you to decide, I guess. Personally I would have no problem with that change and therefore prefer only constant.numeric.suffix, and this highlighting style to not color the suffix differently is the usual style in other editors as well, I believe.

@wbond
Copy link
Member

wbond commented Sep 16, 2020

Here's the path forward:

  • The next build of ST will "forward fill" the following scope rules if there is a rule for the old scope, but not the new one:
    • storage.type.function -> keyword.declaration.function
    • storage.type.class -> keyword.declaration.class
    • storage.type.numeric -> constant.numeric.suffix
  • Color schemes should scope both the old and new scopes in the list above, for compatibility with old and new syntax
  • The syntaxes in the Default syntaxes should only use the newer scope, and not include the old one. Due to the way that scope scoring works, including both is useless since a rule for bare keyword will override a rule for storage.type.function, and the forward fill will make sure color schemes work with the new scopes.

It looks like we've got PRs open for:

We still need to address:

  • C
  • C++
  • Clojure
  • D
  • Erlang
  • Go
  • Haskell
  • Lua
  • Ocaml
  • Objective-C
  • Perl
  • Python
  • Scala
  • ShellScript
  • YAML

If you are planning to do a PR for one of these, please comment here before starting so we don't duplicate work.

@deathaxe
Copy link
Collaborator

Will update JavaScript to the latest scheme. Already did that in Java. Could create a dedicated PR as there are still some things I want to work on in my rewrite branch. So can't tell when it's done.

May have a look at python and yaml, too.

@wbond
Copy link
Member

wbond commented Sep 16, 2020

It looks like #2308 is probably the right place to address C/C++, right?

@deathaxe
Copy link
Collaborator

We could abuse it for meta.numbers, yes or just merge it as preparation for a rework. A more general question about C/C++ is also whether we want to push into contexts to highlight single invalid characters or whether just find some more permissive/relexed rules.

@wbond
Copy link
Member

wbond commented Sep 16, 2020

We could abuse it for meta.numbers, yes or just merge it as preparation for a rework.

Alright, I'll see if I can provide guidance to merge that, then we'll handle the meta number stuff separately.

@jwortmann
Copy link
Contributor

I can do a PR for Lua.

@deathaxe
Copy link
Collaborator

YAML may take a little longer. May need to split the already huge pattern a bit for better readability.

@wbond
Copy link
Member

wbond commented Sep 17, 2020

I'll most likely be working on these today. If anyone is keen to do one, give a shout. I'll update the list as I go.

@deathaxe
Copy link
Collaborator

Continuing with Perl

@BenjaminSchaaf
Copy link
Member Author

Working on D

@BenjaminSchaaf
Copy link
Member Author

Working on C/C++/Objective-C

@BenjaminSchaaf
Copy link
Member Author

Working on Go

@BenjaminSchaaf
Copy link
Member Author

Working on Haskell

@wbond
Copy link
Member

wbond commented Sep 18, 2020

I'm going to look at OCaml, Clojure and Scala now

@wbond
Copy link
Member

wbond commented Sep 18, 2020

Scratch that, I ended up spending my time on the TypeScript -> TSX transition instead. Anyone else is free to work on Clojure, OCaml and Scala.

@BenjaminSchaaf
Copy link
Member Author

Working on Clojure

@BenjaminSchaaf
Copy link
Member Author

Working on OCaml

@BenjaminSchaaf
Copy link
Member Author

Working on Scala

@wbond
Copy link
Member

wbond commented Sep 21, 2020

All of the syntaxes that were part of this PR have now been updated with the meta.number scopes. Additionally a whole bunch of other syntaxes have been updated to the new guidelines.

Thanks for all of the work by everyone on this. It was one of the most significant remaining steps in getting the default syntaxes ready for ST4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants