-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: Document local-id syntax #206
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍🏻 Left one non-blocking comment, I don't feel strongly about permitting simple-identifier or only numbers. I think an even more important documentation problem is that we need to emphasize that local symbols must be truly local to the document. +1 to updating the parser to ignore locals that don't match the format we specify in the grammar.
@@ -135,6 +135,7 @@ message Document { | |||
// <identifier-character> ::= '_' | '+' | '-' | '$' | ASCII letter or digit | |||
// <escaped-identifier> ::= '`' (<escaped-character>)+ '`' | |||
// <escaped-characters> ::= any UTF-8 character, escape backticks with double backtick. | |||
// <local-id> ::= <simple-identifier> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we permit <simple-identifier>
or only numeric numbers? I haven't used non-numeric IDs for local symbols, are we doing that anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One benefit of simple-identifier
is that the output may be more readable because you can use the name of the variable instead of only a number. So +1 to allowing simple-identifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We aren't doing that anywhere ourselves, but it seems useful enough to allow for debugging in case someone wants to use letters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Permitting more than numbers is useful for local debugging - currently when locals in document change, we renumber every single local occurrence - meaning the diff is huge.
If we did some sort of scope hashing or indexing by name, then it'd be local <scope-id>-<relative-number>
.
Not consequential for anything user-facing, but good for development workflow.
I've updated the parsing logic in Go and Rust to make sure that we only allow simple identifiers. I noticed that the Rust tests weren't running in CI, so I've tweaked the CI script to fix that. We have an unfortunately named API |
@@ -8,12 +8,41 @@ import ( | |||
"github.com/sourcegraph/sourcegraph/lib/errors" | |||
) | |||
|
|||
// IsGlobalSymbol returns true if the symbol is obviously not a local symbol. | |||
// | |||
// CAUTION: Does not perform full validation of the symbol string's contents. | |||
func IsGlobalSymbol(symbol string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's inconsistent that IsGlobalSymbol
does no validation while IsLocalSymbol
does validation. Either both of them validate or neither of them do. I propose neither of them do validation, and we add separate IsWellFormedLocalSymbol
/IsWellFormedGlobalSymbol
functions validate.
Validation can become a performance bottleneck if you run the same validation on every single symbol in a critical path. Applications should be able to have if conditions to switch behavior for global/local symbols without performing validation (because the payload has been validated elsewhere in the pipeline)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the validation from IsLocalSymbol
Right now, the symbol parser doesn't validate that the rest of the suffix is a simple identifier.FixedShould we do that too?
Fixes #205
Test plan
Added new test cases