Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Document local-id syntax #206

Merged
merged 3 commits into from
Oct 23, 2023
Merged

docs: Document local-id syntax #206

merged 3 commits into from
Oct 23, 2023

Conversation

varungandhi-src
Copy link
Contributor

@varungandhi-src varungandhi-src commented Oct 23, 2023

Right now, the symbol parser doesn't validate that the rest of the suffix is a simple identifier.
Should we do that too?
Fixed

Fixes #205

Test plan

Added new test cases

Copy link
Member

@olafurpg olafurpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏻 Left one non-blocking comment, I don't feel strongly about permitting simple-identifier or only numbers. I think an even more important documentation problem is that we need to emphasize that local symbols must be truly local to the document. +1 to updating the parser to ignore locals that don't match the format we specify in the grammar.

@@ -135,6 +135,7 @@ message Document {
// <identifier-character> ::= '_' | '+' | '-' | '$' | ASCII letter or digit
// <escaped-identifier> ::= '`' (<escaped-character>)+ '`'
// <escaped-characters> ::= any UTF-8 character, escape backticks with double backtick.
// <local-id> ::= <simple-identifier>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we permit <simple-identifier> or only numeric numbers? I haven't used non-numeric IDs for local symbols, are we doing that anywhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One benefit of simple-identifier is that the output may be more readable because you can use the name of the variable instead of only a number. So +1 to allowing simple-identifier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't doing that anywhere ourselves, but it seems useful enough to allow for debugging in case someone wants to use letters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed 👍🏻

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Permitting more than numbers is useful for local debugging - currently when locals in document change, we renumber every single local occurrence - meaning the diff is huge.

If we did some sort of scope hashing or indexing by name, then it'd be local <scope-id>-<relative-number>.

Not consequential for anything user-facing, but good for development workflow.

@varungandhi-src
Copy link
Contributor Author

varungandhi-src commented Oct 23, 2023

I've updated the parsing logic in Go and Rust to make sure that we only allow simple identifiers.

I noticed that the Rust tests weren't running in CI, so I've tweaked the CI script to fix that.

We have an unfortunately named API IsGlobalSymbol which technically doesn't do full validation; I've kept the behavior of it the same as before for backcompat.

@@ -8,12 +8,41 @@ import (
"github.com/sourcegraph/sourcegraph/lib/errors"
)

// IsGlobalSymbol returns true if the symbol is obviously not a local symbol.
//
// CAUTION: Does not perform full validation of the symbol string's contents.
func IsGlobalSymbol(symbol string) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's inconsistent that IsGlobalSymbol does no validation while IsLocalSymbol does validation. Either both of them validate or neither of them do. I propose neither of them do validation, and we add separate IsWellFormedLocalSymbol/IsWellFormedGlobalSymbol functions validate.

Validation can become a performance bottleneck if you run the same validation on every single symbol in a critical path. Applications should be able to have if conditions to switch behavior for global/local symbols without performing validation (because the payload has been validated elsewhere in the pipeline)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the validation from IsLocalSymbol

@varungandhi-src varungandhi-src merged commit 2260bb1 into main Oct 23, 2023
7 checks passed
@varungandhi-src varungandhi-src deleted the vg/local-id branch October 23, 2023 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

local-id is not defined in the symbol syntax
3 participants