Motivation
First step in #2461
This PR adds support for semantic token deltas.
Here's how semantic highlighting works before and after this change.
Before
On every keypress, the editor sends a textDocument/semanticTokens/full request and the server returns all tokens for the entire file. In large files, the editor will sometimes send textDocument/semanticTokens/range to compute tokens for a limited range instead of doing it for the entire file.
After
When the file is first opened, the editor sends a textDocument/semanticTokens/full request to compute all tokens for that file. From that point on, the editor only sends textDocument/semanticTokens/full/delta, to receive only the difference in tokens between the previous response and the current state of the file.
Why does this matter
Computing semantic tokens on the server is cheap. In some of my benchmarks, the Ruby LSP can compute ~20k tokens in roughly 10ms. However, applying these tokens in the editor is very expensive. The fewer tokens we return, the faster the editor becomes.
In fact, if we return ~20k tokens on every keypress, the editor becomes backlogged with work and the lag prevents users from working normally.
Does this PR completely fix the problem
No. It mitigates the problem, but I will follow this PR up with an audit of our semantic token implementations. We currently return tokens for pretty much everything, but we need to be a lot more conservative and only return tokens for constructs that are genuinely ambiguous, to avoid performance issues.
Implementation
There are a few important parts to this implementation. First, for semantic token requests to work properly with deltas, we need to keep track of the semantic token result IDs, and we need a separate cache for those tokens that cannot be cleared upon typing. The delta request needs to remember the previous result, so clearing the cache upon typing would make it impossible to implement.
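To illustrate the idea, here is a minimal sketch of such a cache. The class and method names are hypothetical, not the actual Ruby LSP internals; the point is that each stored result gets a monotonically increasing result ID, and the previous tokens stay available per URI so a delta can be computed against them.

```ruby
# Hypothetical sketch of a semantic tokens cache that survives edits.
# Names are illustrative and not taken from the real implementation.
class SemanticTokensCache
  def initialize
    @result_id = 0
    @tokens_by_uri = {} # previous token data per document URI
  end

  # Store freshly computed tokens and hand out a new result ID.
  def store(uri, tokens)
    @result_id += 1
    @tokens_by_uri[uri] = { result_id: @result_id.to_s, tokens: tokens }
  end

  # Fetch the previous result so a delta can be computed against it.
  # Unlike other caches, this one is NOT cleared when the user types.
  def previous(uri)
    @tokens_by_uri[uri]
  end
end
```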
Secondly, running the full semantic tokens request together with the other listeners makes it incredibly difficult to ensure correct result IDs. Accidentally bumping the ID makes the editor miss the delta, and we lose the benefits. Supporting deltas is clearly more impactful from a performance standpoint than running semantic highlighting alongside the other listeners.
Having it separate will also play well with the next step to audit our semantic tokens, because we can convert the implementation to a visitor and have much better control over which tokens are returned.
Finally, the last part is computing the delta itself. The algorithm trims the tokens that match at the beginning and at the end of the previous and current token arrays. Whatever is left will be the single edit necessary to turn the previous tokens into the current ones.
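The trimming described above can be sketched as follows. This is a simplified standalone version, not the actual PR code: `previous` and `current` are the flat integer arrays from the LSP semantic tokens encoding, and the returned hash mirrors the spec's single SemanticTokensEdit (start, delete count, replacement data).

```ruby
# Compute the single edit that turns the previous token array into the
# current one, by trimming the common prefix and common suffix.
def compute_delta(previous, current)
  # Advance past every leading token value that is unchanged
  prefix = 0
  prefix += 1 while prefix < previous.length && prefix < current.length &&
                    previous[prefix] == current[prefix]

  # Walk back past every trailing token value that is unchanged,
  # without overlapping the prefix
  suffix = 0
  suffix += 1 while suffix < previous.length - prefix &&
                    suffix < current.length - prefix &&
                    previous[previous.length - 1 - suffix] ==
                    current[current.length - 1 - suffix]

  # The middle region is the single edit: delete what remains of the
  # previous array and insert what remains of the current one
  {
    start: prefix,
    delete_count: previous.length - prefix - suffix,
    data: current[prefix...current.length - suffix],
  }
end
```

With this shape, identical arrays produce an empty edit (`delete_count: 0`, `data: []`), so an unchanged file costs the editor almost nothing to apply.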
Automated Tests
Added a bunch of tests.