-
Notifications
You must be signed in to change notification settings - Fork 320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LSP SemanticTokens offset incorrect in some cases #2715
Comments
Xtext offsets are for the complete document and this include whitespace so the question is if they are correctly translated to line and column for lsp a sample grammar and unit test might be helpful ( for the general offset of Xtext INode as well for the lsp Problem ) see e.g. org.eclipse.xtext.nodemodel.impl.AbstractNode.getTextRegion() |
I try to take a look into this in the next days. |
I think adjusting the way of calculating offsets is very complicated, as it is used widely in xtext and lays deeply in the core. Therefore, I am currently taking a look at the PositionReader that is returning an LSP4j Position based on an InputStream and an offset. I try to adjust it, so it ignores newlines. |
yes i assume the implementor of PositionReader simply overlooked this issue |
Update: The Problem lays in the underlying LineNumberReader which is used in the PositionReader. The LineNumberReader skips \r characters and does not count them as a character when reading. When the offset is calculated, it includes \r characters. Therefore, we get a wrong position, as these characters are skipped when resolving the position to this offset. As the LineNumberReader is part of the java.io package, I try to work around this behavior in the PositionReader by detecting if carriage return characters are used and then subtracting the line number from the calculated column. Still not 100% happy with my solution, so will think about a bit more. |
Can you provide an isolated test case that illustrates what you are observing?
? |
I created this branch in my fork: https://github.com/jnt0r/xtext/tree/positionreader-demo-test with this test main...jnt0r:xtext:positionreader-demo-test I'm not sure if we can assume that all line breaks contain carriage returns, or if there can be a mix of single \n and \r\n. |
I just created a PR with my current solution. |
I closed my PR in favor of #2735 |
The previous implementation of PositionReader extended LineNumberReader. Unfortunately, the implementation of PositionReader.read() compresses line endings. In other words, both "\n" and "\r\n" contribute one character to the document offset count. This behavior is undesired, and can lead to the production of incorrect line and column offsets. This commit changes PositionReader to extend BufferedReader instead. Now, calling read() always reads exactly one character. We also implement logic specific to PositionReader to correctly handle "\r\n", "\n" and "\r" line endings. The PositionReader class now also keeps track of both the line and column values internally. The mark() and reset() functions have been updated accordingly. Signed-off-by: Joao Ferreira <Joao.Ferreira@avaloq.com>
Fixed by #2735 |
The offset in SemanticTokens response is slightly off in some cases. I debugged this and found out, that the wrong offset is calculated deeper in xtext. The TextRegion is already filled with the wrong offset, which is then used to calculate the line and character in the SemanticTokensService. I don't know and yet didn't find where the TextRegion is created and filled with the offset.
It looks like line ending characters (\n) are counted in the offset as well:
The text was updated successfully, but these errors were encountered: