Description
As per the quote in #400:
A position inside a document (see Position definition below) is expressed as a zero-based line and character offset. The offsets are based on a UTF-16 string representation. So a string of the form a𐐀b the character offset of the character a is 0, the character offset of 𐐀 is 1 and the character offset of b is 3 since 𐐀 is represented using two code units in UTF-16.
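To make the quoted rule concrete, here is a small Python sketch that computes the UTF-16 offset of each character in a string. It is only an illustration; the function name `utf16_offsets` is made up for this example and is not part of any LSP library.

```python
def utf16_offsets(text: str) -> list[int]:
    """Return the UTF-16 code-unit offset at which each code point starts."""
    offsets = []
    unit = 0
    for ch in text:
        offsets.append(unit)
        # Code points above U+FFFF are encoded as a surrogate pair (two units).
        unit += 2 if ord(ch) > 0xFFFF else 1
    return offsets


# "\U00010400" is the character 𐐀 used in the spec's example.
print(utf16_offsets("a\U00010400b"))  # [0, 1, 3]
```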
The character offset should be in terms of UTF-16 code units. As far as I can tell, LanguageServer.jl only uses UTF-8 internally and works in terms of characters (code points) rather than code units. eglot works around this, but not all editors might. I have no idea how VSCode behaves.
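A server that indexes by code point would need to translate the incoming offset first. The following Python sketch shows one way that translation could look. It is not LanguageServer.jl code, and the name `utf16_to_codepoint_index` and the choice to clamp out-of-range offsets to the end of the line are assumptions made for illustration.

```python
def utf16_to_codepoint_index(line: str, utf16_col: int) -> int:
    """Map a UTF-16 code-unit offset to the index of the code point containing it.

    Offsets past the end of the line are clamped to len(line) (an assumption).
    """
    units = 0
    for index, ch in enumerate(line):
        width = 2 if ord(ch) > 0xFFFF else 1
        if utf16_col < units + width:
            return index
        units += width
    return len(line)


text = "a\U00010400b"  # a𐐀b
print(utf16_to_codepoint_index(text, 3))  # 2, the code point index of "b"
print(utf16_to_codepoint_index(text, 2))  # 1, the second half of 𐐀 maps back to 𐐀
```

An offset that falls in the middle of a surrogate pair is mapped to the character that pair encodes, which seems the least surprising behavior for a hover request.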
For example, given the file:
𐐀𐐀𐐀="hello"
𐐀𐐀𐐀𐐀=𐐀𐐀𐐀
Asking for line 1, character 6 should show the hover for 𐐀𐐀𐐀𐐀, since the 7th UTF-16 code unit is still within that variable. Instead, it shows the hover for 𐐀𐐀𐐀:
client-request (id:105) Wed Oct 16 16:29:55 2019:
(:jsonrpc "2.0" :id 105 :method "textDocument/hover" :params
(:textDocument
(:uri "file:///home/adam/tmp/test.jl")
:position
(:line 1 :character 6)))
server-reply (id:105) Wed Oct 16 16:29:55 2019:
(:id 105 :jsonrpc "2.0" :result
(:contents
[(:language "julia" :value "𐐀𐐀𐐀 = \"hello\"")]))
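The mismatch in the log can be reproduced outside the server. This Python sketch interprets `character 6` on line 1 both ways and reports which name each interpretation selects. The helper `identifier_at` and the lookup table `unit_to_index` are hypothetical names, and splitting on `=` is a stand-in for real parsing that only works for this sample line.

```python
DESERET = "\U00010400"  # the character 𐐀, which needs two UTF-16 code units
line = DESERET * 4 + "=" + DESERET * 3  # line 1 of the example file
character = 6  # the offset sent in the hover request


def identifier_at(text, index):
    """Return the '='-separated name covering code point `index`, or None."""
    start = 0
    for name in text.split("="):
        if start <= index < start + len(name):
            return name
        start += len(name) + 1  # skip the name and the '=' after it
    return None


# One entry per UTF-16 code unit, holding the index of the code point it belongs to.
unit_to_index = [
    i for i, ch in enumerate(line)
    for _ in range(len(ch.encode("utf-16-le")) // 2)
]

as_codepoint = identifier_at(line, character)             # what the reply shows
as_utf16 = identifier_at(line, unit_to_index[character])  # what the spec asks for

print(len(as_codepoint), len(as_utf16))  # 3 4
```

Read as a code point index, offset 6 lands in the three-character name on the right of the `=`. Read as a UTF-16 offset, it lands in the four-character name on the left.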
There's some discussion about the awkwardness of using UTF-16 code units at microsoft/language-server-protocol#376 and a survey of other implementations at https://github.com/Avi-D-coder/lsp-range-unit-survey.