
Optimize performance of handling huge completion payloads #2190

Merged 9 commits on Feb 9, 2023

Conversation

@rchl (Member) commented Feb 7, 2023

A low-hanging-fruit optimization that allows us to create the CompletionList with a drastically smaller payload.

Instead of serializing complete completion items and attaching those to CompletionItem.args, we only attach the index of the item and look up the item in the global completions list attached to the LspSelectCompletionCommand class (previously it was attached to the LspResolveDocsCommand class but was moved to match the current use).

We already did that for the purpose of accessing the items for the completionItem/resolve request, and there is no reason not to do the same for completion insertion as well.
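The payload saving can be sketched with plain data (the names here are illustrative; the real code keeps the list on the LspSelectCompletionCommand class and passes the args to sublime.CompletionItem.command_completion):

```python
import json

# Illustrative stand-ins for LSP completion items.
items = [{'label': 'item%d' % i, 'documentation': 'x' * 200} for i in range(1000)]

# Old approach: serialize the complete item into each CompletionItem's args.
old_args = [json.dumps({'item': item, 'session_name': 's'}) for item in items]

# New approach: keep the items alive in one shared list (attached to the
# command class in the real code) and pass only an index into it.
completions_store = items
new_args = [json.dumps({'index': i, 'session_name': 's'}) for i in range(len(items))]

old_size = sum(map(len, old_args))
new_size = sum(map(len, new_args))
print('old: %d chars, new: %d chars' % (old_size, new_size))
```

The per-item args shrink from the full item to a couple of dozen bytes, which is what makes the overall completion list payload so much smaller.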

I've created a separate package at https://github.com/sublimelsp/LSP-performance-test which currently includes one performance test, covering the format_completion function. Comparing the main branch and this branch gives:

main: Best of 3: 2.038291124999887 s
this: Best of 3: 1.1438355416667036 s

so the new code takes about 56% of the original time (a ~1.78× speedup). (This is on a 50 MB completions payload.)

This only tests the performance of this single function, and that function is called on the async thread anyway, so its performance is not that crucial. The main bottleneck that causes ST to lag is the act of returning the list to ST and transferring it over process boundaries. I haven't measured that itself, but it should also be drastically improved since the completion list payload is a lot smaller.

Notes:

  • LspSelectCompletionItemCommand was renamed to LspSelectCompletionCommand. The purpose was just to make it shorter, to micro-optimize the payload size; I'm not very attached to that and could revert it. Since it should only be used internally, it shouldn't be an issue.
  • get_insert_replace_support_html was inlined into format_completion to avoid getting textEdit twice (again, a micro-optimization). There was no good reason for it to be separate anyway.

@rchl (Member, Author) commented Feb 7, 2023

Hmm, I'm not sure I should be super optimistic about this change. I've done some higher-level testing now and it might not do that much for the actual problem of ST lagging. It's hard to tell without numbers, but I did a crude test where I started "ActivityIndicator" from sublime_lib and observed how much it would stutter before completions were shown, and the results are pretty similar.

[attachment: Screen.Recording.2023-02-07.at.21.29.35.mov]

I'm sure it's better, but not sure by how much. This is on an M2 Pro, so it's not the best machine for performance testing ;)

@jwortmann (Member) commented:

This is measured in seconds, and not in microseconds, right? Because you've set number=1.

Just for comparison, how many characters are in the JSON payload for the 50 MB case, and how long does it take for the server to respond? I would have thought that the time spent in format_completion calls was not very significant compared with the server response time, and probably even with the JSON parsing time.

Does your video show an actual real-world example of how long it really takes for completions to show up with LSP-typescript? If it's several seconds, that looks completely unusable to me.


Besides that, the PR itself looks good to me on a quick look.

@rchl (Member, Author) commented Feb 7, 2023

This is measured in seconds, and not in microseconds, right?

Yes, seconds.
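For context, a number=1 / repeat=3 timeit harness like the following reports wall-clock seconds per single call, with "Best of 3" being the minimum over the repeats (a sketch, not the actual LSP-performance-test code):

```python
import timeit

def work():
    # Stand-in for a single run of format_completion over a large payload.
    return sum(i * i for i in range(200_000))

# number=1 means each measurement times exactly one call, so the unit is
# seconds per call; repeat=3 gives three such measurements.
times = timeit.repeat(work, number=1, repeat=3)
print('Best of 3: {} s'.format(min(times)))
```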

Just for comparison, how many characters are in the JSON payload, for the 50MB, and how long does it take for the server to respond?

The payload I'm testing on is pretty-printed, so it's bigger due to extra whitespace. The original one is ~36 MB (about 36 million characters):

    :: [00:18:52.984] --> LSP-typescript textDocument/completion (4): {'position': {'line': 0, 'character': 11}, 'textDocument': {'uri': 'file:///usr/local/workspace/github/dynamodb-admin/lib/backend.js'}}
    :: [00:18:55.887] <<< LSP-typescript (4) (duration: 2903ms): <params with 36735081 characters>

I would have thought that the time spent in format_completion calls was not very significant compared with the server response time, and probably even with the JSON parsing time.

Yes, true, but everything adds up. The whole server response (including parsing) in this case takes ~2.9 s, of which JSON parsing takes 0.293 s.

Does your video show an actual real-world example of how long it really takes for completions to show up with LSP-typescript? If it's several seconds, that looks completely unusable to me.

It is very slow in some cases, like the one I've shown where I trigger completions within a JSDoc context. It depends on where completions are triggered. For example, in the same document but outside a JSDoc context, the response is small and quick:

    :: [00:13:04.623] --> LSP-typescript textDocument/completion (8): {'position': {'character': 0, 'line': 1}, 'textDocument': {'uri': 'file:///usr/local/workspace/github/dynamodb-admin/lib/backend.js'}}
    :: [00:13:04.746] <<< LSP-typescript (8) (duration: 122ms): <params with 1065367 characters>

The main offender in many of the projects is the aws package, which includes SDKs for all AWS services and thus exports way too many types. I believe it's possible to exclude specific packages from import suggestions in newer versions of TypeScript, so I should maybe start suggesting that, or even ignore it by default.

@rchl (Member, Author) commented Feb 7, 2023

In import statements the payload is even bigger:

    :: [00:26:12.656] --> LSP-typescript textDocument/completion (47): {'position': {'character': 9, 'line': 31}, 'textDocument': {'uri': 'file:///usr/local/workspace/github/dynamodb-admin/lib/backend.js'}}
    :: [00:26:16.233] <<< LSP-typescript (47) (duration: 3577ms): <params with 56976948 characters>

In this case JSON parsing takes 0.684–0.866 s.

@rchl (Member, Author) commented Feb 7, 2023

Though those slow responses are not the main issue, in my opinion. It's the part where we pass the finished CompletionList to the ST API and ST starts lagging, I guess due to locking inter-process communication. This creates the UI freezes that people are mostly annoyed by. This change should somewhat help with that too.

@predragnikolic (Member) commented:

I will test this tonight.

@predragnikolic (Member) commented:

main takes around 3.26 s (best time),

this branch 1.98 s (best time)

FWIW, commenting out just these 3 lines makes it even faster: 0.78 s

    # if can_resolve_completion_items or item.get('documentation'):
    #     details.append(make_command_link(
    #         'lsp_resolve_docs', "More", {'index': index, 'session_name': session_name}, view_id=view_id))

@predragnikolic (Member) commented Feb 8, 2023

the make_command_link is "expensive"

Here is a cheaper alternative (0.77 s):

    if can_resolve_completion_items or item.get('documentation'):
        details.append(
            "<a href='subl:lsp_resolve_docs {{&quot;index&quot;: {}, &quot;session_name&quot;: &quot;{}&quot;, &quot;view_id&quot;: {}}}'>More</a>"
            .format(index, session_name, view_id))

but we would need to tweak the lsp_resolve_docs command to accept a view_id if we want to keep the old behavior.

@predragnikolic (Member) commented:

I experimented a bit and modified format_completion to be as minimal as it can be:

def format_completion(
    item: CompletionItem, index: int, session_name: str
) -> sublime.CompletionItem:
    # This is a hot function. Don't do heavy computations or IO in this function.
    completion = sublime.CompletionItem.command_completion(
        trigger='a',
        command='l',
        args={"i": index, "s": session_name})
    return completion

and the time was 0.55 s, so I guess I cannot get better than that on my laptop.

@rchl (Member, Author) commented Feb 8, 2023

the make_command_link is "expensive"

Good find. I've looked inside it (sublime.command_url() specifically) and it appears that json.dumps is the main offender there:

def format_command(cmd, args=None):
    if args is None:
        return cmd
    else:
        arg_str = json.dumps(
            args,
            ensure_ascii=False,
            check_circular=False,
            separators=(',', ':')
        )
        return '{} {}'.format(cmd, arg_str)

Probably good to avoid sublime.command_url then.
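A quick way to see the cost, and the workaround being discussed: for a fixed, trusted set of keys, plain str.format produces the same command string without going through json.dumps (a sketch; the argument names are just examples):

```python
import json
import timeit

args = {'index': 123, 'session_name': 'LSP-typescript', 'view_id': 42}

def with_dumps():
    # What sublime's format_command does internally.
    return 'lsp_resolve_docs ' + json.dumps(
        args, ensure_ascii=False, check_circular=False, separators=(',', ':'))

def with_format():
    # Hand-rolled equivalent for this fixed argument shape.
    return 'lsp_resolve_docs {{"index":{},"session_name":"{}","view_id":{}}}'.format(
        args['index'], args['session_name'], args['view_id'])

print(timeit.timeit(with_dumps, number=100_000))
print(timeit.timeit(with_format, number=100_000))  # typically noticeably faster
```

This is only safe because the keys are fixed and the values are strictly controlled (ints and a server name), so there is nothing for json.dumps to escape.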

@rchl (Member, Author) commented Feb 8, 2023

Added an optimized function for creating documentation links. The test time goes down from 1.153 s to 0.698 s here.
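A hand-rolled link builder along these lines avoids both json.dumps and, given strictly controlled inputs, html.escape (a sketch of the approach; the actual function in the PR may differ):

```python
def make_resolve_docs_link(index: int, session_name: str, view_id: int) -> str:
    # The args JSON is built by string formatting with the quotes pre-escaped
    # as &quot; for the HTML attribute. This is only safe because index and
    # view_id are ints and session_name never contains quotes or HTML.
    return (
        "<a href='subl:lsp_resolve_docs {{&quot;index&quot;: {}, "
        "&quot;session_name&quot;: &quot;{}&quot;, &quot;view_id&quot;: {}}}'>"
        "More</a>"
    ).format(index, session_name, view_id)

print(make_resolve_docs_link(3, 'LSP-typescript', 42))
```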

@rchl (Member, Author) commented Feb 8, 2023

And down to 0.435 s after removing html.escape, which we don't really need given the strictly controlled input we have.

@rchl (Member, Author) commented Feb 8, 2023

I found this article about performance of JSON parsing: https://pythonspeed.com/articles/faster-python-json-parsing/

There are libraries that would likely halve the time of the built-in parser, but those require Python 3.7+ and contain native code. So maybe in the future.
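For a rough sense of where the stdlib parser stands, a harness like this times json.loads on a synthetic payload (much smaller than the real 36–57 million character responses; libraries such as orjson expose a compatible loads() that could be swapped in):

```python
import json
import timeit

# Synthetic completion-like payload; the real responses were 36-57 million chars.
payload = json.dumps(
    {'items': [{'label': 'item%d' % i, 'data': i} for i in range(100_000)]})

best = min(timeit.repeat(lambda: json.loads(payload), number=1, repeat=3))
print('json.loads best of 3: {:.3f} s for {} chars'.format(best, len(payload)))
```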

@rchl (Member, Author) commented Feb 8, 2023

The sublime.CompletionItem.command_completion class method was also using sublime.command_url() behind the scenes, so I optimized that too. Now it's down to 0.185 s.
