Avoid indexing the same workspace multiple times #15495
Conversation
///
/// This method avoids re-indexing the same workspace multiple times if multiple files
/// belonging to the same workspace have been changed.
pub(super) fn reload_settings(&mut self, changes: &[FileEvent]) {
That makes sense to me. We do something similar in Red Knot where we process all changes and compute what requires updating and only then do the updating.
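The "process all changes first, then update once" pattern can be sketched as follows. This is a hypothetical illustration, not the actual ruff_server code: `FileEvent`, `workspace_root_for`, and the return type are simplified stand-ins.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for the server's file-event type.
struct FileEvent {
    path: PathBuf,
}

// Stand-in: treat the changed file's parent directory as its workspace root.
fn workspace_root_for(path: &Path) -> Option<PathBuf> {
    path.parent().map(Path::to_path_buf)
}

// Collect the set of affected workspaces first, then re-index each exactly
// once, no matter how many changed files belong to it.
fn reload_settings(changes: &[FileEvent]) -> Vec<PathBuf> {
    let roots: BTreeSet<PathBuf> = changes
        .iter()
        .filter_map(|event| workspace_root_for(&event.path))
        .collect();
    roots.into_iter().collect()
}
```

The key design point is deduplicating the workspace roots before doing any expensive work, so two config-file changes in the same workspace trigger a single re-index.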
Force-pushed from 1d9bf50 to 1ca0631
Unrelated to your changes, but I noticed it in the tracing logs. I think we could improve the logs by making tracing aware that the walk_dir threads run as part of the request (untested):
diff --git a/crates/ruff_server/src/session/index/ruff_settings.rs b/crates/ruff_server/src/session/index/ruff_settings.rs
--- a/crates/ruff_server/src/session/index/ruff_settings.rs (revision f20780584daf0b42a641639f3e8da4d67e005dff)
+++ b/crates/ruff_server/src/session/index/ruff_settings.rs (date 1736946797069)
@@ -211,8 +211,11 @@
let index = std::sync::RwLock::new(index);
let has_error = AtomicBool::new(has_error);
+ let index_span = tracing::debug_span!("walk_dir");
+
walker.run(|| {
Box::new(|result| {
+ let index_span = index_span.clone();
+ let _span = index_span.entered();
let Ok(entry) = result else {
return WalkState::Continue;
};
Thank you, will look at it in a follow-up PR.
This is great. It would be a much simpler fix than making indexing actually lazy!
It's not a complete fix because the client sends multiple change notifications. @MichaReiser I'm interested in hearing your opinion about whether that would be more feasible than lazy indexing. Maybe we could re-use the existing implementation from Red Knot?
I don't see how this can be avoided, at least not if the modifications happen too far apart. Red Knot debounces the events to combine events that happen shortly after each other, but that timeout must be small (<300ms), or users experience a lag when making changes in the editor. The server could implement its own logic to debounce the change events coming from the client, but that increases the risk that we respond to requests with an outdated configuration, and it introduces a delay that will become noticeable to users. Lazy indexing might also not help if the changes happen too far apart, because VS Code might request updated diagnostics, forcing Ruff to search for the configuration. Let's see if your change improves the experience. I think it's fairly likely that it will.
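The debouncing described above can be sketched with a quiet-period batcher. This is an illustrative example, not Red Knot's actual implementation; `next_batch` is a hypothetical helper built on `std::sync::mpsc`.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

// Illustrative debounce sketch (not Red Knot's actual implementation):
// wait for the first event, then keep absorbing events until the channel
// stays quiet for `window`, so a burst of changes is processed as one batch.
// A steady stream of events keeps extending the wait, which is why the
// window must stay small to avoid noticeable editor lag.
fn next_batch<T>(rx: &Receiver<T>, window: Duration) -> Option<Vec<T>> {
    let first = rx.recv().ok()?;
    let mut batch = vec![first];
    loop {
        match rx.recv_timeout(window) {
            Ok(event) => batch.push(event),
            // Quiet period elapsed (or sender gone): flush the batch.
            Err(RecvTimeoutError::Timeout | RecvTimeoutError::Disconnected) => {
                return Some(batch);
            }
        }
    }
}
```

The trade-off mentioned above shows up directly in `window`: a larger value coalesces more events per re-index but delays how quickly the server reacts to a lone change.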
This is an interesting idea. I think the risk is lower because, at least for pull diagnostics, the client only asks for diagnostics after the server asks it to refresh them: ruff/crates/ruff_server/src/server/api/notifications/did_change_watched_files.rs Lines 25 to 47 in b5dbb2a
We could experiment with a small debounce period, after which the server would re-index the project and ask the client to refresh the diagnostics / publish the new diagnostics.
I suspect that VS Code's file watcher already does some debouncing, but maybe not?
I'll need to look, maybe it does. |
Summary
This is not lazy indexing but it should somewhat help with #13686.
Currently, processing the change notifications for config files doesn't account for the fact that multiple config files could belong to the same workspace. This means that the server will re-index the same workspace `n` times, where `n` is the number of file events that belong to the same workspace. This is evident in the following logs:

Trace logs:
Server logs:
This PR updates the logic to consider all the change events at once, keeping track of the workspace paths that have already been indexed.
I want to include this in tomorrow's release to check how this change would affect the linked issue.
Test Plan
Run the same scenario as above and check the logs to see that the server isn't re-indexing the same workspace multiple times:
Trace logs:
Server logs: