Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up the calling process query #869

Merged
merged 5 commits into from
Jan 3, 2022
Merged

Conversation

th1000s
Copy link
Collaborator

@th1000s th1000s commented Dec 22, 2021

Always determining the parent process is faster now that this information is needed most of the time anyhow.

Also fixes the Linux process query slowdown for good.

src/utils/process.rs Outdated Show resolved Hide resolved
src/utils/process.rs Outdated Show resolved Hide resolved
static ref CACHED_CALLING_PROCESS: Option<CallingProcess> = determine_calling_process();
pub fn start_determining_calling_process_in_thread() {
// Do not keep the join handle so the main thread can exit early if
// it does not need to know its parent process.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is referring to the fact that there is no return value of this function (and that we use the spawn method on the builder as opposed to the thread::spawn function), right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The builder is used to set the thread name - in case someone without access to the sources wonders why delta looks at all the processes. It could also return a JoinHandle, but this is ignored. I updated the comments to reflect this.

// /proc file descriptors (fds) open. This caching is not needed here, so
// set this to zero (this does nothing on other platforms).
// Also, there is currently a kernel bug which slows down syscalls when threads are
// involved (here: the ctrlc handler) and a lot of files are kept open.
Copy link
Owner

@dandavison dandavison Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of interest, have you identified this bug in the linux kernel issue tracker?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is what allows us to fearlessly query many processes on linux again? Very interesting to see that you've got to the bottom of this problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the Linux bug tracker suggests, for now I have reported it to the Debian kernel team, see #1001818 for a minimal C and C++ example.

And yes, on Linux this makes process queries as fast as e.g. ps auxf.

let (caller_mutex, determine_done) = &**CALLER;
determine_done
.wait_while(caller_mutex.lock().unwrap(), |pending| {
*pending == CallingProcess::Pending
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful. And very useful to me to have an example of how to do this.

@th1000s
Copy link
Collaborator Author

th1000s commented Dec 23, 2021

panic at 'attempt to subtract with overflow', [..] sysinfo-0.22.3/src/lib.rs:132:28 -- (let diff = max - **x;)

Looks like on 32bit sysinfo does not like (concurrent?) set_open_files_limit(0), @GuillaumeGomez?

Also, #[cfg(test)] pub fn calling_process() -> Box<CallingProcess> is currently called 4412 times during cargo test.

@GuillaumeGomez
Copy link
Contributor

Also, #[cfg(test)] pub fn calling_process() -> Box<CallingProcess> is currently called 4412 times during cargo test.

You mean in sysinfo? As far as I can see, there is no calling_process function.

@GuillaumeGomez
Copy link
Contributor

Forgot to mention but I opened GuillaumeGomez/sysinfo#650. Please tell me if it fixes your issue.

@dandavison
Copy link
Owner

@GuillaumeGomez thanks for helping with this. I've created a version of this PR with your sysinfo fix: #872

It appears that it is still failing, due to the subtraction on the line before the line that you fixed:

thread 'tests::test_example_diffs::tests::test_single_character_line_is_not_whitespace_error'
panicked at 'attempt to subtract with overflow',
/cargo/git/checkouts/sysinfo-bc2293244598544b/62374ac/src/lib.rs:132:28

https://github.com/GuillaumeGomez/sysinfo/blob/62374ac97393dec978d44a888f6b0907b211938d/src/lib.rs#L132-L133

@GuillaumeGomez
Copy link
Contributor

I updated the PR, both subtractions are checked now.

@dandavison
Copy link
Owner

Great, looks like it works now. I'll leave it up to @th1000s to finalize things here.

@GuillaumeGomez
Copy link
Contributor

Ok, I'll release a new sysinfo version then.

This query only happens once, so caching is not needed

Also update sysinfo version to fix a crash related to this.
Start the query even before determining if information about the
parent process is required (which are most invocations anyhow).
Ensures that no unrelated process is found when selectively
refreshing a pid range.
@th1000s
Copy link
Collaborator Author

th1000s commented Jan 3, 2022

Thanks, version 0.22.4 fixes the crash.

And the calling_process() function is from delta, this patchset also reduces the number of times it is called during testing.

@dandavison dandavison merged commit 59604a3 into dandavison:master Jan 3, 2022
@dandavison
Copy link
Owner

Thanks @th1000s and @GuillaumeGomez!

@dandavison
Copy link
Owner

The docker script provided by @infokiller in #973 seems to show that having if there are many processes then delta is still significantly slowed down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants