--follow-exec causes difference in execution #966

boustrophedon · 2022-02-28T01:00:36Z

Describe the bug
In some cases, using --follow-exec appears to prevent test code from running.

To Reproduce
https://github.com/boustrophedon/tarpaulin_missing_coverage/commits/master

The original problem I'm trying to solve is the following:
I have tests in tests/ and examples in examples/ and I want to get combined coverage for all of them. If I just run cargo tarpaulin --tests --examples, the examples don't execute because the test runner is used to look for tests rather than execute the examples (related?). I've worked around this by adding a small test in each example that just calls main.
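The workaround described above can be sketched roughly as follows. The file name and the greeting function are hypothetical placeholders; only the idea of a test named call_main that calls main comes from this report.

```rust
// examples/hello.rs (hypothetical file name)

/// The example's real work, kept in a function so it can be reached both by
/// `cargo run --example hello` and by the test below.
fn greeting() -> String {
    String::from("hello from the example")
}

fn main() {
    println!("{}", greeting());
}

#[cfg(test)]
mod tests {
    // When the example is built with the test harness, only #[test]
    // functions run, so this test simply calls the example's main.
    #[test]
    fn call_main() {
        super::main();
    }
}
```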

However, one of my example programs calls itself (which, as above, is actually the test runner process) as a subprocess and although it appears the processes are executing (e.g. if I intentionally throw a panic in one, the panic shows up), tarpaulin isn't catching that.

While investigating this, I've found a minimal example showing that simply toggling the --follow-exec flag changes whether the subprocess code gets called.

Expected behavior
The subprocess code should run, the file should be written to, and the CI step "Check file was actually created" should pass.
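For readers who don't want to open the repository, here is a minimal sketch of the kind of program being described: a binary that re-runs itself as a subprocess, where the child writes a marker file. It is not the code from the linked repository. The "child" argument and the marker file name are assumptions made for illustration, and it assumes the program runs as a plain binary rather than under the test harness.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// What the child does: write a marker file so the parent (or a CI step)
/// can verify that the subprocess code really ran.
fn child_main(marker: &Path) -> io::Result<()> {
    fs::write(marker, b"written by subprocess\n")
}

/// What the parent does: re-run the current executable in "child" mode
/// and report whether the marker file now exists.
fn parent_main(marker: &Path) -> io::Result<bool> {
    let status = Command::new(env::current_exe()?)
        .arg("child") // hypothetical mode switch
        .arg(marker)
        .status()?; // status() spawns the child and waits for it
    Ok(status.success() && marker.exists())
}

fn main() -> io::Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.get(1).map(String::as_str) == Some("child") {
        let marker = PathBuf::from(args.get(2).expect("marker path argument"));
        return child_main(&marker);
    }
    let marker = env::temp_dir().join("subprocess_marker.txt");
    let created = parent_main(&marker)?;
    println!("marker created: {}", created);
    Ok(())
}
```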

fails:
boustrophedon/tarpaulin_missing_coverage@9d91fb2
succeeds:
boustrophedon/tarpaulin_missing_coverage@4e92bfe

xd009642 · 2022-02-28T09:29:48Z

For the first part: cargo tarpaulin --examples is equivalent to cargo test --examples, which behaves that way because of the issue you linked. If you want to avoid creating example tests, you can run the examples directly with cargo tarpaulin --command Build --examples and use a config file to combine that with other test types. Something like the following should work:

[examples]
command = "Build"
run-types = ["Examples"]

[tests]
run-types = ["Tests"] # Just put the others here - maybe an empty section will work but I haven't tried

I have a feeling this may be related to some things I've been seeing with #953, so I will try the examples and see if I can puzzle it out.

boustrophedon · 2022-03-02T20:49:52Z

Thanks for the tip on using the run-types = Examples in a config file!

Let me know if there's anything I can do to help with understanding the weird behavior of --follow-exec.

xd009642 · 2022-03-02T23:44:54Z

So I tried this out and it fixes your example #962 if you want to try it on your own project 👀

boustrophedon · 2022-03-03T00:48:13Z

https://github.com/boustrophedon/tarpaulin_missing_coverage/runs/5400464385

Here's what I'm getting when using the issue/process-kill-953 branch


running 1 test
test call_main ... 
running 1 test
test call_main ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Error: "Failed to get test coverage! Error: Failed to run tests: Error: Timed out waiting for test response"
Mar 03 00:43:07.529 ERROR cargo_tarpaulin: Failed to get test coverage! Error: Failed to run tests: Error: Timed out waiting for test response
Error: Process completed with exit code 1.

boustrophedon · 2022-03-03T00:48:35Z

Oh, I should change the code to remove the test harness and see if that does anything though.

boustrophedon · 2022-03-03T00:59:02Z

nope, same failure https://github.com/boustrophedon/tarpaulin_missing_coverage/runs/5400595536

xd009642 · 2022-03-03T08:55:55Z

Interesting, I tried it on 4e92bfe238d5391 and got 100% coverage 🤔

xd009642 · 2022-03-03T09:16:15Z

Oh, it only times out for me when using the config, so there must be another issue in the config stuff 👀 fun

xd009642 · 2022-03-03T18:25:38Z

It was me being a dummy: follow-exec wasn't aliased in serde, so the config expected follow_exec. I've added the alias now. It does time out without follow-exec, though...
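As a sketch of what this means for a config file: with the alias in place, the hyphenated key should be accepted, whereas before the fix only the underscore spelling was recognised. The section name is an arbitrary label reused from the earlier suggestion, and putting the key in that section is an assumption.

```toml
# .tarpaulin.toml (sketch, not a tested configuration)
[examples]
command = "Build"
run-types = ["Examples"]
follow-exec = true  # before the alias was added, this had to be follow_exec
```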

xd009642 · 2022-03-03T20:17:42Z

Okay, if you try now it works with and without the config file! Finally satisfied I've laid this one to rest. And I think this may have done enough to unblock me on #953, so thanks 😁

boustrophedon · 2022-03-03T23:42:18Z

https://github.com/boustrophedon/tarpaulin_missing_coverage/runs/5415282826?check_suite_focus=true

It worked!

Thanks so much for fixing this!

xd009642 · 2022-03-03T23:44:28Z

Brilliant, I'll be tweaking the branch a bit more to try and fix the issue it was originally there to fix. But once it's merged in I'll cut a new release 👍

boustrophedon · 2022-03-03T23:46:23Z

Awesome. Should I close this issue or do you want to do it when you do the release?

xd009642 · 2022-03-03T23:47:33Z

I'll do it when the PR's merged, that way you'll know the release is coming imminently and can switch your CI to use the latest release 👍

xd009642 · 2022-03-06T21:51:15Z

Just a reminder in case I don't solve this change in behaviour: it may become necessary to add an explicit wait() to that example test case. Otherwise tarpaulin may continue past the exec, kill the test process, and then fail to capture the exec'd coverage. But I should be able to solve that issue as well.
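A minimal sketch of what an explicit wait() looks like with std::process, assuming the test spawns its child through Command. The run_and_wait helper and the sh command are placeholders, not code from the example repository.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Spawn `cmd` and block until it exits, rather than dropping the handle
/// while the child may still be running.
fn run_and_wait(cmd: &mut Command) -> io::Result<ExitStatus> {
    let mut child = cmd.spawn()?;
    child.wait()
}

fn main() -> io::Result<()> {
    // `sh -c "exit 0"` stands in for the real subprocess.
    let status = run_and_wait(Command::new("sh").args(["-c", "exit 0"]))?;
    println!("child exited successfully: {}", status.success());
    Ok(())
}
```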

It's just that the recent follow-exec issues are churning up a lot of the behaviours 😅. Hopefully it will be a lot simpler (and faster to execute) after this, though.

xd009642 · 2022-03-20T18:22:20Z

Release 0.20.0 will be out once CI finishes with the fix for this issue 👍

boustrophedon · 2022-03-22T06:03:14Z

I'm getting segfaults when I upgrade to 0.20 :(

https://github.com/boustrophedon/extrasafe/runs/5638757347

but the minimal example builds and runs: https://github.com/boustrophedon/tarpaulin_missing_coverage/actions/runs/2020270268 https://coveralls.io/github/boustrophedon/tarpaulin_missing_coverage

xd009642 · 2022-03-22T06:34:02Z

A quick check: are you setting --test-threads in any way? I found there was a weird spurious segfault until I set it to 1, and then it disappeared completely.

boustrophedon · 2022-03-22T06:48:11Z

It's just calling cargo tarpaulin: https://github.com/boustrophedon/extrasafe/blob/ipc_coverage-rebase/.github/workflows/build-test.yaml#L60

I wonder if the examples are also being run with the equivalent of test-threads 1? https://github.com/boustrophedon/extrasafe/blob/ipc_coverage-rebase/.tarpaulin.toml#L3

boustrophedon · 2022-03-22T07:49:25Z

Actually it looks like this failure is probably on my side. Out of curiosity, is tarpaulin using signals in some way to catch syscalls or something?

xd009642 · 2022-03-22T08:29:19Z

Not explicitly. ptrace does catch all the signals, but tarpaulin forwards the ones it doesn't think it has a use for back to the test, e.g. SIGCHLD, which identifies that a spawned process has finished so that spawned commands can be waited on.

boustrophedon · 2022-03-26T00:51:40Z

So unfortunately what's happening is that test-threads=1 actually just breaks all my tests. This is because my library is a wrapper around seccomp, which lets you tell the kernel to deny the syscalls of your choosing, for security reasons. Seccomp filters are applied to the current thread (and are inherited by child threads, which isn't relevant here), so when two tests with different filters run on the same thread, the filters stack and only their intersection remains allowed.

In particular, what's happening is that one test is only allowing filesystem operations, another one is allowing only network operations, and so when they run sequentially on the same thread, the end result is that neither is allowed, and the test fails.

When test-threads=1, the rust test runner says "we don't need to spawn new threads, just use the current one" (see here and run_test/run_test_inner later in the same file).
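The effect can be modelled without touching seccomp at all. The sketch below is a toy: a thread-local set stands in for the kernel's per-thread filter state, and the "fs" and "net" labels and function names are made up for illustration. It only shows why running both tests on one thread leaves the second with nothing allowed, while one thread per test does not.

```rust
use std::cell::RefCell;
use std::collections::BTreeSet;
use std::thread;

thread_local! {
    // Toy stand-in for per-thread filter state: the operations this
    // thread may still perform. Every new thread starts unrestricted.
    static ALLOWED: RefCell<BTreeSet<&'static str>> =
        RefCell::new(BTreeSet::from(["fs", "net"]));
}

/// Installing a filter can only narrow what is allowed, never widen it.
fn apply_filter(keep: &[&'static str]) {
    ALLOWED.with(|a| a.borrow_mut().retain(|op| keep.contains(op)));
}

/// Snapshot of what the current thread is still allowed to do.
fn allowed() -> Vec<&'static str> {
    ALLOWED.with(|a| a.borrow().iter().copied().collect())
}

fn fs_test() -> Vec<&'static str> {
    apply_filter(&["fs"]);
    allowed()
}

fn net_test() -> Vec<&'static str> {
    apply_filter(&["net"]);
    allowed()
}

fn main() {
    // One fresh thread per test: each test keeps the operation it needs.
    let a = thread::spawn(fs_test).join().unwrap();
    let b = thread::spawn(net_test).join().unwrap();
    println!("separate threads: {:?} {:?}", a, b);

    // Both tests on the same thread: the second one is left with nothing.
    let (a, b) = thread::spawn(|| (fs_test(), net_test())).join().unwrap();
    println!("same thread: {:?} {:?}", a, b);
}
```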

boustrophedon · 2022-03-26T03:34:46Z

With RUST_TEST_THREADS=2 and just running the tests, it doesn't have the issue.

However with RUST_TEST_THREADS=2 and running the examples as well, it segfaults on the subprocess example.

xd009642 · 2022-03-26T08:05:40Z

Okay, it looks like for your tests I have to fix the segfault with test threads > 1 that happens about 4% of the time on my machine and >99% of the time on CI 😢.

I did find the segfault only happened on nightly, not stable, so maybe running on stable for coverage could be a stopgap solution?

boustrophedon · 2022-03-26T20:12:53Z

I'm not selecting nightly anywhere. Does tarpaulin select nightly internally somewhere? Per the GitHub CI documentation it's running Rust 1.59.

I'll try adding an explicit .wait() at the subprocess call and see if that fixes it.

xd009642 · 2022-03-26T20:16:47Z

No, it doesn't. I just saw the segfault in CI only on nightly for my own tests; maybe yours just exercises the issue more strongly, so it appears on stable 🤔

boustrophedon · 2022-03-26T20:21:57Z

Actually I just checked and I have explicit kill calls.

What's happening is that I have:

The main process
A "db server" process
A "web server" process
and then two client processes that make network calls to the webserver

First I start the db server in a subprocess (without waiting on it, since it's a server), and per the CI log the db process gets to the point where it's waiting.

Then we sleep for 100 ms (which could be what's making the issue occur every time) so that we're sure the server's ready.

Then we try to start the webserver but we never actually get there because the first line of it is a println that doesn't show up.

So either the sleep is causing the issue or the second subprocess call itself. In particular, the subprocess call is calling the same executable as the first, /proc/self/exe, just with different arguments.
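To make the sequence concrete, here is a rough sketch of that process layout. It is not the code from the project: the role names, the serve placeholder, and the second sleep are assumptions, and the client processes are omitted.

```rust
use std::env;
use std::io;
use std::process::{Child, Command};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Role {
    Main,
    Db,
    Web,
}

/// Pick a role from the first command-line argument (hypothetical names).
fn role_from_arg(arg: Option<&str>) -> Role {
    match arg {
        Some("db") => Role::Db,
        Some("web") => Role::Web,
        _ => Role::Main,
    }
}

/// Re-run this same executable with a role argument, without waiting on it.
fn spawn_role(role: &str) -> io::Result<Child> {
    Command::new("/proc/self/exe").arg(role).spawn()
}

/// Placeholder for a real server loop; it never returns on its own.
fn serve(name: &str) {
    println!("{} server started", name);
    loop {
        thread::sleep(Duration::from_secs(1));
    }
}

fn main() -> io::Result<()> {
    let arg = env::args().nth(1);
    match role_from_arg(arg.as_deref()) {
        Role::Db => serve("db"),
        Role::Web => serve("web"),
        Role::Main => {
            let mut db = spawn_role("db")?;
            // Give the db server time to come up.
            thread::sleep(Duration::from_millis(100));
            let mut web = spawn_role("web")?;
            thread::sleep(Duration::from_millis(100));
            // The client processes would run here.
            // Explicit kill, then wait so each child is reaped.
            for child in [&mut web, &mut db] {
                child.kill()?;
                child.wait()?;
            }
        }
    }
    Ok(())
}
```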

orium · 2022-06-14T09:59:43Z

I have the same problem in orium/cargo-rdme:

$ cargo tarpaulin --follow-exec
⋮
Jun 14 10:57:52.890  INFO cargo_tarpaulin::process_handling::linux: Launching test
Jun 14 10:57:52.890  INFO cargo_tarpaulin::process_handling: running /home/orium/programming/projects/cargo-rdme/target/debug/deps/tests-e4918e5f44a98741

running 44 tests
test system_test_avoid_overwrite_uncommitted_readme ... ok
test system_test_custom_lib_path ... Jun 14 10:57:57.976 ERROR cargo_tarpaulin: Failed to get test coverage! Error: Failed to run tests: A segfault occurred while executing tests
Error: "Failed to get test coverage! Error: Failed to run tests: A segfault occurred while executing tests"

Setting the number of threads to 1 doesn't seem to make a difference.

This happens with tarpaulin 0.20.1.

boustrophedon assigned xd009642 Feb 28, 2022

xd009642 added a commit that referenced this issue Mar 6, 2022

Add test for issue #966

55b9788

xd009642 mentioned this issue Mar 6, 2022

Failed to collect coverage when killing spawned server process in test #953

Closed

xd009642 closed this as completed Mar 20, 2022

xd009642 reopened this Mar 22, 2022

orium mentioned this issue Oct 11, 2022

LLVM coverage instrumentation #549

Closed

29 tasks

xd009642 added bug Instrumentation Issues relating to ptrace and parsing of DWARF tables labels Jan 26, 2023
