Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add concurrency tests with Coyote to CI #4879

Merged
merged 46 commits into from
Nov 17, 2023

Conversation

Yun-Ting
Copy link
Contributor

@Yun-Ting Yun-Ting commented Sep 22, 2023

Fixes #2513
cc @pdeligia

Changes

Coyote is a cross-platform library and tool for testing concurrent C# code and deterministically reproducing bugs. Deck made by @pdeligia from MSR with excellent visual aids:
https://microsoft-my.sharepoint.com/:p:/p/pdeligia/EW96pNu-66dKurXENmz4du0BtMWCNKiT1jprciqc9P5r1w?e=5Gk9BQ

The methodology has been validated in reproducing a known concurrency bug in OTel 1.3.0 release.

Details of the bug

Used Coyote to capture and reproduce the above mentioning synchronization bug in OTel 1.3.0 release

  • Demo using Coyote to catch the bug and reproduce the steps with the Coyote replay feature
Demo-20231004_162907-Meeting.Recording.mp4

What's Next

This PR is the first step to integrate Coyote into the CI pipeline.
When Coyote detects concurrency bug(s) during CI, developers can use the replay feature in Coyote to replay and debug a trace file locally. Check the above demo for the E2E debugging flow for OTel developers.

Follow-up PRs can be made to use Coyote to run other existing concurrency test cases.

For example:

We can also add/improve our concurrency tests coverage with the infrastructure introduced by this PR.

Merge requirement checklist

  • CONTRIBUTING guidelines followed (nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate CHANGELOG.md files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

@Yun-Ting Yun-Ting changed the title add concurrency test with Coyote to CI add concurrency tests with Coyote to CI Sep 22, 2023
Directory.Packages.props Outdated Show resolved Hide resolved
OpenTelemetry.sln Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Sep 27, 2023

Codecov Report

Merging #4879 (7dca448) into main (e904994) will decrease coverage by 0.02%.
Report is 2 commits behind head on main.
The diff coverage is 100.00%.

❗ Current head 7dca448 differs from pull request most recent head 583b7d6. Consider uploading reports for the commit 583b7d6 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4879      +/-   ##
==========================================
- Coverage   83.54%   83.52%   -0.02%     
==========================================
  Files         296      296              
  Lines       12481    12487       +6     
==========================================
+ Hits        10427    10430       +3     
- Misses       2054     2057       +3     
Flag Coverage Δ
unittests 83.52% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/OpenTelemetry.Api/Trace/TracerProvider.cs 93.93% <100.00%> (ø)
src/OpenTelemetry/Metrics/AggregatorStore.cs 78.53% <100.00%> (-1.33%) ⬇️
src/OpenTelemetry/Metrics/MeterProviderSdk.cs 89.97% <100.00%> (+0.02%) ⬆️
src/OpenTelemetry/Metrics/Metric.cs 96.49% <100.00%> (+0.03%) ⬆️
src/OpenTelemetry/Metrics/MetricPoint.cs 68.47% <100.00%> (ø)
src/OpenTelemetry/Metrics/MetricReaderExt.cs 87.06% <100.00%> (+0.22%) ⬆️

... and 6 files with indirect coverage changes

@Yun-Ting Yun-Ting marked this pull request as ready for review October 6, 2023 22:29
@Yun-Ting Yun-Ting requested a review from a team October 6, 2023 22:29
@github-actions github-actions bot removed the Stale Issues and pull requests which have been flagged for closing due to inactivity label Oct 21, 2023
@utpilla
Copy link
Contributor

utpilla commented Oct 26, 2023

I was hoping for more to be done for me.

I had the same expectations @CodeBlanch.

I was expecting Coyote to do more than just detecting the presence of a bug. I expected it to provide further insights into how we can fix the issue by pointing out the buggy critical sections/ shared data access or by some other means.

I believe only one of the following is true:

  1. This is an actual limitation of Coyote (that it would only detect the presence of a bug and not provide more code level insights)
  2. We are not using Coyote to its full potential and are not aware of how to do that (We probably don't know how to get hold of the code level insights from the generated artifacts)

@pdeligia Would you be able to help us here?

@pdeligia
Copy link

@CodeBlanch @utpilla -- the tool is not "magic" for sure :) Your understanding is correct, you need to identify areas you care about testing and have concerns over concurrency, and then write concurrent unit tests as entry points to explore coverage. Coyote will then drive coverage of them.

The good thing is that with a "few" concurrency tests (controlled by Coyote) you can drive coverage across tons of different execution paths that your program might take non-deterministically. These can then be part of your regression suite to catch concurrency bugs as the code evolves over time. (Test generation to reduce the test-writing effort is an interesting problem for sure, and a lot of research is going into this, perhaps LLMs can help soon with all the advancements happening recently, but it's an orthogonal problem to what Coyote focuses on.)

The "giant leap" arguably comes in two directions: (1) find bugs in (very) rare interleavings in a very small setting i.e. few numbers of tasks needed to run your logic (which regular unit tests might never discover or only once in a while, or stress testing which might need to create thousands of tasks, ending up with tons of logs to repro which are hard to make sense of), and (2) deterministically replay such bugs on the debugger (even if the test itself is by itself concurrent), which previously you relied on timing to replay (the infamous Heizenbugs :))

These are the two things Coyote gives you. It won't sadly tell you how to fix the bug as that is hard to generalize to all kinds of bugs and applications! (Although would love for that to be the case! :)) The found bug then needs to be debugged using the coyote -replay process, which (in our experience with many other teams using it internally) is what makes this a significantly better dev experience than say stress testing. Especially for these kinds of super complex concurrency issues with tons of tasks involved that you can't make sense of easily. Honestly, this is when this kind of tooling gets appreciated more. For shallow bugs, the delta value you get is indeed smaller, as these could be found and debugged already relatively easily, without tools like Coyote.

(BTW Coyote also has more advanced capabilities -- that you can't easily get with unit/stress testing -- for example detecting liveness issues, i.e. something that should eventually happen, does not happen, but these can be more complex to encode as tests & debug and might not matter to OT at all.)

Hope you guys get some good value out of this! (And happy -- for MSFT employees -- to connect you to internal teams to learn from their own experience, if interested.) Thanks so much @Yun-Ting for driving all this integration effort :)

@Yun-Ting
Copy link
Contributor Author

Thank you @pdeligia for your detailed explanation about the capabilities of Coyote!

The good thing is that with a "few" concurrency tests (controlled by Coyote) you can drive coverage across tons of different execution paths that your program might take non-deterministically. These can then be part of your regression suite to catch concurrency bugs as the code evolves over time. (Test generation to reduce the test-writing effort is an interesting problem for sure, and a lot of research is going into this, perhaps LLMs can help soon with all the advancements happening recently, but it's an orthogonal problem to what Coyote focuses on.)

The "giant leap" arguably comes in two directions: (1) find bugs in (very) rare interleavings in a very small setting i.e. few numbers of tasks needed to run your logic (which regular unit tests might never discover or only once in a while, or stress testing which might need to create thousands of tasks, ending up with tons of logs to repro which are hard to make sense of), and (2) deterministically replay such bugs on the debugger (even if the test itself is by itself concurrent), which previously you relied on timing to replay (the infamous Heizenbugs :))

Definitely, I think Coyote is a powerful tool that could easily stress test all kinds of interleaving in the focused area. And I'm looking forward to seeing the LLM projects MSR are focusing on lately to further advance developers life ✨. Please let us know if you have interesting LLM projects that you think OTel could try out!

@CodeBlanch
Copy link
Member

Just FYI we (maintainers) discussed this yesterday and we are going to move forward with Coyote! @utpilla is going to review and merge when he has a cycle or two

@CodeBlanch
Copy link
Member

@pdeligia

Here's some random food for thought.

You familiar at all with SpecFlow? It allows you to author (essentially) test cases using gherkin.

What I was hoping Coyote would have is something kind of similar to help author/model the test cases.

Imagine something like:

Scenario: TracerProvider.GetTracer concurrency test
Given A {TracerProvider} instance
And Many threads calling {TracerProvider.GetTracer}
When A single thread calls {TracerProvider.Dispose}
Then {TracerProvider.tracers} remains empty

Now that would be magic 😄

Essentially some way to describe how the threads are expected (single/many readers, single/many writes, etc.) and then the tool builds the tests.

@pdeligia
Copy link

pdeligia commented Nov 2, 2023

@CodeBlanch great thoughts here! I have heard about SpecFlow, but have not used it myself. However, I am in general familiar with test-generation (even internally, there is a lot of very interest research happening in MSR from some colleagues of mine on test-generation from "high-level specs" of a system, and nowadays even AI can be a great accelerator here!).

Coyote, however, is in some sense orthogonal to test generation: you can think of Coyote as the concurrency exploration engine, which you use once you have tests in place. How these tests are authored or generated is completely up to developers, as we do not want to be opinionated, there are so many different ways to do this including manually authoring them (so this is out of scope for Coyote, by design). Once you have tests in place (manually or by using an automated tool), Coyote will then take control of the inherent non-determinism in them and explore hundreds/thousands of concurrency execution paths for you (out of each test) to find race conditions, etc!

I believe you could use any tool that works with C# (like SpecFlow, but not sure if it supports C#?) to generate your concurrent tests, and then feed them to Coyote to explore task/thread interleavings for bugs :) Are there any such test generation tools for C# you have used before, might be worth trying them?

Copy link
Contributor

This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added the Stale Issues and pull requests which have been flagged for closing due to inactivity label Nov 10, 2023
@utpilla utpilla removed the Stale Issues and pull requests which have been flagged for closing due to inactivity label Nov 13, 2023
Copy link
Contributor

@utpilla utpilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yun-Ting Apologies for the delay! Need to address this https://github.com/open-telemetry/opentelemetry-dotnet/pull/4879/files#r1396782338, otherwise it looks good to be merged.

@utpilla utpilla merged commit d346196 into open-telemetry:main Nov 17, 2023
39 checks passed
@utpilla utpilla mentioned this pull request Nov 17, 2023
@Yun-Ting Yun-Ting deleted the yunl/addCoyoteCI_3 branch November 17, 2023 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test thread safety during CI
5 participants