-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reduce flaky tests #1015
reduce flaky tests #1015
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1015 +/- ##
==========================================
- Coverage 97.90% 91.90% -6.01%
==========================================
Files 230 109 -121
Lines 39531 16755 -22776
==========================================
- Hits 38704 15399 -23305
- Misses 827 1356 +529 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @josephleekl ! Could you please clarify why flaky tests can be reduced with this implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Happy to approve 🙌
Thanks Shuli, the shots increase significantly reduces the failure rate, which means that we should no longer need flaky tests for some tests. ALso some tests are not non-deterministic so no flaky is required. |
Please complete the following checklist when submitting a PR: - [ ] All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the [`tests`](../tests) directory! - [ ] All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running `make docs`. - [ ] Ensure that the test suite passes, by running `make test`. - [x] Add a new entry to the `.github/CHANGELOG.md` file, summarizing the change, and including a link back to the PR. - [x] Ensure that code is properly formatted by running `make format`. When all the above are checked, delete everything above the dashed line and fill in the pull request template. ------------------------------------------------------------------------------------------------------------ **Context:** The aim is to reduce the number of flaky tests, and deal with some of the stochastic tests failures observed in CI. **Description of the Change:** 4 tests are updated: 1. test_shots_single_measure_obs ([TSSMO](https://github.com/PennyLaneAI/pennylane-lightning/blob/f9e8f62a073ab72c8d96b3bb01de399d02e288ba/tests/test_measurements.py#L753)) Previously flaky test, flaky now removed and shot increased 10x. Previous failure due to low shots count. Previous observed failure rate (without flaky): LQ 4/1000 Updated failure rate: LQ 0/1000 2. test_controlled_qubit_gates ([TCQG](https://github.com/PennyLaneAI/pennylane-lightning/blob/f9e8f62a073ab72c8d96b3bb01de399d02e288ba/tests/lightning_qubit/test_measurements_class.py#L742)) Previously flaky test, flaky now removed. There is no non-determinism in test. No failure. 3. test_cnot_controlled_qubit_unitary ([TCCQU](https://github.com/PennyLaneAI/pennylane-lightning/blob/f9e8f62a073ab72c8d96b3bb01de399d02e288ba/tests/lightning_qubit/test_measurements_class.py#L825)) Previously flaky test, flaky now removed. Only non-determinism in initial state preparation. Initial state now fixed with seed (already used in above test). No failure. 4. test_sample_variations ([TSV](https://github.com/PennyLaneAI/pennylane-lightning/blob/f9e8f62a073ab72c8d96b3bb01de399d02e288ba/tests/test_measurements.py#L661)) Shots increased 10x and now reference compares with analytical probability calculation in default.qubit rather than with shots. Previous observed failure rate: 13/1000 Updated failure rate: LQ 0/1000 **Benefits:** - Reduced flaky tests from 6 to 3 - Reduce likelihood of TSV causing CI failure, which is frequently observed ([1](https://github.com/PennyLaneAI/pennylane-lightning/actions/runs/10887374615/job/30210122673) [2](https://github.com/PennyLaneAI/pennylane-lightning/actions/runs/11582353424/job/32245407148) [3](https://github.com/PennyLaneAI/pennylane-lightning/actions/runs/12131635773/job/33824313065) [4](https://github.com/PennyLaneAI/pennylane-lightning/actions/runs/12150423891/job/33887968196)) **Possible Drawbacks:** Increased test runtime: 1. TSSMO LQ x86: 5s -> 16s LG x86: 3s -> 6s LK x86 CPU: 1s -> 4s LK GPU: 1s -> 4s 2. TSV LQ x86: 3s -> 5s LG x86: 6s -> 10s LK x86 CPU: 1s -> 4s LK GPU: 1s -> 4s However, originally, for TSSMO the time increases if flaky is required to be triggered, and TSV failure would require re-running the tests. **Related GitHub Issues:** [sc-78212] --------- Co-authored-by: ringo-but-quantum <github-ringo-but-quantum@xanadu.ai>
Before submitting
Please complete the following checklist when submitting a PR:
All new features must include a unit test.
If you've fixed a bug or added code that should be tested, add a test to the
tests
directory!All new functions and code must be clearly commented and documented.
If you do make documentation changes, make sure that the docs build and
render correctly by running
make docs
.Ensure that the test suite passes, by running
make test
.Add a new entry to the
.github/CHANGELOG.md
file, summarizing thechange, and including a link back to the PR.
Ensure that code is properly formatted by running
make format
.When all the above are checked, delete everything above the dashed
line and fill in the pull request template.
Context:
The aim is to reduce the number of flaky tests, and deal with some of the stochastic tests failures observed in CI.
Description of the Change:
4 tests are updated:
test_shots_single_measure_obs (TSSMO)
Previously flaky test, flaky now removed and shot increased 10x. Previous failure due to low shots count.
Previous observed failure rate (without flaky): LQ 4/1000
Updated failure rate: LQ 0/1000
test_controlled_qubit_gates (TCQG)
Previously flaky test, flaky now removed. There is no non-determinism in test.
No failure.
test_cnot_controlled_qubit_unitary (TCCQU)
Previously flaky test, flaky now removed. Only non-determinism in initial state preparation. Initial state now fixed with seed (already used in above test).
No failure.
test_sample_variations (TSV)
Shots increased 10x and now reference compares with analytical probability calculation in default.qubit rather than with shots.
Previous observed failure rate: 13/1000
Updated failure rate: LQ 0/1000
Benefits:
Possible Drawbacks:
Increased test runtime:
TSSMO
LQ x86: 5s -> 16s
LG x86: 3s -> 6s
LK x86 CPU: 1s -> 4s
LK GPU: 1s -> 4s
TSV
LQ x86: 3s -> 5s
LG x86: 6s -> 10s
LK x86 CPU: 1s -> 4s
LK GPU: 1s -> 4s
However, originally, for TSSMO the time increases if flaky is required to be triggered, and TSV failure would require re-running the tests.
Related GitHub Issues:
[sc-78212]