-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(16905): Fix a number of edge cases where assignment corrupted a Series #16930
fix(16905): Fix a number of edge cases where assignment corrupted a Series #16930
Conversation
The operation is in place and saves a memory allocation. I think we should keep the original code, but do the bound check before we do the in place operation. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16930 +/- ##
==========================================
- Coverage 81.44% 81.09% -0.36%
==========================================
Files 1425 1435 +10
Lines 187970 189585 +1615
Branches 2704 2712 +8
==========================================
+ Hits 153091 153741 +650
- Misses 34382 35344 +962
- Partials 497 500 +3 ☔ View full report in Codecov by Sentry. |
I wrote a little mini-benchmark to make sure there's no performance hit in non-error case for this change: from time import time
import polars as pl
series = pl.Series(list(range(1_000_000)))
start = time()
for i in range(1000):
series[500_000] = 17
print(time() - start) Some of my first attempts made it 100× slower, admittedly in non-release builds. |
Oh, and to be clear, it is still in-place, I didn't change that! I just made sure it gets restored if an error occurs. |
Can you confirm that you still hit the |
Will investigate. How would you feel about adding a test based on https://crates.io/crates/cov-mark? It's a good technique for this sort of situation where the external behavior is the same, but you really want to hit a particular branch in your test. |
The get_mut_values() branch does git hit, yeah, I added a debug print:
But a |
Great! Yeah, actually testing it would be better indeed. Though it is important that it only is used in test compilations. |
Fixes #16905
Previously various interactions in Python of assigning to a Series would clear the data from the Series if an error occurred. In interactive usage, for example in a Jupyter notebook, this is undesirable behavior because typos or just playing around shouldn't result in data disappearing.
Example of previous, buggy behavior that this PR fixes:
s
should not be empty just because an OutOfBoundsError happens...