Fix value_counts for Pandas 2 #28500

Merged Sep 19, 2023 (1 commit)

@caneff caneff commented Sep 18, 2023

Two changes here:

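  1. In pandas 2.0, value_counts names its result differently: the resulting Series is named `count` (or `proportion` when normalize=True), and the index takes the name of the original Series. See https://pandas.pydata.org/docs/whatsnew/v2.0.0.html#value-counts-sets-the-resulting-name-to-count for more details.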

  2. Fix the interaction of subset and dropna in df.value_counts, discovered
    through a new doctest failure. Previously, when subset was non-empty, we
    dropped rows with NA in any column rather than only in the subset columns.
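
The two changes above can be illustrated with plain pandas. This is a minimal sketch, not the code in frames.py: the helper names `value_counts_v2_naming`, `count_rows_dropping_any_na`, and `count_rows_dropping_subset_na` are hypothetical, and the sample data is made up for illustration. The first helper applies the pandas 2.0 naming explicitly so it behaves the same on 1.x and 2.x; the other two contrast dropping NA rows across all columns with dropping them only for the subset columns.

```python
import numpy as np
import pandas as pd


def value_counts_v2_naming(s, normalize=False):
    """Hypothetical helper: value_counts with pandas 2.0 naming on any pandas version."""
    result = s.value_counts(normalize=normalize)
    # pandas 2.0 moves the original Series name onto the index ...
    result.index.name = s.name
    # ... and names the result "count" (or "proportion" when normalize=True).
    result.name = "proportion" if normalize else "count"
    return result


def count_rows_dropping_any_na(df, subset):
    """Old behavior described above: a row with NA in ANY column is dropped."""
    return df.dropna().groupby(subset).size()


def count_rows_dropping_subset_na(df, subset):
    """Fixed behavior: only NA values in the subset columns cause a drop."""
    return df.dropna(subset=subset).groupby(subset).size()


# Illustrative data (an assumption, not taken from the PR).
animals = pd.Series(["cat", "dog", "cat"], name="animal")
named = value_counts_v2_naming(animals)

# Column "b" has an NA in the first row; column "a" has none.
df = pd.DataFrame({"a": [1, 1, 2], "b": [np.nan, 5.0, 6.0]})
old_counts = count_rows_dropping_any_na(df, ["a"])
new_counts = count_rows_dropping_subset_na(df, ["a"])
```

With subset `["a"]`, the old approach loses the first row because of the NA in `b`, so the value 1 is counted once; the fixed approach keeps that row and counts it twice.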

Umbrella issue: #27221

caneff commented Sep 18, 2023

R: @tvalentyn

github-actions commented

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

codecov bot commented Sep 18, 2023

Codecov Report

Merging #28500 (4dd9dff) into master (603b517) will decrease coverage by 0.02%.
Report is 98 commits behind head on master.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master   #28500      +/-   ##
==========================================
- Coverage   72.34%   72.33%   -0.02%     
==========================================
  Files         682      683       +1     
  Lines      100536   100754     +218     
==========================================
+ Hits        72737    72877     +140     
- Misses      26221    26299      +78     
  Partials     1578     1578              
| Flag | Coverage Δ |
|---|---|
| python | 82.80% <85.71%> (-0.07%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files Changed | Coverage Δ |
|---|---|
| sdks/python/apache_beam/dataframe/frames.py | 95.24% <85.71%> (-0.08%) ⬇️ |

... and 17 files with indirect coverage changes

@tvalentyn tvalentyn merged commit 0f2f3b1 into apache:master Sep 19, 2023
81 of 86 checks passed
@caneff caneff deleted the value_counts branch September 21, 2023 17:26
m-trieu pushed a commit to m-trieu/beam that referenced this pull request Sep 22, 2023