Add groupby scan operations (sort groupby) #7387

Merged: 56 commits into rapidsai:branch-0.19 on Mar 23, 2021

karthikeyann (Contributor) commented Feb 16, 2021

Adds support for groupby scan operations.

Addresses part of
#1298 cumsum
#1296 cumcount

  • sum
  • min
  • max
  • count
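
For readers unfamiliar with the term, a groupby scan produces one output row per input row: a running aggregate that restarts at every group boundary. The following is a minimal host-side sketch of that idea, assuming the keys are already sorted so equal keys are contiguous (as in a sort-based groupby). The helper names `group_scan`, `group_sum_scan`, `group_min_scan`, `group_max_scan`, and `group_count_scan` are hypothetical illustrations, not the libcudf API or this PR's implementation. The count scan is assumed to start at 0, following pandas `cumcount`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not libcudf API): inclusive scan of `values` that
// restarts whenever the key changes. Assumes `keys` is sorted so that equal
// keys are contiguous.
template <typename Key, typename T, typename BinaryOp>
std::vector<T> group_scan(std::vector<Key> const& keys,
                          std::vector<T> const& values,
                          BinaryOp op)
{
  assert(keys.size() == values.size());
  std::vector<T> result(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    // A row starts a new group if it is the first row or its key differs
    // from the previous row's key.
    bool const new_group = (i == 0) || (keys[i] != keys[i - 1]);
    result[i] = new_group ? values[i] : op(result[i - 1], values[i]);
  }
  return result;
}

// Running sum within each group.
template <typename Key, typename T>
std::vector<T> group_sum_scan(std::vector<Key> const& keys, std::vector<T> const& values)
{
  return group_scan(keys, values, [](T const& a, T const& b) { return a + b; });
}

// Running minimum within each group.
template <typename Key, typename T>
std::vector<T> group_min_scan(std::vector<Key> const& keys, std::vector<T> const& values)
{
  return group_scan(keys, values, [](T const& a, T const& b) { return std::min(a, b); });
}

// Running maximum within each group.
template <typename Key, typename T>
std::vector<T> group_max_scan(std::vector<Key> const& keys, std::vector<T> const& values)
{
  return group_scan(keys, values, [](T const& a, T const& b) { return std::max(a, b); });
}

// Position of each row within its group, starting at 0 (assumed semantics,
// chosen to match pandas cumcount).
template <typename Key>
std::vector<int> group_count_scan(std::vector<Key> const& keys)
{
  std::vector<int> result(keys.size());
  for (std::size_t i = 0; i < keys.size(); ++i) {
    bool const new_group = (i == 0) || (keys[i] != keys[i - 1]);
    result[i] = new_group ? 0 : result[i - 1] + 1;
  }
  return result;
}
```

With sorted keys `{1, 1, 1, 2, 2, 3}` and values `{5, 1, 3, 4, 2, 7}`, the sum scan in this sketch restarts at the fourth and sixth rows, so each group's running total is independent of the others.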

@github-actions bot added the libcudf label Feb 16, 2021
@karthikeyann added the 2 - In Progress, feature request, and non-breaking labels and removed the libcudf label Feb 16, 2021
cpp/src/groupby/sort/group_scan.hpp (outdated, resolved)
@github-actions bot added the libcudf label Feb 17, 2021
codecov bot commented Feb 17, 2021

Codecov Report

Merging #7387 (c8e4b99) into branch-0.19 (7871e7a) will increase coverage by 0.60%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.19    #7387      +/-   ##
===============================================
+ Coverage        81.86%   82.47%   +0.60%     
===============================================
  Files              101      101              
  Lines            16884    17397     +513     
===============================================
+ Hits             13822    14348     +526     
+ Misses            3062     3049      -13     

| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/core/column/categorical.py | 91.97% <ø> (+0.58%) ⬆️ |
| python/cudf/cudf/core/column/column.py | 87.86% <ø> (+0.10%) ⬆️ |
| python/cudf/cudf/core/column/datetime.py | 89.63% <ø> (+0.54%) ⬆️ |
| python/cudf/cudf/core/column/decimal.py | 92.75% <ø> (-2.12%) ⬇️ |
| python/cudf/cudf/core/column/lists.py | 92.17% <ø> (+0.77%) ⬆️ |
| python/cudf/cudf/core/column/numerical.py | 94.83% <ø> (-0.20%) ⬇️ |
| python/cudf/cudf/core/column/string.py | 86.79% <ø> (+0.30%) ⬆️ |
| python/cudf/cudf/core/column/timedelta.py | 88.57% <ø> (+0.33%) ⬆️ |
| python/cudf/cudf/core/column_accessor.py | 95.45% <ø> (+0.14%) ⬆️ |
| python/cudf/cudf/core/dataframe.py | 90.90% <ø> (+0.44%) ⬆️ |

... and 64 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8773a40...c8e4b99.

@github-actions bot added the CMake label Mar 2, 2021
@karthikeyann requested a review from davidwendt March 17, 2021 05:19
@karthikeyann requested a review from davidwendt March 17, 2021 15:37
cpp/include/cudf/groupby.hpp (resolved)
cpp/src/groupby/sort/aggregate.cpp (outdated, resolved)
cpp/src/groupby/sort/aggregate.cpp (resolved)
cpp/src/groupby/sort/functors.hpp (resolved)
cpp/tests/groupby/group_count_scan_test.cpp (outdated, resolved)
cpp/tests/groupby/group_count_scan_test.cpp (outdated, resolved)
cpp/tests/groupby/group_max_scan_test.cpp (outdated, resolved)
@karthikeyann requested a review from ttnghia March 19, 2021 09:13
davidwendt (Contributor) left a comment

There are a bunch of files in this PR that only had their copyright year changed. Do not change the year unless you changed the file.

cpp/tests/groupby/group_sum_scan_test.cpp (outdated, resolved)
cpp/tests/groupby/group_sum_scan_test.cpp (outdated, resolved)
cpp/tests/groupby/group_sum_scan_test.cpp (outdated, resolved)
cpp/tests/groupby/group_sum_scan_test.cpp (outdated, resolved)
cpp/tests/groupby/group_min_scan_test.cpp (outdated, resolved)
cpp/src/groupby/sort/group_reductions.hpp (outdated, resolved)
cpp/src/groupby/sort/group_quantiles.cu (outdated, resolved)
cpp/src/groupby/sort/group_nth_element.cu (outdated, resolved)
cpp/src/groupby/sort/group_nunique.cu (outdated, resolved)
cpp/src/groupby/sort/group_min.cu (outdated, resolved)
Co-authored-by: David <45795991+davidwendt@users.noreply.github.com>
@kkraus14 added the 5 - Ready to Merge label and removed the 3 - Ready for Review and 4 - Needs Review labels Mar 22, 2021
kkraus14 (Collaborator) commented

@gpucibot merge

kkraus14 (Collaborator) commented

@gpucibot merge

@rapids-bot bot merged commit 500f42c into rapidsai:branch-0.19 Mar 23, 2021
hyperbolic2346 pushed a commit to hyperbolic2346/cudf that referenced this pull request Mar 25, 2021
- Replace device_vector with device_uvector
- Replace device_vector const& with device_span<const>
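
The second bullet reflects a general pattern: a function that only reads a contiguous range can accept a non-owning view instead of a const reference to one specific container type, so callers are not tied to that container. The following is a host-side sketch of that pattern under stated assumptions. `const_span` and `sum_of` are hypothetical stand-ins written for illustration. They are not `cudf::device_span` and not code from the referenced commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal read-only view over contiguous memory. It stores a
// pointer and a length and never owns or frees the data it points to.
template <typename T>
struct const_span {
  T const* data;
  std::size_t size;

  const_span(T const* d, std::size_t n) : data(d), size(n) {}
  // Implicit conversion so a std::vector can be passed directly.
  const_span(std::vector<T> const& v) : data(v.data()), size(v.size()) {}

  T const& operator[](std::size_t i) const { return data[i]; }
};

// Takes a view by value (it is just a pointer plus a size), so it works with
// any contiguous storage rather than one specific container type.
inline int sum_of(const_span<int> s)
{
  int total = 0;
  for (std::size_t i = 0; i < s.size; ++i) { total += s[i]; }
  return total;
}
```

In this sketch the same `sum_of` accepts a whole `std::vector<int>`, a raw array with an explicit length, or a sub-range of a vector, without copying any elements.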

Ref. rapidsai#7387 (comment)

Authors:
  - Karthikeyan (@karthikeyann)

Approvers:
  - Mike Wilson (@hyperbolic2346)
  - David (@davidwendt)

URL: rapidsai#7523
Labels: 5 - Ready to Merge, CMake, feature request, libcudf, non-breaking
7 participants