Performance regression in 8-bit HistogramRange between 1.3.2B and 1.8.0 #852

Closed
dumerrill opened this issue May 2, 2018 · 2 comments
Labels
cub For all items related to CUB

Comments

@dumerrill

dumerrill commented May 2, 2018

FYI, HistogramRange on 8-bit data runs at about half the speed in 1.8.0 that it did in 1.3.2B, but everything else is about twice as fast.

Tested on a V100 with CUDA 9.1 and an updated driver that supports Volta.

@alliepiper
Collaborator

Marking as unverified, since we'll need to check this again after NVIDIA/cub#208.

@alliepiper changed the title from "FYI, HistogramRange is about half the performance for 8 bit data in 1.8.0 as was 1.3.2B, but everything else is about twice as fast." to "Performance regression in HistogramRange between 1.3.2B and 1.8.0" on May 6, 2022
@alliepiper changed the title from "Performance regression in HistogramRange between 1.3.2B and 1.8.0" to "Performance regression in 8-bit HistogramRange between 1.3.2B and 1.8.0" on May 6, 2022
@jrhemstad added the cub (For all items related to CUB) label on Feb 22, 2023
@miscco
Collaborator

miscco commented Feb 23, 2023

We would need a reproducer to investigate this further.

@jarmak-nv transferred this issue from NVIDIA/cub on Nov 8, 2023
@jrhemstad closed this as not planned (won't fix, can't repro, duplicate, stale) on Nov 8, 2023
@github-project-automation bot moved this from Awaiting Feedback to Done in CCCL on Nov 8, 2023