Basic validation in reader benchmarks #14647
Conversation
```cpp
size_t const chunk_size = cudf::util::div_rounding_up_safe(source_sink.size(), num_chunks);
auto const chunk_row_cnt =
  cudf::util::div_rounding_up_safe(view.num_rows(), static_cast<cudf::size_type>(num_chunks));
```
The old approach rounded down and dropped some rows; adding the shape check uncovered the issue.
Rounding up here also simplified some of the logic in the loop.
Some non-blocking nits. Otherwise LGTM.
Co-authored-by: Yunsong Wang <yunsongw@nvidia.com>
A couple of questions...
/merge
Description
Check the output table shape in the CSV, JSON, ORC, and Parquet reader benchmarks.
Other changes:
Fixed some chunking logic in the CSV reader benchmark.
Shortened the lifetime of the original table to reduce peak memory use (adopted the pattern from the JSON reader benchmark).
Checklist