Fix floating point window range extents. #13606

mythrocks · 2023-06-22T20:38:45Z

Description

This commit fixes the window range calculations for floating-point order by columns.

Window range calculations involve comparing the delta value (preceding/following) with the current row value, and capping current_row - delta at numeric_limits::min().

It turns out that for float/double values, numeric_limits::min() returns FLT_MIN/DBL_MIN, which is the smallest positive normalized value, not the most negative one. As a result, the previous logic produced incorrect results when the order-by column contained negative float values.

The fix replaces numeric_limits::min() with numeric_limits::lowest(), which returns the most negative finite value.
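The difference between the two limits can be illustrated with a small sketch. This is not the libcudf implementation; `clamped_lower_bound`, `lower_bound_with_min`, and `lower_bound_with_lowest` are hypothetical helpers written only to show how clamping at `min()` goes wrong for a negative row value.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not libcudf code): computes current - delta,
// clamped so the result never goes below floor_value.
template <typename T>
T clamped_lower_bound(T current, T delta, T floor_value)
{
  assert(delta >= T{0});  // a preceding/following delta is assumed non-negative
  // Same as max(current - delta, floor_value), written so the subtraction
  // is only performed when it cannot pass floor_value.
  return (current < floor_value + delta) ? floor_value : current - delta;
}

// Old behaviour: clamp at numeric_limits::min().
// For float/double this is a tiny POSITIVE value, so every negative row
// value is "clamped" upwards to it.
template <typename T>
T lower_bound_with_min(T current, T delta)
{
  return clamped_lower_bound(current, delta, std::numeric_limits<T>::min());
}

// Fixed behaviour: clamp at numeric_limits::lowest(),
// the most negative finite value for every arithmetic type.
template <typename T>
T lower_bound_with_lowest(T current, T delta)
{
  return clamped_lower_bound(current, delta, std::numeric_limits<T>::lowest());
}
```

With a current row value of -5.0f and a delta of 2.0f, the `min()` variant returns FLT_MIN (a lower bound that is greater than the row itself), while the `lowest()` variant returns -7.0f. For integer types the two limits are identical, which is why the problem only appears for floating-point columns.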

Reference:

https://en.cppreference.com/w/cpp/types/numeric_limits/min

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

mythrocks · 2023-06-26T05:36:57Z

/merge

This is a follow-up to #13512 (which added support for floating point order-by columns in window functions), and #13606 (which fixed how negative values are handled for floating point order-by). This commit fixes how `NaN` and `+/- Infinity` values are handled for floating point.

Prior to this commit, the calculations for range window extents depended on the behaviour of `thrust::less<float>` and `thrust::greater<float>`, as well as addition/subtraction on `+/- Infinity`. This produced some unexpected results:

1. `thrust::less`/`greater` on `NaN` does not produce strict ordering.
2. Addition/Subtraction on the numerical values of `Infinity` could produce finite values that interfere with window extent calculations. Ideally, the results should have remained infinite.

This commit adds custom comparators with `NaN` awareness, to better handle columns that contain `NaN`s. It also fixes range calculations where `Infinity` is involved. Tests have been added to cover ASC/DESC order sorting on `FLOAT`, with `NaN` and `Infinity` values.

Authors:
- MithunR (https://github.com/mythrocks)

Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
- Mike Wilson (https://github.com/hyperbolic2346)
- https://github.com/nvdbaranec

URL: #13635
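The two problems described in the follow-up commit message can be sketched as follows. This is only an illustration, not the code merged in #13635: `nan_aware_less` and `lower_bound_keep_infinite` are hypothetical names, and placing `NaN` after every other value is an assumption made here, since the commit message does not say where `NaN` sorts.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical NaN-aware "less than" (assumption: NaN orders after every
// non-NaN value, and two NaNs are equivalent). Plain operator< returns
// false whenever either side is NaN, so it cannot give NaN a position.
template <typename T>
bool nan_aware_less(T lhs, T rhs)
{
  if (std::isnan(lhs)) { return false; }  // NaN is never less than anything
  if (std::isnan(rhs)) { return true; }   // any non-NaN value is less than NaN
  return lhs < rhs;
}

// Hypothetical lower-bound helper that leaves +/- Infinity untouched
// instead of clamping it to a finite limit.
template <typename T>
T lower_bound_keep_infinite(T current, T delta)
{
  assert(delta >= T{0});  // delta is assumed finite and non-negative
  if (std::isinf(current)) { return current; }  // infinity stays infinite
  T const lowest = std::numeric_limits<T>::lowest();
  // Clamp finite values at the most negative finite value.
  return (current < lowest + delta) ? lowest : current - delta;
}
```

Under these assumptions, `nan_aware_less` gives `NaN` a consistent place at the end of an ascending sort, and a row whose order-by value is infinite keeps an infinite window bound.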

mythrocks added the bug (Something isn't working), improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) labels Jun 22, 2023

mythrocks requested a review from a team as a code owner June 22, 2023 20:38

mythrocks self-assigned this Jun 22, 2023

mythrocks requested review from harrism and vyasr June 22, 2023 20:38

mythrocks removed the improvement (Improvement / enhancement to an existing function) label Jun 22, 2023

github-actions bot added the libcudf (Affects libcudf (C++/CUDA) code.) and improvement (Improvement / enhancement to an existing function) labels Jun 22, 2023

mythrocks removed the improvement (Improvement / enhancement to an existing function) label Jun 22, 2023

harrism approved these changes Jun 22, 2023

ttnghia approved these changes Jun 24, 2023

rapids-bot bot merged commit 9a3f3a9 into rapidsai:branch-23.08 Jun 26, 2023
51 checks passed

mythrocks mentioned this pull request Jun 28, 2023

Fix inf/NaN comparisons for FLOAT orderby in window functions #13635

Merged

3 tasks
