-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] logical_cast from FLOAT32 to INT32 and FLOAT64 to INT64 or bit_cast for the same #6834
Comments
@jrhemstad Any objections to adding a bit_cast function to column_view.hpp? I figure it'd follow the same restrictions as |
Looking back at the original issue description the prohibition against integral <-> floating point of the same size wasn't mentioned: #4794. So I'm guessing this was just an oversight. |
Currently, From my point of view, just allow casting between any types that have the same data size. |
I think as @nvdbaranec mentioned we can just adopt the rules exactly from
|
@jrhemstad @nvdbaranec @codereport In the case of
|
No, bit_cast should just take the
Providing a user-defined overload for |
@ttnghia I setup a unit test EXPECT_TRUE(std::is_trivially_copyable<decimal32>::value); and it passed. Do you mind sharing the code that returned |
Ah, I'm sorry, I did something wrong with my test thus it returned false. Yeah, your test is correct, and the result should be true. @jrhemstad I'm not very clear. If we allow casting the I think @revans2 was right in the first place: we should not allow doing |
Your confusion is understandable. For the |
To be clear I don't want a I would do the same thing for FLOAT64 separately doing a logical_cast to INT64 and using a different mask. There are also odd corner cases with NaNs that I would want to do similar things with. I don't want to put the ugliness of this directly into cudf, so all I really want is a relief valve that I can then use to work around it. |
@jrhemstad Just to clarify: are we going with |
I kind of like the idea of renaming |
Currently, I'm dealing with unit tests for this. Previously, we cast only between integer types thus we can run a unit test to compare the values of data elements before and after casting. However, now we can cast between integer and other types such as
How do you think? Since the underlying data remains unchanged, comparing the same value but viewed in different types does not make much sense. |
@ttnghia Why not use std::bit_cast in the test? That is the goal after all. |
That's a good idea! However, |
One easy way to test it might be just to test that you can round trip from one type, to another, and back to the original and assert that the original column and the roundtripped column are equal. |
…7373) This MR resolves #6834. It renames `logical_cast` to `bit_cast` which performs casting following the rules of `std::bit_cast`: the two types (FromType and ToType) must have the same size and are trivially copyable. Authors: - Nghia Truong (@ttnghia) Approvers: - @nvdbaranec - David (@davidwendt) - Keith Kraus (@kkraus14) URL: #7373
Spark is really inconsistent in how it handles some values like
-0.0
vs0.0
and the variousNaN
values that are possible. I don't expect cuDF to be aware of any of this, but I would like the ability to work around it in some cases by treating the floating point value as if it were just a bunch of bits. To melogical_cast
feels like the right place to do this, but floating point values are not allowed to be cast to ints.cudf/cpp/include/cudf/column/column_view.hpp
Line 580 in dbeac89
If we could reduce that restriction that would be great. If you would rather call it something like
bit_cast
so it is more in line with what C++ does that is fine too. In the worst case I can just implement something myself in the JNI layer, but I thought it might be good to ask here just in case the python side felt that they wanted this too.The text was updated successfully, but these errors were encountered: