Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-20213: [C++] Implement cast to/from halffloat #40067

Merged
merged 21 commits into from
Apr 4, 2024

Conversation

ClifHouck
Copy link
Contributor

@ClifHouck ClifHouck commented Feb 13, 2024

Rationale for this change

What changes are included in this PR?

This PR implements casting to and from float16 types using the vendored float16 library included in arrow at cpp/arrrow/util/float16.*.

Are these changes tested?

Unit tests are included in this PR.

Are there any user-facing changes?

In that casts to and from float16 will now work, yes.

TODO

Copy link

⚠️ GitHub issue #20213 has been automatically assigned in GitHub to PR creator.

@ClifHouck
Copy link
Contributor Author

This still needs work, but I would greatly appreciate someone who is more familiar with the arrow code-base to comment on whether my approach is sane or not.

Copy link

⚠️ GitHub issue #20213 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Feb 14, 2024
@pitrou
Copy link
Member

pitrou commented Feb 14, 2024

Thanks for posting this. The approach is sane, I posted some comments.

You will have to decide whether you want to tackle casting to string, or prefer leaving that for later/another contributor.

@ClifHouck
Copy link
Contributor Author

So far this PR seems to work for the following casts:

  • float16 to float32
  • float16 to float64
  • float32 to float16
  • float64 to float16

@pitrou you mentioned string cast(s).

Is that:

  1. float16 to string
  2. string to float16

?

What other casts are required to consider this done?

@pitrou
Copy link
Member

pitrou commented Feb 21, 2024

@pitrou you mentioned string cast(s).
Is that:
1. float16 to string
2. string to float16
?

Yes.

What other casts are required to consider this done?

Ideally, integer-to-half and half-to-integer would be implemented as well.

Copy link

⚠️ GitHub issue #20213 has been automatically assigned in GitHub to PR creator.

1 similar comment
Copy link

⚠️ GitHub issue #20213 has been automatically assigned in GitHub to PR creator.

@ClifHouck
Copy link
Contributor Author

Please correct me if I'm wrong, but it seems like float32/float64 cast to string is ultimately handled by this vendored library cpp/src/arrow/vendored/double-conversion, source repo here: https://github.com/google/double-conversion .

There doesn't seem to be a direct way to cast from a half float ultimately represented by a uint16_t to a string, but I wonder if it's acceptable to cast to float first and then use the existing float to string cast? or is that unacceptable overhead?

@pitrou
Copy link
Member

pitrou commented Feb 22, 2024

There doesn't seem to be a direct way to cast from a half float ultimately represented by a uint16_t to a string, but I wonder if it's acceptable to cast to float first and then use the existing float to string cast? or is that unacceptable overhead?

Yes, it's acceptable. The only annoyance would be the superfluous precision, but that's not a blocking issue.

@pitrou
Copy link
Member

pitrou commented Feb 22, 2024

By the way, you'll have to implement unit tests at some point (in C++).

@ianmcook
Copy link
Member

@ClifHouck thanks for working on this!

Small bit of housekeeping:

There is a note in docs/source/status.rst that says:

(1) Casting to/from Float16 in C++ is not supported.

Could you please push a commit here that removes that note? Looks like you will need to rebase first because there have been some changes in status.rst since you began work on this PR.

@ClifHouck
Copy link
Contributor Author

So one question I currently have is, should I specialize Datum on HalfFloatType (or similar)? Right now tests are failing when trying to compare expected to actual output from doing casts from, say, int to float16.

@pitrou
Copy link
Member

pitrou commented Feb 29, 2024

So one question I currently have is, should I specialize Datum on HalfFloatType (or similar)?

Which method would you specialize?

@ClifHouck
Copy link
Contributor Author

ClifHouck commented Feb 29, 2024

So one question I currently have is, should I specialize Datum on HalfFloatType (or similar)?

Which method would you specialize?

After reading more code, more likely I need to specialize or modify CompareFloating, VisitFloatingEquality, etc. But I'm not certain.

@pitrou
Copy link
Member

pitrou commented Mar 6, 2024

After reading more code, more likely I need to specialize or modify CompareFloating, VisitFloatingEquality, etc. But I'm not certain.

If you're struggling on this, someone else can take a look and help. Perhaps @benibus or @felipecrv ?
(or I can get to it later when I have time)

@ClifHouck
Copy link
Contributor Author

ClifHouck commented Mar 6, 2024

I'm currently stuck on getting a new FloatingToFloating test to pass. AFAICT the other existing tests now pass with the new float16 casts. Trying to make progress on getting this last test to pass today.

Copy link

github-actions bot commented Mar 6, 2024

⚠️ GitHub issue #20213 has been automatically assigned in GitHub to PR creator.

@ClifHouck
Copy link
Contributor Author

I think I've fixed things with my latest commit. Turned out a specialization of CastPrimitive was not being called as I expected, and instead the generic template version was being used.

@ClifHouck
Copy link
Contributor Author

I added a few more TODOs but I think this is very close to being ready for review.

Copy link

github-actions bot commented Mar 6, 2024

⚠️ GitHub issue #20213 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting change review Awaiting change review label Mar 28, 2024
@ClifHouck
Copy link
Contributor Author

Could you rebase on main? The failure was fixed by #40794.

Done.

@ianmcook
Copy link
Member

ianmcook commented Apr 4, 2024

Any objections to @kou merging this?

@kou
Copy link
Member

kou commented Apr 4, 2024

Oh, sorry. I missed this. I'll merge this.

@kou kou merged commit 72d20ad into apache:main Apr 4, 2024
37 checks passed
@kou kou removed the awaiting change review Awaiting change review label Apr 4, 2024
@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Apr 4, 2024
@pitrou
Copy link
Member

pitrou commented Apr 4, 2024

Oops, sorry that I couldn't give this a review earlier.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Apr 4, 2024
@ClifHouck
Copy link
Contributor Author

I'll address any additional code-related changes needed here in a follow-up PR.

pitrou pushed a commit that referenced this pull request Apr 10, 2024
…41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: #40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: #41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request Apr 15, 2024
…casts (apache#41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: apache#40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
tolleybot pushed a commit to tmct/arrow that referenced this pull request May 2, 2024
### Rationale for this change

### What changes are included in this PR?

This PR implements casting to and from float16 types using the vendored float16 library included in arrow at `cpp/arrrow/util/float16.*`.

### Are these changes tested?

Unit tests are included in this PR.

### Are there any user-facing changes?

In that casts to and from float16 will now work, yes.

* Closes: apache#20213

### TODO

- [x] Add casts to/from float64.
- [x] String <-> float16 casts.
- [x] Integer <-> float16 casts.
- [x] Tests.
- [x] Update https://github.com/apache/arrow/blob/main/docs/source/status.rst about half float.
- [x] Rebase.
- [x] Run clang format over this PR.
* GitHub Issue: apache#20213

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
tolleybot pushed a commit to tmct/arrow that referenced this pull request May 2, 2024
…casts (apache#41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: apache#40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
tolleybot pushed a commit to tmct/arrow that referenced this pull request May 4, 2024
### Rationale for this change

### What changes are included in this PR?

This PR implements casting to and from float16 types using the vendored float16 library included in arrow at `cpp/arrrow/util/float16.*`.

### Are these changes tested?

Unit tests are included in this PR.

### Are there any user-facing changes?

In that casts to and from float16 will now work, yes.

* Closes: apache#20213

### TODO

- [x] Add casts to/from float64.
- [x] String <-> float16 casts.
- [x] Integer <-> float16 casts.
- [x] Tests.
- [x] Update https://github.com/apache/arrow/blob/main/docs/source/status.rst about half float.
- [x] Rebase.
- [x] Run clang format over this PR.
* GitHub Issue: apache#20213

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
tolleybot pushed a commit to tmct/arrow that referenced this pull request May 4, 2024
…casts (apache#41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: apache#40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
rok pushed a commit to tmct/arrow that referenced this pull request May 8, 2024
### Rationale for this change

### What changes are included in this PR?

This PR implements casting to and from float16 types using the vendored float16 library included in arrow at `cpp/arrrow/util/float16.*`.

### Are these changes tested?

Unit tests are included in this PR.

### Are there any user-facing changes?

In that casts to and from float16 will now work, yes.

* Closes: apache#20213

### TODO

- [x] Add casts to/from float64.
- [x] String <-> float16 casts.
- [x] Integer <-> float16 casts.
- [x] Tests.
- [x] Update https://github.com/apache/arrow/blob/main/docs/source/status.rst about half float.
- [x] Rebase.
- [x] Run clang format over this PR.
* GitHub Issue: apache#20213

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
rok pushed a commit to tmct/arrow that referenced this pull request May 8, 2024
…casts (apache#41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: apache#40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
rok pushed a commit to tmct/arrow that referenced this pull request May 8, 2024
### Rationale for this change

### What changes are included in this PR?

This PR implements casting to and from float16 types using the vendored float16 library included in arrow at `cpp/arrrow/util/float16.*`.

### Are these changes tested?

Unit tests are included in this PR.

### Are there any user-facing changes?

In that casts to and from float16 will now work, yes.

* Closes: apache#20213

### TODO

- [x] Add casts to/from float64.
- [x] String <-> float16 casts.
- [x] Integer <-> float16 casts.
- [x] Tests.
- [x] Update https://github.com/apache/arrow/blob/main/docs/source/status.rst about half float.
- [x] Rebase.
- [x] Run clang format over this PR.
* GitHub Issue: apache#20213

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
rok pushed a commit to tmct/arrow that referenced this pull request May 8, 2024
…casts (apache#41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: apache#40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
### Rationale for this change

### What changes are included in this PR?

This PR implements casting to and from float16 types using the vendored float16 library included in arrow at `cpp/arrrow/util/float16.*`.

### Are these changes tested?

Unit tests are included in this PR.

### Are there any user-facing changes?

In that casts to and from float16 will now work, yes.

* Closes: apache#20213

### TODO

- [x] Add casts to/from float64.
- [x] String <-> float16 casts.
- [x] Integer <-> float16 casts.
- [x] Tests.
- [x] Update https://github.com/apache/arrow/blob/main/docs/source/status.rst about half float.
- [x] Rebase.
- [x] Run clang format over this PR.
* GitHub Issue: apache#20213

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…casts (apache#41084)

### Rationale for this change

Address remaining tasks brought up post-merge in this PR: apache#40067

### What changes are included in this PR?

Adds a couple of tests and cleans up cruft on another test.

### Are these changes tested?

Yes, as these are test-only related changes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#41089 

Authored-by: Clif Houck <me@clifhouck.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Copy link

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 72d20ad.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] Cast to/from halffloat not implemented
4 participants