Float16 design doc #5313

Merged: 5 commits merged into PaddlePaddle:develop from float16_design_doc on Nov 7, 2017

Conversation

kexinzhao (Contributor)

No description provided.

```cpp
float16 float_to_half_rn(float f); // convert to half precision in round-to-nearest-even mode
float half_to_float(float16 h);
```
which provide one-to-one conversion between float32 and float16. These two functions use different conversion routines depending on the current hardware: CUDA/ARM intrinsics will be used when the corresponding hardware is available. When the hardware falls back to non-ARM cpu, software emulation will be performed to do the conversion.
Contributor

"When the hardware falls back to non-ARM cpu" -> "If the hardware or compiler level does not support float32 to float16 conversion."

Contributor Author

done
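
For reference, the software-emulation fallback mentioned above comes down to rebiasing the exponent and rounding the mantissa at the bit level. Below is a minimal, self-contained sketch of a round-to-nearest-even float32 to float16 conversion; the function name `float_to_half_rn_sw` and the standalone layout are illustrative only, not necessarily how the Paddle implementation is structured:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Illustrative software emulation of float32 -> float16 (IEEE binary16)
// conversion in round-to-nearest-even mode. Returns the raw 16-bit pattern.
uint16_t float_to_half_rn_sw(float f) {
  uint32_t x;
  std::memcpy(&x, &f, sizeof(x));                  // reinterpret the float32 bits
  uint32_t sign = (x >> 16) & 0x8000u;             // sign bit in float16 position
  int exp = static_cast<int>((x >> 23) & 0xffu);   // biased float32 exponent
  uint32_t mant = x & 0x7fffffu;                   // 23-bit mantissa

  if (exp == 0xff) {                               // Inf or NaN
    return static_cast<uint16_t>(sign | 0x7c00u | (mant ? 0x0200u : 0u));
  }
  int new_exp = exp - 127 + 15;                    // rebias: float32 (127) -> float16 (15)
  if (new_exp >= 0x1f) {                           // too large: overflow to Inf
    return static_cast<uint16_t>(sign | 0x7c00u);
  }
  if (new_exp <= 0) {                              // float16 subnormal or zero
    if (new_exp < -10) return static_cast<uint16_t>(sign);  // underflow to +/- 0
    mant |= 0x800000u;                             // restore the implicit leading 1
    int shift = 14 - new_exp;
    uint32_t half_mant = mant >> shift;
    uint32_t rem = mant & ((1u << shift) - 1u);
    uint32_t halfway = 1u << (shift - 1);
    if (rem > halfway || (rem == halfway && (half_mant & 1u))) half_mant++;
    return static_cast<uint16_t>(sign | half_mant);
  }
  // Normalized case: round the 23-bit mantissa down to 10 bits (RNE).
  uint32_t half_mant = mant >> 13;
  uint32_t rem = mant & 0x1fffu;
  if (rem > 0x1000u || (rem == 0x1000u && (half_mant & 1u))) half_mant++;
  // '+' (not '|') lets a mantissa carry propagate into the exponent field.
  return static_cast<uint16_t>(sign | ((static_cast<uint32_t>(new_exp) << 10) + half_mant));
}

int main() {
  std::printf("1.0f    -> 0x%04x\n", float_to_half_rn_sw(1.0f));     // 0x3c00
  std::printf("65504.0 -> 0x%04x\n", float_to_half_rn_sw(65504.0f)); // 0x7bff (max half)
  return 0;
}
```

On hardware that supports half precision natively, this whole routine is replaced by a single intrinsic, for example CUDA's `__float2half_rn`.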


A brief survey of float16 support on different hardware can be found [here](https://github.com/PaddlePaddle/Paddle/issues/4853). A brief survey of existing float16 implementations can be found [here](https://github.com/Xreki/Xreki.github.io/blob/master/multi_data_types_in_dl_framework/ppt/float16_and_quantized_type.md).

There are various natively supported float16 implementations on different hardware and linear algebra libraries, including `half` on CUDA, `float16_t` on ARM processors, and `Eigen::half` in Eigen.
Contributor

I think a more detailed description is needed here. We need to describe float16 support at three levels: hardware, compiler, and library.

  • For the nvcc compiler: the __half type is supported since CUDA 7.5.
  • For NVIDIA GPUs: maybe after sm_6.0 (or sm_5.3?).
  • For gcc/clang? On mobile we usually use the clang compiler.
  • For libraries: which libraries currently support float16 calculations?
    This is important because if we upgrade the environment, we need to know what the minimum environment to support is.

Contributor Author

Great point! A detailed description has been added. Thanks!
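
To make the relationship between these native types concrete, here is a rough sketch of how a unified float16 type could wrap them. The guard macros (WITH_CUDA_HALF, WITH_ARM_FP16, WITH_EIGEN_HALF) are placeholders rather than Paddle's real build flags, the member names are illustrative, and CUDA `__host__ __device__` qualifiers are omitted for brevity. The key observation is that each native type stores the same raw 16-bit IEEE binary16 pattern, so the conversions are bit copies:

```cpp
#include <cstdint>
#include <cstring>

// Sketch of a unified float16 type wrapping the native half-precision
// representations named above. Macro and member names are placeholders.

#ifdef WITH_CUDA_HALF
#include <cuda_fp16.h>   // __half (available since CUDA 7.5)
#endif
#ifdef WITH_ARM_FP16
#include <arm_neon.h>    // float16_t on ARM with FP16 support
#endif
#ifdef WITH_EIGEN_HALF
#include <Eigen/Core>    // Eigen::half
#endif

struct float16 {
  uint16_t x = 0;  // raw IEEE binary16 bits, identical across all backends

  float16() = default;

#ifdef WITH_CUDA_HALF
  explicit float16(const __half& h) { std::memcpy(&x, &h, sizeof(x)); }
  explicit operator __half() const {
    __half h;
    std::memcpy(&h, &x, sizeof(h));
    return h;
  }
#endif

#ifdef WITH_ARM_FP16
  explicit float16(const float16_t& h) { std::memcpy(&x, &h, sizeof(x)); }
  explicit operator float16_t() const {
    float16_t h;
    std::memcpy(&h, &x, sizeof(h));
    return h;
  }
#endif

#ifdef WITH_EIGEN_HALF
  // Eigen::half also stores the raw 16-bit pattern, so this is a bit copy.
  explicit float16(const Eigen::half& h) { std::memcpy(&x, &h, sizeof(x)); }
  explicit operator Eigen::half() const {
    Eigen::half h;
    std::memcpy(&h, &x, sizeof(h));
    return h;
  }
#endif
};
```

Because all three native types share the binary16 memory layout, a buffer of such a wrapper type should be passable to the corresponding libraries without any per-element conversion.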

@kexinzhao kexinzhao merged commit 4d42215 into PaddlePaddle:develop Nov 7, 2017
@kexinzhao kexinzhao deleted the float16_design_doc branch November 7, 2017 08:07