Float16 design doc #5313
Conversation
doc/design/float16.md
Outdated
```
float16 float_to_half_rn(float f); // convert to half precision in round-to-nearest-even mode
float half_to_float(float16 h);
```
which provides one-to-one conversion between float32 and float16. These two functions use different conversion routines depending on the current hardware. CUDA/ARM intrinsics will be used when the corresponding hardware is available. When the hardware falls back to non-ARM cpu, software emulation will be performed to do the conversion.
When the hardware falls back to non-ARM cpu -> If the hardware or compiler level does not support float32 to float16 conversion.
done
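For reference, here is a minimal sketch of how such a conversion routine might dispatch between a hardware intrinsic and a software fallback. The guard macros, the `float_to_half_sw` helper, and the flush-to-zero handling of subnormal results are simplifications made for this example and are not taken from the actual Paddle implementation; `half_to_float` would follow the same pattern in reverse.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__CUDA_ARCH__)
#include <cuda_fp16.h>  // provides __half and __float2half (CUDA >= 7.5)
#endif

// Minimal container for the raw 16-bit half-precision payload.
struct float16 {
  uint16_t x;
};

// Portable software conversion with round-to-nearest-even (simplified:
// subnormal results are flushed to zero to keep the sketch short).
static uint16_t float_to_half_sw(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  uint16_t sign = static_cast<uint16_t>((bits >> 16) & 0x8000u);
  int32_t exp = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
  uint32_t mant = bits & 0x7FFFFFu;
  if (exp >= 31) {  // overflow, infinity, or NaN
    uint16_t nan_bit = (exp == 143 && mant != 0) ? 0x200u : 0u;
    return static_cast<uint16_t>(sign | 0x7C00u | nan_bit);
  }
  if (exp <= 0) return sign;  // too small for a normal half: flush to zero
  uint32_t half = (static_cast<uint32_t>(exp) << 10) | (mant >> 13);
  uint32_t rest = mant & 0x1FFFu;  // the 13 bits dropped by the shift
  if (rest > 0x1000u || (rest == 0x1000u && (half & 1u))) {
    half += 1;  // round to nearest even; a carry correctly bumps the exponent
  }
  return static_cast<uint16_t>(sign | half);
}

// Dispatch: prefer a native conversion, otherwise emulate in software.
float16 float_to_half_rn(float f) {
  float16 res;
#if defined(__CUDA_ARCH__)
  __half tmp = __float2half(f);            // CUDA intrinsic, device code only
  std::memcpy(&res.x, &tmp, sizeof(res.x));
#elif defined(__ARM_FP16_FORMAT_IEEE)
  __fp16 tmp = static_cast<__fp16>(f);     // compiler-provided ARM half type
  std::memcpy(&res.x, &tmp, sizeof(res.x));
#else
  res.x = float_to_half_sw(f);             // software emulation fallback
#endif
  return res;
}
```

On hardware that supports both paths, a unit test could compare the software route against the intrinsic route to catch rounding discrepancies.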
doc/design/float16.md
Outdated
A brief survey of float16 support on different hardware can be found [here](https://github.com/PaddlePaddle/Paddle/issues/4853). A brief survey of existing float16 implementations can be found [here](https://github.com/Xreki/Xreki.github.io/blob/master/multi_data_types_in_dl_framework/ppt/float16_and_quantized_type.md).

There are various natively supported float16 implementations on different hardware and linear algebra libraries, including half on CUDA, float16_t on ARM processors, and Eigen::half in Eigen.
I think a more detailed description is needed here. We should describe float16 support at three levels: hardware, compiler, and library.
- For the nvcc compiler, the __half type is supported starting with CUDA 7.5.
- For NVIDIA GPUs, maybe starting with sm_6.0 (or sm_5.3?).
- What about gcc/clang? On mobile, the compiler is usually clang.
- For libraries: which libraries currently support float16 calculations?
This is important because if we upgrade the environment, we need to know what the minimum environment to support is.
Great point! A detailed description has been added. Thanks!
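To make the three levels concrete, the sketch below shows the kind of compile-time probes a float16 header might use to detect native support. The WITH_* macro names are invented for this example, and the CUDA 7.5 / sm_53 thresholds echo the discussion above rather than an authoritative compatibility table.

```cpp
// Illustrative compile-time probes; the WITH_* names are not Paddle's.

// Compiler/toolkit level: nvcc ships the __half type in <cuda_fp16.h>
// starting with CUDA 7.5.
#if defined(__CUDACC__)
#include <cuda.h>  // defines CUDA_VERSION
#if CUDA_VERSION >= 7050
#define WITH_CUDA_HALF_TYPE 1
#endif
#endif

// Hardware level: half-precision arithmetic intrinsics require a recent
// GPU architecture (sm_53 or later) when compiling device code.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 530
#define WITH_CUDA_HALF_MATH 1
#endif

// Compiler level on ARM: gcc/clang expose __fp16 (and float16_t through
// <arm_neon.h>) when the target uses the IEEE fp16 format.
#if defined(__ARM_FP16_FORMAT_IEEE)
#define WITH_ARM_NATIVE_FP16 1
#endif

// Library level: Eigen 3.3+ provides Eigen::half via <Eigen/Core>, which
// works as a portable software fallback on any CPU.
#include <Eigen/Core>
using fallback_half = Eigen::half;
```

Deciding the minimum supported environment then amounts to choosing which of these probes the build is allowed to fail.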