[bf16] add bf16 kernel: elementwise_max #39461
Conversation
Thanks for your contribution!
… dev/bf16_op_5
… dev/bf16_op_5
@unittest.skipIf(
    core.is_compiled_with_cuda() and get_cuda_runtime_version() < 11000,
not core.is_compiled_with_cuda() or core.cudnn_version() < 8100
Because a CPU bf16 kernel is now supported, the condition is changed to `core.is_compiled_with_cuda() and core.cudnn_version() < 8100`, so CPU-only builds no longer skip the test.
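To illustrate the effect of the changed guard, here is a small sketch with hypothetical stand-ins for `core.is_compiled_with_cuda()` and `core.cudnn_version()` (the real values come from the Paddle build):

```python
def skip_old_condition(has_cuda, cudnn_version):
    # Old guard: skips on every CPU-only build, so a CPU bf16
    # kernel would never be exercised there.
    return not has_cuda or cudnn_version < 8100

def skip_new_condition(has_cuda, cudnn_version):
    # New guard: only skips when CUDA is present but cuDNN is too old;
    # CPU-only builds now run the bf16 test.
    return has_cuda and cudnn_version < 8100

print(skip_old_condition(False, 0))  # True  (CPU-only build is skipped)
print(skip_new_condition(False, 0))  # False (CPU-only build runs the test)
```

Under `@unittest.skipIf`, a `True` condition skips the test, so the new expression keeps the test alive on CPU-only builds while still guarding old cuDNN versions.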
@@ -272,7 +274,7 @@ def convert_float_to_uint16(float_list, data_format="NCHW"):
 def convert_uint16_to_float(in_list):
     in_list = np.asarray(in_list)
     out = np.vectorize(
-        lambda x: struct.unpack('<f', struct.pack('<I', x << 16))[0],
+        lambda x: struct.unpack('<f', struct.pack('<I', np.uint32(x) << np.uint32(16)))[0],
why update this?
This resolves the Windows-OpenBLAS CI error.
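The fix works because bfloat16 stores the high 16 bits of an IEEE-754 float32, so converting back is a left shift by 16 followed by a bit reinterpretation. Without the explicit casts, NumPy's type promotion can yield a signed 32-bit result on platforms where the default integer is 32-bit (such as Windows), and `struct.pack('<I', ...)` rejects negative values. A self-contained sketch of the corrected helper:

```python
import struct
import numpy as np

def convert_uint16_to_float(in_list):
    """Reinterpret bfloat16 values (stored as uint16) as float32.

    The np.uint32 casts keep the shift unsigned; on some platforms
    NumPy would otherwise promote x << 16 to a signed 32-bit integer,
    which can go negative and make struct.pack('<I', ...) raise.
    """
    in_list = np.asarray(in_list)
    out = np.vectorize(
        lambda x: struct.unpack('<f', struct.pack('<I', np.uint32(x) << np.uint32(16)))[0],
        otypes=[np.float32])(in_list.flatten())
    return np.reshape(out, in_list.shape)

# 1.0f has bit pattern 0x3F800000, so its bfloat16 encoding is 0x3F80;
# likewise 2.0f (0x40000000) encodes as 0x4000.
print(convert_uint16_to_float([0x3F80, 0x4000]))  # 1.0 and 2.0
```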
LGTM
PR types
New features
PR changes
OPs
Describe
Add a bf16 kernel for `elementwise_max`.