Update Infer proto docs to mention BFloat16 type (kubeflow#2159)
Signed-off-by: Ryan McCormick <rmccormick@nvidia.com>
rmccorm4 authored Apr 28, 2022
1 parent 81d81b3 commit 3dc5bfb
Showing 2 changed files with 12 additions and 15 deletions.
10 changes: 4 additions & 6 deletions docs/predict-api/v2/grpc_predict_v2.proto
@@ -194,9 +194,8 @@ message ModelInferRequest
// what is expected by the tensor's shape and data type. The raw
// data must be the flattened, one-dimensional, row-major order of
// the tensor elements without any stride or padding between the
- // elements. Note that the FP16 data type must be represented as raw
- // content as there is no specific data type for a 16-bit float
- // type.
+ // elements. Note that the FP16 and BF16 data types must be represented as
+ // raw content as there is no specific data type for a 16-bit float type.
//
// If this field is specified then InferInputTensor::contents must
// not be specified for any input tensor.
@@ -249,9 +248,8 @@ message ModelInferResponse
// what is expected by the tensor's shape and data type. The raw
// data must be the flattened, one-dimensional, row-major order of
// the tensor elements without any stride or padding between the
- // elements. Note that the FP16 data type must be represented as raw
- // content as there is no specific data type for a 16-bit float
- // type.
+ // elements. Note that the FP16 and BF16 data types must be represented as
+ // raw content as there is no specific data type for a 16-bit float type.
//
// If this field is specified then InferOutputTensor::contents must
// not be specified for any output tensor.
17 changes: 8 additions & 9 deletions docs/predict-api/v2/required_api.md
@@ -421,9 +421,9 @@ Tensor data given explicitly is provided in a JSON array. Each element
of the array may be an integer, floating-point number, string or
boolean value. The server can decide to coerce each element to the
required type or return an error if an unexpected value is
- received. Note that fp16 is problematic to communicate explicitly
- since there is not a standard fp16 representation across backends nor
- typically the programmatic support to create the fp16 representation
+ received. Note that fp16 and bf16 are problematic to communicate explicitly
+ since there is not a standard fp16/bf16 representation across backends nor
+ typically the programmatic support to create the fp16/bf16 representation
for a JSON number.

For example, the 2-dimensional matrix:
@@ -667,9 +667,8 @@ failure. The request and response messages for ModelInfer are:
// what is expected by the tensor's shape and data type. The raw
// data must be the flattened, one-dimensional, row-major order of
// the tensor elements without any stride or padding between the
- // elements. Note that the FP16 data type must be represented as raw
- // content as there is no specific data type for a 16-bit float
- // type.
+ // elements. Note that the FP16 and BF16 data types must be represented as
+ // raw content as there is no specific data type for a 16-bit float type.
//
// If this field is specified then InferInputTensor::contents must
// not be specified for any input tensor.
@@ -722,9 +721,8 @@ failure. The request and response messages for ModelInfer are:
// what is expected by the tensor's shape and data type. The raw
// data must be the flattened, one-dimensional, row-major order of
// the tensor elements without any stride or padding between the
- // elements. Note that the FP16 data type must be represented as raw
- // content as there is no specific data type for a 16-bit float
- // type.
+ // elements. Note that the FP16 and BF16 data types must be represented as
+ // raw content as there is no specific data type for a 16-bit float type.
//
// If this field is specified then InferOutputTensor::contents must
// not be specified for any output tensor.
@@ -868,3 +866,4 @@ of each type, in bytes.
| FP32 | 4 |
| FP64 | 8 |
| BYTES | Variable (max 2<sup>32</sup>) |
+ | BF16 | 2 |
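
The changed comments say that BF16 tensors must travel as raw content: a flattened, row-major byte buffer with 2 bytes per element and no padding. The following is a minimal sketch of what a client might do to build such a buffer, using only the Python standard library. The helper names (`float_to_bf16_bits`, `pack_bf16_raw`, `unpack_bf16_raw`) are hypothetical, and the little-endian byte order and round-to-nearest-even conversion are assumptions that this commit does not state.

```python
import struct


def float_to_bf16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern for value (hypothetical helper).

    bfloat16 is the upper 16 bits of an IEEE 754 float32, so the value is
    first converted to float32 and then rounded to nearest even.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        # NaN: force a mantissa bit so truncation cannot turn it into infinity.
        return (bits >> 16) | 0x0040
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def pack_bf16_raw(rows) -> bytes:
    """Flatten a 2-D list in row-major order and pack it as BF16 bytes.

    Assumption: little-endian byte order, 2 bytes per element, no padding.
    """
    flat = [v for row in rows for v in row]
    return struct.pack(f"<{len(flat)}H", *(float_to_bf16_bits(v) for v in flat))


def unpack_bf16_raw(raw: bytes) -> list:
    """Decode little-endian BF16 bytes back into Python floats."""
    patterns = struct.unpack(f"<{len(raw) // 2}H", raw)
    return [struct.unpack("<f", struct.pack("<I", p << 16))[0] for p in patterns]


matrix = [[1.0, -2.0], [0.5, 3.0]]
raw = pack_bf16_raw(matrix)
```

Here `raw` holds 8 bytes for the 4 elements, matching the 2-byte size in the BF16 table row, and `unpack_bf16_raw(raw)` recovers the flattened values exactly because each of them is representable in bfloat16. In a real client such a buffer would be supplied as the raw content for the tensor in place of `InferInputTensor::contents`.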
