
cudnn FE 1.7.0 Release

@Anerudhan released this on 23 Sep 20:53 · commit de355c7

cudnn FE 1.7.0 Release notes:

New API

  • Kernel cache support for dynamic graphs: added new APIs to enable kernel-cache support for graphs with dynamic shapes. Please refer to the documentation for API details.

Added the examples Convolution fprop dynamic shape, CSBR Graph dynamic shape, Matmul dynamic shape, and Bias + Matmul dynamic shape to showcase the use of dynamic shapes and the kernel cache; a minimal sketch follows.
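A minimal sketch of the kernel-cache flow, assuming the v1.x graph API; the handle, tensor definitions, and error handling are elided, and names such as `build_graph` are illustrative:

```cpp
#include <memory>
#include <cudnn_frontend.h>

namespace fe = cudnn_frontend;

// One kernel cache may be shared across graphs that differ only in shape,
// so later builds can reuse kernels compiled for earlier shapes.
auto kernel_cache = std::make_shared<fe::KernelCache>();

std::shared_ptr<fe::graph::Graph> build_graph(cudnnHandle_t handle, int64_t batch) {
    auto graph = std::make_shared<fe::graph::Graph>();
    graph->set_io_data_type(fe::DataType_t::HALF)
         .set_compute_data_type(fe::DataType_t::FLOAT);

    // Opt in to dynamic shapes and attach the shared kernel cache.
    graph->set_dynamic_shape_enabled(true).set_kernel_cache(kernel_cache);

    // ... define tensors (using `batch`) and ops, then validate/build ...
    return graph;
}
```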

  • Two new APIs are introduced to describe a plan in terms of its engine number and knobs:

```cpp
error_t
get_plan_name(std::string &name) const;

error_t
get_plan_name_at_index(int64_t plan_index, std::string &name) const;
```

Note: the returned name can later be passed to deselect_plan_by_name if a particular plan runs into errors.
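A minimal usage sketch, assuming `graph` is a built cudnn_frontend::graph::Graph and that deselection is followed by rebuilding plans; error handling is elided:

```cpp
std::string selected;
auto status = graph.get_plan_name(selected);  // name of the currently selected plan

std::string candidate;
status = graph.get_plan_name_at_index(/*plan_index=*/0, candidate);

// If a plan later misbehaves, its saved name can be used to exclude it.
graph.deselect_plan_by_name(candidate);
```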

  • Added an API to query a tensor's attributes from its UID in a graph:

```cpp
query_tensor_with_uid(int64_t const uid, Tensor_attributes &tensor) const;
```
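A minimal sketch, assuming `graph` contains a tensor that was tagged with UID 1 (the UID value is illustrative):

```cpp
fe::graph::Tensor_attributes tensor;
auto status = graph.query_tensor_with_uid(/*uid=*/1, tensor);
if (status.is_good()) {
    auto dims    = tensor.get_dim();     // recovered dimensions
    auto strides = tensor.get_stride();  // recovered strides
}
```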

Improvements

  • The sdpa fp16 bprop node can now compute dbias when the padding mask is enabled (requires cudnn 9.4.0 and above), as sketched below.
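A minimal sketch of requesting dbias alongside a padding mask on the bprop node, assuming `graph`, the forward tensors (`q`, `k`, `v`, `o`, `dO`, `stats`, `bias`, `dbias`), and the per-batch sequence-length tensors (`seq_q`, `seq_kv`) already exist; names are illustrative:

```cpp
auto sdpa_bwd_opts = fe::graph::SDPA_backward_attributes()
                         .set_name("sdpa_bprop")
                         .set_bias(bias)          // forward bias tensor
                         .set_dbias(dbias)        // gradient w.r.t. the bias
                         .set_padding_mask(true)  // enable padding mask
                         .set_seq_len_q(seq_q)
                         .set_seq_len_kv(seq_kv);

auto [dQ, dK, dV] = graph.sdpa_backward(q, k, v, o, dO, stats, sdpa_bwd_opts);
```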

  • The sdpa fp8 (forward and bprop) nodes now support optional bias, dropout, and padding mask (requires cudnn 9.4.0 and above).

  • The matmul fp8 node can now accept M, N, K overrides; see the sketch below.
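A minimal sketch of passing the overrides, assuming `m_override`, `n_override`, and `k_override` are previously created tensors holding the effective GEMM dimensions, and that `a`, `b`, and `graph` already exist; names are illustrative:

```cpp
auto mm_opts = fe::graph::Matmul_attributes()
                   .set_name("fp8_matmul")
                   .set_m_override(m_override)
                   .set_n_override(n_override)
                   .set_k_override(k_override);

auto c = graph.matmul(a, b, mm_opts);
```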

  • Added new Python notebooks for implementing BatchNorm and BatchNorm bprop using cuDNN.

  • Updated benchmark numbers with cudnn 9.4.0 for fp16 and fp8 data types.

  • Fixed compilation issues when NV_CUDNN_DISABLE_EXCEPTION is enabled.

Bug fixes

  • Fixed a crash when the output dimension of the dgrad node is not specified; an error message is now returned instead.

  • Fixed incorrect stride inference for the SDPA stats tensor.

  • Fixed a bug in the sdpa test when sliding window attention is enabled and the query sequence length (s_q) is greater than the key sequence length (s_kv). This case is now explicitly unsupported.