Release v3.3.1 · oneapi-src/oneDNN

This is a patch release containing the following changes to v3.3:

Fixed int8 convolution accuracy issue on Intel GPUs (09c87c7)
Switched internal stream to in-order mode for NVIDIA and AMD GPUs to avoid synchronization issues (db01d62)
Fixed runtime error for avgpool_bwd operation in Graph API (d025ef6, 9e0602a, e0dc1b3)
Fixed benchdnn error reporting for some Graph API cases (98dc9db)
Fixed accuracy issue in experimental Graph Compiler for int8 MHA variant from StarCoder model (5476ef7)
Fixed incorrect results for layer normalization with trivial dimensions on Intel GPUs (a2ec0a0)
Removed redundant synchronization for out-of-order SYCL queues (a96e9b1)
Fixed runtime error in experimental Graph Compiler for int8 MLP subgraph from LLAMA model (595543d)
Fixed segmentation fault in experimental Graph Compiler for fp32 MLP subgraph (4207105)
Fixed incorrect results in experimental Graph Compiler for MLP subgraph (57e14b5)
Fixed an issue where the f16 inner product primitive with s8 output returned unimplemented on Intel GPUs (bf12207, 800b5e9, ec7054a)
Fixed incorrect results for int8 deconvolution with zero-points on processors with Intel AMX instruction set support (55d2cec)
