[CUDA] Build nhwc ops by default (microsoft#22648)
### Description

* Build CUDA NHWC ops by default.
* Deprecate `--enable_cuda_nhwc_ops` in build.py and add
`--disable_cuda_nhwc_ops` option

Note that NHWC ops require cuDNN 9.x. If you build with cuDNN 8, NHWC ops
will be disabled automatically.
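
The flag behavior described above can be sketched as follows. This is a minimal illustration, not the actual `build.py` code: the helper names `parse_args` and `nhwc_ops_enabled` and the `cudnn_major` parameter are hypothetical, and how the real build detects the cuDNN version is not stated here.

```python
import argparse
import warnings


def parse_args(argv):
    """Parse only the flags relevant to this sketch (hypothetical subset)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--use_cuda", action="store_true")
    # Kept so that existing command lines still parse; it no longer changes anything.
    parser.add_argument("--enable_cuda_nhwc_ops", action="store_true",
                        help="Deprecated: NHWC ops are built by default.")
    parser.add_argument("--disable_cuda_nhwc_ops", action="store_true",
                        help="Opt out of building CUDA NHWC ops.")
    return parser.parse_args(argv)


def nhwc_ops_enabled(args, cudnn_major):
    """Decide whether NHWC ops are built (cudnn_major is an assumed input)."""
    if args.enable_cuda_nhwc_ops:
        warnings.warn("--enable_cuda_nhwc_ops is deprecated; NHWC ops are on by default.",
                      DeprecationWarning, stacklevel=2)
    if not args.use_cuda or args.disable_cuda_nhwc_ops:
        return False
    # With cuDNN 8 the ops are disabled automatically.
    return cudnn_major >= 9
```

Under these assumptions, a CUDA build with cuDNN 9 enables NHWC ops without any extra flag, while passing `--disable_cuda_nhwc_ops` or building against cuDNN 8 turns them off.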

### Motivation and Context

In general, NHWC is faster than NCHW for convolution on NVIDIA GPUs with
Tensor Cores, and this could improve performance for vision models.

This is the first step toward preferring NHWC for CUDA in the 1.21 release. The next
step is to run tests on popular vision models. If it helps on most
models and devices, set `prefer_nhwc=1` as the default CUDA provider option.
tianleiwu authored and ankitm3k committed Dec 11, 2024
1 parent 6731c0a commit 6e5d9b8
Showing 1 changed file with 1 addition and 1 deletion.
@@ -123,7 +123,7 @@ stages:
--parallel \
--build_wheel \
--enable_onnx_tests --use_cuda --cuda_version=11.8 --cuda_home=/usr/local/cuda-11.8 --cudnn_home=/usr/local/cuda-11.8 \
- --enable_cuda_profiling --enable_cuda_nhwc_ops \
+ --enable_cuda_profiling \
--enable_pybind --build_java \
--use_cache \
--cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=75;86' ; \