fuse conv and batch_norm #3769

Summary:

When `batchnorm` is applied after `conv` in a model, we can fold the `batchnorm` weight and bias into the `conv` weight and bias, and then remove the `batchnorm` node. We implement this fusion as a graph transform and apply it in `vulkan_preprocess.py`. The change reduces both latency and resident memory.

We illustrate the improvement with Mobilenet_v2:

- The model has 52 conv+batch_norm instances. After fusing, when we export the model as in D57475757, `_native_batch_norm_legit_no_training` no longer appears in the graph.
- Performance improves as shown below. In particular, inference latency drops from 161 ms to 148 ms and loading time drops from 473 ms to 380 ms. vmRss decreases in both phases, while vmaBlock is unchanged.

| Fuse | Loading (ms) | Loading vmRss (KB) | Loading vmaBlock (KB) | Inference (ms) | Inference vmRss (KB) | Inference vmaBlock (KB) |
| ---- | ------------ | ------------------ | --------------------- | -------------- | -------------------- | ----------------------- |
| Yes  | 380          | 22928              | 65536                 | 148            | 24296                | 65536                   |
| No   | 473          | 26036              | 65536                 | 161            | 27416                | 65536                   |

Differential Revision: D57895439
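For readers unfamiliar with the fusion, the arithmetic behind folding an inference-mode batch norm into the preceding convolution can be sketched as follows. This is a minimal NumPy illustration of the standard folding formula, not the graph transform from this PR; the function names `fold_batch_norm_into_conv`, `conv2d`, and `batch_norm_inference` are hypothetical helpers written only for this example, and the convolution is a naive stride-1, unpadded one.

```python
import numpy as np


def fold_batch_norm_into_conv(weight, bias, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold inference-mode batch norm parameters into conv weight and bias.

    weight: (out_ch, in_ch, kh, kw); all other arrays: (out_ch,).
    bias may be None, in which case it is treated as zeros.
    """
    if bias is None:
        bias = np.zeros(weight.shape[0], dtype=weight.dtype)
    # Per-output-channel scale applied by batch norm.
    scale = gamma / np.sqrt(running_var + eps)
    fused_weight = weight * scale[:, None, None, None]
    fused_bias = (bias - running_mean) * scale + beta
    return fused_weight, fused_bias


def conv2d(x, weight, bias):
    """Naive stride-1, unpadded convolution (hypothetical helper). x: (in_ch, H, W)."""
    out_ch, _, kh, kw = weight.shape
    _, h, w = x.shape
    out = np.zeros((out_ch, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + kh, j:j + kw]
            # Contract (in_ch, kh, kw) of each filter with the patch.
            out[:, i, j] = np.tensordot(weight, patch, axes=3) + bias
    return out


def batch_norm_inference(y, gamma, beta, running_mean, running_var, eps=1e-5):
    """Per-channel batch norm using running statistics (hypothetical helper)."""
    scale = gamma / np.sqrt(running_var + eps)
    return (y - running_mean[:, None, None]) * scale[:, None, None] + beta[:, None, None]
```

With these helpers, `conv2d(x, *fold_batch_norm_into_conv(...))` should match `batch_norm_inference(conv2d(x, weight, bias), ...)` up to floating-point error, which is why the `batchnorm` node can be dropped after the fold.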


Commits on May 29, 2024