Reuse input buffer in lowering to krnl #2939

chentong319 · 2024-09-11T17:58:00Z

Previously, a new memref was always created when ONNX was lowered to Krnl.
However, an input buffer can be reused as the output if the input has only one use and its shape is the same as the output's. This PR tries the idea on element-wise operations.
By default, buffer reuse is turned off.
This feature was tested with Roberta-base-11.onnx. About 80% of mallocs (by either count or memory size) can be avoided with buffer reuse. The buffer reuse code generates correct results but did not bring a notable performance improvement, with SIMD either on or off.
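
For readers skimming the thread, the reuse condition described above (single use, same shape as the output) can be sketched in plain C++. This is only an illustration and not the code in this PR: `Operand` and `findReusableOperand` are hypothetical names, and the real lowering works on MLIR values and memref types rather than these stand-in structs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an operand of an element-wise op.
struct Operand {
  std::vector<int64_t> shape; // static shape of the operand's buffer
  int useCount;               // how many operations use this value
};

// Returns the index of the first operand whose buffer could double as the
// output buffer, or -1 if no operand qualifies.
int findReusableOperand(const std::vector<Operand> &operands,
    const std::vector<int64_t> &outputShape) {
  for (std::size_t i = 0; i < operands.size(); ++i) {
    const Operand &operand = operands[i];
    // Reuse is only considered when nobody else reads the input afterwards
    // and the buffer already has the shape the output needs.
    if (operand.useCount == 1 && operand.shape == outputShape)
      return static_cast<int>(i);
  }
  return -1;
}
```

An operand with two users is skipped even if its shape matches, because overwriting it would change what the other user reads.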

Signed-off-by: chentong319 <chentong@us.ibm.com>

AlexandreEichenberger · 2024-09-12T12:38:26Z

src/Compiler/CompilerOptions.cpp

@@ -212,6 +213,16 @@ static llvm::cl::opt<bool, true> disableKrnlOpFusionOpt(
    llvm::cl::location(disableKrnlOpFusion), llvm::cl::init(false),
    llvm::cl::cat(OnnxMlirCommonOptions));

+static llvm::cl::opt<bool, true> disableKrnlBufferReuseOpt(


NIT: generally "disable" is for a function that is default on, "enable" is for one that is default off. I think you want "enable" here.

AlexandreEichenberger · 2024-09-12T12:42:53Z

src/Conversion/ONNXToKrnl/Math/Elementwise.cpp

+    ValueRange generatedOperands, MemRefType outputMemRefType, DimsExprRef dims,
+    int64_t alignment, int64_t VL) {
+
+  // By default, disableKrnlBufferReuse is true. Simply allocate a memref.


This code could be simplified as follows

if (!disableKrnlBufferReuse) {
  int indexToReuse = xxx
  if (indexToReuse != -1)
    return xxx
}
// no reuse, alloc

Fixed. Thanks for the suggestion.
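
A self-contained sketch of the control flow suggested above, with an early return on reuse and a single fallback allocation, might look like the following. Everything here is an assumption for illustration: `Buffer`, `Candidate`, `indexToReuse`, and `allocOrReuse` are hypothetical, and the global flag only mimics the option name from this PR; none of it is the actual Elementwise.cpp code.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

using Buffer = std::vector<float>;

// Hypothetical stand-in for the compiler option; reuse is off by default.
bool disableKrnlBufferReuse = true;

// Hypothetical candidate: an input buffer and its number of uses.
struct Candidate {
  std::shared_ptr<Buffer> buffer;
  int useCount;
};

// Hypothetical helper: index of a single-use input of the right size, or -1.
int indexToReuse(const std::vector<Candidate> &inputs, std::size_t outputSize) {
  for (std::size_t i = 0; i < inputs.size(); ++i)
    if (inputs[i].useCount == 1 && inputs[i].buffer->size() == outputSize)
      return static_cast<int>(i);
  return -1;
}

std::shared_ptr<Buffer> allocOrReuse(
    const std::vector<Candidate> &inputs, std::size_t outputSize) {
  if (!disableKrnlBufferReuse) {
    int index = indexToReuse(inputs, outputSize);
    if (index != -1)
      return inputs[index].buffer; // reuse the input buffer as the output
  }
  // No reuse: allocate a fresh buffer.
  return std::make_shared<Buffer>(outputSize);
}
```

The point of the shape is that the allocation appears once, at the end, instead of being repeated in each branch.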

AlexandreEichenberger · 2024-09-12T12:43:56Z

test/mlir/conversion/onnx_to_krnl/onnx_lowering_reuse.mlir

+// RUN: onnx-mlir-opt --disable-krnl-op-fusion=true --disable-krnl-buffer-reuse=false --shape-inference --convert-onnx-to-krnl --canonicalize %s -split-input-file | FileCheck %s
+
+// -----
+func.func @test_reuse(%arg0: tensor<1024xf32>, %arg1: tensor<1024xf32>) -> tensor<1024xf32> {


Nice concise test, we should all aspire to do that!

Signed-off-by: chentong319 <chentong@us.ibm.com>

AlexandreEichenberger

LGTM, thanks for this exploratory work.

jenkins-droid · 2024-09-13T17:52:21Z

Jenkins Linux amd64 Build #15619 [push] Reuse input buffer in lo... started at 12:52

jenkins-droid · 2024-09-13T17:52:22Z

Jenkins Linux ppc64le Build #14649 [push] Reuse input buffer in lo... started at 14:03

jenkins-droid · 2024-09-13T17:52:23Z

Jenkins Linux s390x Build #15622 [push] Reuse input buffer in lo... started at 13:52

jenkins-droid · 2024-09-13T19:10:40Z

Jenkins Linux amd64 Build #15619 [push] Reuse input buffer in lo... passed after 1 hr 18 min

jenkins-droid · 2024-09-13T19:53:56Z

Jenkins Linux s390x Build #15622 [push] Reuse input buffer in lo... passed after 2 hr 1 min

jenkins-droid · 2024-09-13T20:12:08Z

Jenkins Linux ppc64le Build #14649 [push] Reuse input buffer in lo... passed after 2 hr 19 min


chentong319 added 10 commits September 6, 2024 14:25

first step

1e991e4

Signed-off-by: chentong319 <chentong@us.ibm.com>

cpu

4d728e8

Signed-off-by: chentong319 <chentong@us.ibm.com>

options

4d4e938

Signed-off-by: chentong319 <chentong@us.ibm.com>

unify

c911e40

Signed-off-by: chentong319 <chentong@us.ibm.com>

simd

fa390ab

Signed-off-by: chentong319 <chentong@us.ibm.com>

comments

05b6aa0

Signed-off-by: chentong319 <chentong@us.ibm.com>

Merge remote-tracking branch 'upstream/main' into reuse-buffer

4136c6f

lit test

e9941b2

Signed-off-by: chentong319 <chentong@us.ibm.com>

fix test

3600425

Signed-off-by: chentong319 <chentong@us.ibm.com>

format

c59908c

Signed-off-by: chentong319 <chentong@us.ibm.com>

chentong319 requested a review from AlexandreEichenberger September 12, 2024 00:18

AlexandreEichenberger reviewed Sep 12, 2024

response

d954e6a

Signed-off-by: chentong319 <chentong@us.ibm.com>

chentong319 requested a review from AlexandreEichenberger September 13, 2024 15:56

AlexandreEichenberger approved these changes Sep 13, 2024

chentong319 merged commit 97d497f into onnx:main Sep 13, 2024
7 checks passed

chentong319 deleted the reuse-buffer branch September 13, 2024 17:51
