Reuse input buffer in lowering to krnl #2939

Merged: 11 commits merged on Sep 13, 2024

chentong319
Collaborator

Previously, a new memref was always created when ONNX is lowered to Krnl.
However, an input buffer can be reused as the output if the input has only one use and its shape is the same as the output's. This PR tries that idea on element-wise operations.
By default, buffer reuse is turned off.
This feature was tested with Roberta-base-11.onnx. About 80% of mallocs (whether counted by number or by memory size) can be avoided with buffer reuse. The buffer reuse code generates correct results, but it did not bring a notable performance improvement with SIMD either on or off.
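
The reuse condition described above can be sketched in plain C++, independent of MLIR. This is only an illustration under stated assumptions, not the code in this PR: `ToyValue`, its `useCount` field, and `findReusableInput` are hypothetical stand-ins for an MLIR value, its number of uses, and the selection logic.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an MLIR value: a static shape plus the number
// of operations that consume the value.
struct ToyValue {
  std::vector<int64_t> shape;
  int useCount;
};

// Return the index of the first input whose buffer could hold the output,
// or -1 if none qualifies. An input qualifies when the current op is its
// only user and its shape matches the output shape exactly.
int findReusableInput(const std::vector<ToyValue> &inputs,
    const std::vector<int64_t> &outputShape) {
  for (std::size_t i = 0; i < inputs.size(); ++i) {
    if (inputs[i].useCount == 1 && inputs[i].shape == outputShape)
      return static_cast<int>(i);
  }
  return -1;
}
```

In this sketch, an input that is still read by another operation is never selected, because overwriting it would change what that other operation sees.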

Signed-off-by: chentong319 <chentong@us.ibm.com>
@@ -212,6 +213,16 @@ static llvm::cl::opt<bool, true> disableKrnlOpFusionOpt(
llvm::cl::location(disableKrnlOpFusion), llvm::cl::init(false),
llvm::cl::cat(OnnxMlirCommonOptions));

static llvm::cl::opt<bool, true> disableKrnlBufferReuseOpt(
Collaborator

NIT: generally "disable" is for a function that is default on, "enable" is for one that is default off. I think you want "enable" here.

Collaborator Author

fixed

ValueRange generatedOperands, MemRefType outputMemRefType, DimsExprRef dims,
int64_t alignment, int64_t VL) {

// By default, disableKrnlBufferReuse is true. Simply allocate a memref.
Collaborator

This code could be simplified as follows

if (!disableKrnlBufferReuse) {
  int indexToReuse = xxx
  if (indexToReuse != -1) return xxx
}
// no reuse, alloc

Collaborator Author

Fixed. Thanks for the suggestion.
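
For readers following the thread, the control flow suggested in the review can be sketched as a self-contained toy in C++. This is an assumption-laden illustration, not the merged code: `ToyBuffer`, `whichBufferToReuse`, `allocOrReuse`, and the `enableReuse` parameter are hypothetical names, and buffers are matched by element count rather than by full shape to keep the sketch short.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memref: shared ownership of the storage
// plus the number of operations that read it.
struct ToyBuffer {
  std::shared_ptr<std::vector<float>> data;
  int useCount;
};

// Hypothetical helper: index of an input that can be overwritten, or -1.
int whichBufferToReuse(
    const std::vector<ToyBuffer> &inputs, std::size_t outputSize) {
  for (std::size_t i = 0; i < inputs.size(); ++i)
    if (inputs[i].useCount == 1 && inputs[i].data->size() == outputSize)
      return static_cast<int>(i);
  return -1;
}

// The reuse path is attempted only when the option is on; every other case
// falls through to the single allocation at the end.
std::shared_ptr<std::vector<float>> allocOrReuse(
    const std::vector<ToyBuffer> &inputs, std::size_t outputSize,
    bool enableReuse) {
  if (enableReuse) {
    int indexToReuse = whichBufferToReuse(inputs, outputSize);
    if (indexToReuse != -1)
      return inputs[indexToReuse].data;
  }
  // No reuse: allocate a fresh buffer.
  return std::make_shared<std::vector<float>>(outputSize);
}
```

Keeping the allocation as the single fall-through path means the default (option off) and the "no candidate found" cases share one code path.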

// RUN: onnx-mlir-opt --disable-krnl-op-fusion=true --disable-krnl-buffer-reuse=false --shape-inference --convert-onnx-to-krnl --canonicalize %s -split-input-file | FileCheck %s

// -----
func.func @test_reuse(%arg0: tensor<1024xf32>, %arg1: tensor<1024xf32>) -> tensor<1024xf32> {
Collaborator

Nice concise test, we should all aspire to do that!

Signed-off-by: chentong319 <chentong@us.ibm.com>
Collaborator

@AlexandreEichenberger AlexandreEichenberger left a comment

LGTM, thanks for this exploratory work.

@chentong319 chentong319 merged commit 97d497f into onnx:main Sep 13, 2024
7 checks passed
@chentong319 chentong319 deleted the reuse-buffer branch September 13, 2024 17:51
@jenkins-droid (Collaborator): Jenkins Linux amd64 Build #15619 [push] Reuse input buffer in lo... started at 12:52

@jenkins-droid (Collaborator): Jenkins Linux ppc64le Build #14649 [push] Reuse input buffer in lo... started at 14:03

@jenkins-droid (Collaborator): Jenkins Linux s390x Build #15622 [push] Reuse input buffer in lo... started at 13:52

@jenkins-droid (Collaborator): Jenkins Linux amd64 Build #15619 [push] Reuse input buffer in lo... passed after 1 hr 18 min

@jenkins-droid (Collaborator): Jenkins Linux s390x Build #15622 [push] Reuse input buffer in lo... passed after 2 hr 1 min

@jenkins-droid (Collaborator): Jenkins Linux ppc64le Build #14649 [push] Reuse input buffer in lo... passed after 2 hr 19 min

Sunny-Anand pushed a commit to Sunny-Anand/onnx-mlir that referenced this pull request Sep 17, 2024
* first step

Signed-off-by: chentong319 <chentong@us.ibm.com>

* cpu

Signed-off-by: chentong319 <chentong@us.ibm.com>

* options

Signed-off-by: chentong319 <chentong@us.ibm.com>

* unify

Signed-off-by: chentong319 <chentong@us.ibm.com>

* simd

Signed-off-by: chentong319 <chentong@us.ibm.com>

* comments

Signed-off-by: chentong319 <chentong@us.ibm.com>

* lit test

Signed-off-by: chentong319 <chentong@us.ibm.com>

* fix test

Signed-off-by: chentong319 <chentong@us.ibm.com>

* format

Signed-off-by: chentong319 <chentong@us.ibm.com>

* response

Signed-off-by: chentong319 <chentong@us.ibm.com>

---------

Signed-off-by: chentong319 <chentong@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>
Sunny-Anand added a commit that referenced this pull request Sep 17, 2024
* Change lowering of onnx.IF to Krnl (#2932)

* implementation

Signed-off-by: chentong319 <chentong@us.ibm.com>

* test case change

Signed-off-by: chentong319 <chentong@us.ibm.com>

* format

Signed-off-by: chentong319 <chentong@us.ibm.com>

* add test for If back

Signed-off-by: chentong319 <chentong@us.ibm.com>

* format

Signed-off-by: chentong319 <chentong@us.ibm.com>

---------

Signed-off-by: chentong319 <chentong@us.ibm.com>
Co-authored-by: Tung D. Le <tung@jp.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* Update c style cast to c++ style cast (#2934)

Signed-off-by: Mike Essenmacher <essen@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* Change c style cast to c++ style cast (#2936)

Signed-off-by: Mike Essenmacher <essen@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* Add coding practices for onnx-mlir (#2935)

Signed-off-by: Mike Essenmacher <essen@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* try to use new buffer deallocation (#2919)

* implementation

Signed-off-by: Chen Tong <chentong@us.ibm.com>

* comments

Signed-off-by: Chen Tong <chentong@us.ibm.com>

* format

Signed-off-by: Chen Tong <chentong@us.ibm.com>

---------

Signed-off-by: Chen Tong <chentong@us.ibm.com>
Co-authored-by: Tung D. Le <tung@jp.ibm.com>
Co-authored-by: Alexandre Eichenberger <alexe@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* fix requirements.txt link

Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* Reuse input buffer in lowering to krnl (#2939)

* Fix GroupNorm to support Opset21 (#2928)

* Group norm for opset 21

* Testing phase

* Fix GroupNorm to support Opset21

---------

Signed-off-by: hamptonm1 <79232909+hamptonm1@users.noreply.github.com>
Co-authored-by: Megan Hampton <hamptonm@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* Update Ops documentation for ONNX 1.16.2 (#2942)

* Update Ops documentation for ONNX 1.16.2

* Fix format

---------

Co-authored-by: Megan Hampton <hamptonm@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* LLVM/StableHLO Upgrade eaa95a1 (#2943)

Co-authored-by: Megan Hampton <hamptonm@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* added support for no-zero-point quantization (#2938)

Signed-off-by: Alexandre Eichenberger <alexe@us.ibm.com>
Co-authored-by: Tung D. Le <tung@jp.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

* update with main

Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>

---------

Signed-off-by: chentong319 <chentong@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>
Signed-off-by: Mike Essenmacher <essen@us.ibm.com>
Signed-off-by: Chen Tong <chentong@us.ibm.com>
Signed-off-by: hamptonm1 <79232909+hamptonm1@users.noreply.github.com>
Signed-off-by: Alexandre Eichenberger <alexe@us.ibm.com>
Signed-off-by: Sunny Anand <164108690+Sunny-Anand@users.noreply.github.com>
Co-authored-by: Tong Chen <chentong@us.ibm.com>
Co-authored-by: Tung D. Le <tung@jp.ibm.com>
Co-authored-by: Mike Essenmacher <112431871+mikeessen@users.noreply.github.com>
Co-authored-by: Alexandre Eichenberger <alexe@us.ibm.com>
Co-authored-by: hamptonm1 <79232909+hamptonm1@users.noreply.github.com>
Co-authored-by: Megan Hampton <hamptonm@us.ibm.com>
3 participants