Reuse input buffer in lowering to krnl #2939
Signed-off-by: chentong319 <chentong@us.ibm.com>
src/Compiler/CompilerOptions.cpp
@@ -212,6 +213,16 @@ static llvm::cl::opt<bool, true> disableKrnlOpFusionOpt(
    llvm::cl::location(disableKrnlOpFusion), llvm::cl::init(false),
    llvm::cl::cat(OnnxMlirCommonOptions));

static llvm::cl::opt<bool, true> disableKrnlBufferReuseOpt(
NIT: generally "disable" is for a function that is default on, "enable" is for one that is default off. I think you want "enable" here.
fixed
    ValueRange generatedOperands, MemRefType outputMemRefType, DimsExprRef dims,
    int64_t alignment, int64_t VL) {

  // By default, disableKrnlBufferReuse is true. Simply allocate a memref.
This code could be simplified as follows:
  if (!disableKrnlBufferReuse) {
    int indexToReuse = xxx
    if (indexToReuse != -1) return xxx
  }
  // no reuse, alloc
Fixed. Thanks for the suggestion.
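The early-return shape suggested above can be sketched in plain C++ as follows. This is only an illustration of the control flow, not the code in the PR: `AllocDecision`, `decideAllocation`, and the `findIndexToReuse` callback are hypothetical names standing in for the `xxx` placeholders, and no MLIR types are involved.

```cpp
#include <functional>

// Hypothetical result type: which operand to reuse, or -1 for a fresh buffer.
struct AllocDecision {
  bool reuse;       // true when an input buffer is reused as the output
  int operandIndex; // index of the reused operand, -1 otherwise
};

// disableReuse mirrors the disableKrnlBufferReuse flag from the diff.
// findIndexToReuse is a hypothetical callback standing in for the search
// that the review comment abbreviates as "xxx".
AllocDecision decideAllocation(bool disableReuse,
                               const std::function<int()> &findIndexToReuse) {
  if (!disableReuse) {
    int indexToReuse = findIndexToReuse();
    if (indexToReuse != -1)
      return {true, indexToReuse};
  }
  // No reuse: the caller allocates a new buffer.
  return {false, -1};
}
```

With this structure the search runs only when reuse is enabled, and both the "disabled" and the "no candidate found" cases fall through to the same allocation path.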
// RUN: onnx-mlir-opt --disable-krnl-op-fusion=true --disable-krnl-buffer-reuse=false --shape-inference --convert-onnx-to-krnl --canonicalize %s -split-input-file | FileCheck %s

// -----
func.func @test_reuse(%arg0: tensor<1024xf32>, %arg1: tensor<1024xf32>) -> tensor<1024xf32> {
Nice concise test, we should all aspire to do that!
LGTM, thanks for this exploratory work.
Jenkins Linux amd64 Build #15619 [push] Reuse input buffer in lo... started at 12:52
Jenkins Linux ppc64le Build #14649 [push] Reuse input buffer in lo... started at 14:03
Jenkins Linux s390x Build #15622 [push] Reuse input buffer in lo... started at 13:52
Jenkins Linux amd64 Build #15619 [push] Reuse input buffer in lo... passed after 1 hr 18 min
Jenkins Linux s390x Build #15622 [push] Reuse input buffer in lo... passed after 2 hr 1 min
Jenkins Linux ppc64le Build #14649 [push] Reuse input buffer in lo... passed after 2 hr 19 min
* first step
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* cpu
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* options
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* unify
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* simd
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* comments
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* lit test
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* fix test
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* format
  Signed-off-by: chentong319 <chentong@us.ibm.com>
* response
  Signed-off-by: chentong319 <chentong@us.ibm.com>
---------
Signed-off-by: chentong319 <chentong@us.ibm.com>
Signed-off-by: Sunny-Anand <sunnyanand.979@gmail.com>
Previously, a new memref was always created when ONNX was lowered to Krnl.
However, an input buffer can be reused as the output if the input has only one use and its shape is the same as the output's. This PR tries the idea on element-wise operations.
By default, buffer reuse is turned off.
This feature was tested with Roberta-base-11.onnx. About 80% of mallocs (measured by either count or memory size) can be avoided with buffer reuse. The buffer-reuse code generates correct results but did not bring a notable performance improvement, with SIMD either on or off.
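The reuse condition described above (single use, same shape as the output) can be illustrated with a small standalone C++ sketch. It is not the code from this PR: `OperandInfo` and `findReusableOperand` are hypothetical names, shapes are modeled as plain vectors of static dimensions, and a real lowering would presumably also have to compare element type and layout, which is an assumption not covered here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of an operand; not an MLIR type.
struct OperandInfo {
  std::vector<int64_t> shape; // static dimensions of the operand
  int numUses;                // how many operations consume this value
};

// Returns the index of the first operand whose buffer could double as the
// output buffer, or -1 if no operand qualifies.
int findReusableOperand(const std::vector<OperandInfo> &operands,
                        const std::vector<int64_t> &outputShape) {
  for (std::size_t i = 0; i < operands.size(); ++i) {
    const OperandInfo &operand = operands[i];
    // A single use means no other operation still needs the old contents;
    // an identical shape means the output fits the buffer exactly.
    if (operand.numUses == 1 && operand.shape == outputShape)
      return static_cast<int>(i);
  }
  return -1;
}
```

The single-use requirement matters because writing the result in place destroys the input values; if another operation still read that input, it would see the overwritten data. Element-wise operations are a natural first target since each output element depends only on the input element at the same position.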