
[PTen] Compatible runtime performance optimization #36946

Merged

Conversation

Contributor @chenwhql commented Nov 2, 2021

PR types

Performance optimization

PR changes

Others

Describe

[PTen] Compatible runtime performance optimization

Currently, PTen's compatible execution mode introduces the construction and destruction of the pten KernelContext and its SmallVectors, as well as of pten::DenseTensor, which degrades dispatch performance.

Test code:

import paddle
import numpy as np
import yep

paddle.set_device("cpu")
x_data = np.random.uniform(0.1, 1, [10]).astype(np.float32)
y_data = np.random.uniform(1, 3, [10]).astype(np.float32)

x = paddle.to_tensor(x_data)
y = paddle.to_tensor(y_data)

yep.start("dot.prof")
for i in range(1000000):
    z = paddle.dot(x, y)
yep.stop()
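(For reference: yep writes a gperftools-compatible CPU profile, so dot.prof can be inspected with google-pprof, e.g. google-pprof --svg `which python` dot.prof, assuming gperftools is installed.)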

The flame graph of the core execution function on current develop is as follows:

  • Run: 24.73%, 3.73s

[flame graph: develop]

This PR therefore attempts to optimize this problem, mainly by caching the KernelContext and DenseTensors, which avoids most of the unnecessary overhead.
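To make the caching strategy concrete, here is a minimal hypothetical C++ sketch; every type and member name below is an illustrative stand-in, not this PR's actual code:

#include <vector>

// Simplified stand-ins for the pten types (illustration only).
struct DenseTensor { /* data pointer, dims, dtype ... */ };

struct KernelContext {
  std::vector<DenseTensor*> inputs;
  std::vector<DenseTensor*> outputs;
  void Clear() {
    inputs.clear();   // clear() keeps capacity, so the next run
    outputs.clear();  // refills without reallocating
  }
};

class CachedOpRunner {
 public:
  void Run(DenseTensor* x, DenseTensor* y, DenseTensor* out) {
    ctx_.Clear();           // reuse the cached context instead of
    ctx_.inputs = {x, y};   // constructing and destroying a fresh
    ctx_.outputs = {out};   // KernelContext on every invocation
    // ... dispatch the pten kernel with ctx_ ...
  }

 private:
  KernelContext ctx_;  // constructed once, reused across calls
};

Because the cached context and its vectors survive across iterations, the per-call constructor/destructor and heap traffic seen in the flame graph above largely disappears.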

However, because the compatible mode involves two kinds of Tensor (fluid Tensor and pten DenseTensor), the copy construction and destruction of the Tensor members and shared_ptrs is still introduced, which is hard to avoid at this stage. The flame graph of the core execution function after this PR's changes is as follows:

  • Run: 18.03%, 2.33s

[flame graph: this PR]

For the tqdm test, the loop in the test code is changed to:

from tqdm import tqdm

for i in tqdm(range(1000000)):
    z = paddle.dot(x, y)
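(tqdm reports the loop's throughput as iterations per second, so the speedup shows up directly in the it/s figures below.)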

Data on current develop:

[tqdm output: develop]

Data for this PR:

[tqdm output: this PR]

Execution performance on this demo improves by about 27%.

paddle-bot-old bot commented Nov 2, 2021

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Shixiaowei02 previously approved these changes Nov 9, 2021

Contributor @JiabinYang left a comment

some comment

}
};

class CompatibleDenseTensorMetaUtils {
Contributor

It seems we don't need it.

Contributor Author

done, thx

void ResetAllocation(std::shared_ptr<paddle::memory::Allocation> allocation,
                     size_t offset) {
  allocation_ = allocation;
  data_ = pten::Allocation(
Contributor

This may cause a value error if we resize the tensor in the kernel. Anyway, we can solve it in the future.

Contributor Author

I know this point; I will change the kernel output sharing rule in the next PR.

Contributor @JiabinYang left a comment

LGTM

Contributor @zhiqiu left a comment

LGTM for operator.h

@chenwhql chenwhql merged commit 76d2fd1 into PaddlePaddle:develop Nov 10, 2021