[NewIR] Support Ir run program node #56791

Merged on Sep 14, 2023 (41 commits)

Contributor

@2742195759 2742195759 commented Aug 30, 2023

PR types

New features

PR changes

Others

Description

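Support dy2static in new IR API mode.

This PR is only a foundation; many branches are not supported yet:

  1. Parameter is not supported yet, but the current state is enough for experimenting with composite operators.
  2. The input/output pruning logic has not been adapted yet. A follow-up will try passing x, output, and middle uniformly as inputs to RunProgramGrad.
  3. Pass logic, fp16, AMP, and related logic have not been adapted.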

PCard-66972


paddle-bot bot commented Aug 30, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle/fluid/eager/to_static/run_program_op_node.h (outdated, resolved)
paddle/fluid/eager/to_static/run_program_op_node.h (outdated, resolved)

VLOG(4) << "global_inner_scope:" << global_inner_scope;

auto input_values =
Contributor

What are `fx`, `fo`, and `fm`? Please choose more readable names.

Contributor Author

OK, they will be renamed in a follow-up. This is only the initial version.

paddle/fluid/eager/to_static/run_program_op_node.h (outdated, resolved)

auto output_grad_values =
PADDLE_GET_CONST(std::vector<::ir::Value>, attrs.at("bo_g"));
auto forward_input_values =
Contributor

Same as above.

paddle/fluid/framework/framework.proto (outdated, resolved)
@@ -0,0 +1,1150 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
Contributor

Create an `exprimental_ir` directory here to hold all of the new-IR adaptation logic. Files such as partial_program.py can keep the same names as their legacy counterparts, so that when we switch over we can replace them in one step and then simply delete the ir directory.

Contributor Author

The impact is manageable for now. If a large amount of intrusive code shows up later, we will switch to another strategy to make the eventual replacement easier.

python/paddle/jit/dy2static/newir_partial_program.py (outdated, resolved)
python/paddle/jit/dy2static/newir_partial_program.py (outdated, resolved)
@@ -1156,6 +1166,106 @@ def __init__(
self.name_generator = name_generator
self.kwargs = kwargs

@staticmethod
@switch_to_static_graph
def newir_from_func_spec(
Contributor

Could this function definition also live in `exprimental_ir`, with dynamic patching used to switch between the old and new IR?
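The dynamic-patch idea raised in this comment could look roughly like the sketch below. It is not code from this PR: `ProgramBuilder`, `pir_from_func_spec`, and `use_pir_implementation` are hypothetical names chosen only to illustrate swapping one static method for a new-IR variant and restoring the legacy one afterwards.

```python
import contextlib


class ProgramBuilder:
    """Hypothetical stand-in for the class that owns from_func_spec."""

    @staticmethod
    def from_func_spec(func_spec):
        return f"legacy-ir program for {func_spec}"


def pir_from_func_spec(func_spec):
    """Hypothetical new-IR implementation kept in its own module."""
    return f"pir program for {func_spec}"


@contextlib.contextmanager
def use_pir_implementation(enabled=True):
    """Temporarily patch in the new-IR implementation, then restore the old one."""
    if not enabled:
        yield
        return
    # Read the raw staticmethod object so it can be restored unchanged.
    original = ProgramBuilder.__dict__["from_func_spec"]
    ProgramBuilder.from_func_spec = staticmethod(pir_from_func_spec)
    try:
        yield
    finally:
        setattr(ProgramBuilder, "from_func_spec", original)
```

With this shape the legacy class body stays free of new-IR code, and removing the patch later means deleting the separate module rather than editing the class.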

zhangbo9674 previously approved these changes Sep 13, 2023

@zhangbo9674 zhangbo9674 left a comment

LGTM for print

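Contributor

@Aurelius84 Aurelius84 left a comment

LGTM overall. Because downstream work depends heavily on this PR, it can be merged first as a special case, but please submit the follow-up PRs for the Fix TODO optimizations as soon as possible, including:

  1. Avoid introducing too many abbreviated Attributes, and decouple from the run_program_op Maker.
  2. Some function names need improvement. They should follow the "capitalized + CamelCase" rule and use a verb-object structure.
  3. Raise the VLOG levels so the messages are not printed all the time.
  4. `newir` should no longer appear in function code; it has been formally renamed to pir.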
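As a rough illustration of the first point, abbreviated attribute keys could be translated to descriptive names in one place. The sketch below is an assumption-laden example, not the PR's code: only `"fm"` (middle values) and `"bo_g"` (output gradient values) are visible in the quoted diff, the meanings given for `"fx"` and `"fo"` are guesses, and all readable names and the `rename_attrs` helper are hypothetical.

```python
# Mapping from abbreviated keys to descriptive ones.
# "fm" and "bo_g" follow the quoted diff; "fx" and "fo" are assumed meanings.
READABLE_ATTR_NAMES = {
    "fx": "forward_inputs",        # assumption
    "fo": "forward_outputs",       # assumption
    "fm": "forward_middles",
    "bo_g": "backward_output_grads",
}


def rename_attrs(attrs):
    """Return a copy of attrs with abbreviated keys replaced by readable ones.

    Keys without an entry in READABLE_ATTR_NAMES are kept unchanged.
    """
    return {READABLE_ATTR_NAMES.get(key, key): value for key, value in attrs.items()}
```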

@@ -174,3 +175,107 @@ inline void run_program_ad_func(
egr::EagerUtils::SetHistory(&p_autograd_outs, grad_node);
}
}

inline void newir_run_program_ad_func(
Contributor

Suggested change
- inline void newir_run_program_ad_func(
+ inline void pir_run_program_ad_func(

std::vector<paddle::Tensor*>& dout, // NOLINT
const paddle::framework::AttributeMap& attrs) {
// Prepare Autograd Meta
VLOG(2) << "start run newir run_program ad function.";
Contributor

Suggested change
- VLOG(2) << "start run newir run_program ad function.";
+ VLOG(2) << "start run pir run_program ad function.";


// Create Middle Output for GradNode.
auto middle_size =
PADDLE_GET_CONST(std::vector<::pir::Value>, attrs.at("fm")).size();
Contributor

This needs to be decoupled from the run_program_op maker in a follow-up.


if (require_any_grad) {
// Create GradOpNode (1 means [out_grad], 2 means [x_grad, paramx_grad])
grad_node = std::make_shared<NewIRGradNodeRunProgram>(1, 2);
Contributor

Suggested change
- grad_node = std::make_shared<NewIRGradNodeRunProgram>(1, 2);
+ grad_node = std::make_shared<GradNodeRunProgramPir>(1, 2);

Alternatively, a namespace could be used for isolation here, e.g. `pir::GradNodeRunProgram`.

name =
op->attributes().at("name").dyn_cast<pir::StrAttribute>().AsString();
value2name[op->results()[0].Value::impl()] = name;
} else if (op->name() == "builtin.set_parameter") {
Contributor

Suggested change
- } else if (op->name() == "builtin.set_parameter") {
+ } else if (op->isa<pir::SetParameterOp>()) {

return middle_values;
}

void mapping_value(const std::vector<pir::Value> &origin,
Contributor

Same as above.

return pir::OpResult(nullptr);
}

SplitedResult ForwardBackwardSplit(
Contributor

Function names should use a verb-object structure, e.g. `SplitForwardBackward`.
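For readers unfamiliar with what such a split produces, here is a minimal Python sketch under the verb-object name suggested above. It is not the PR's C++ implementation: the `Op` record, the string-named values, and the explicit `split_index` are all assumptions made for illustration. It cuts an op list into forward and backward parts and collects the "middle" values, meaning values produced by the forward part that the backward part reads and that are neither forward inputs nor forward outputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical op record: a name plus the value names it reads and writes."""
    name: str
    inputs: tuple
    outputs: tuple


def split_forward_backward(ops, forward_inputs, forward_outputs, split_index):
    """Split ops at split_index and return (forward_ops, backward_ops, middles)."""
    forward_ops = list(ops[:split_index])
    backward_ops = list(ops[split_index:])

    produced_by_forward = {v for op in forward_ops for v in op.outputs}
    used_by_backward = {v for op in backward_ops for v in op.inputs}

    # Middles: forward-produced values that backward consumes, excluding
    # values already exposed as forward inputs or forward outputs.
    middles = sorted(
        (produced_by_forward & used_by_backward)
        - set(forward_inputs)
        - set(forward_outputs)
    )
    return forward_ops, backward_ops, middles


# Toy program: out = mean(exp(matmul(x, w))), followed by its gradient ops.
example_ops = [
    Op("matmul", ("x", "w"), ("h",)),
    Op("exp", ("h",), ("e",)),
    Op("mean", ("e",), ("out",)),
    Op("mean_grad", ("e", "out_grad"), ("e_grad",)),
    Op("exp_grad", ("e", "e_grad"), ("h_grad",)),
    Op("matmul_grad", ("x", "w", "h_grad"), ("x_grad", "w_grad")),
]
forward_ops, backward_ops, middles = split_forward_backward(
    example_ops, ("x", "w"), ("out",), split_index=3
)
```

In this toy program the only middle value is `e`: it is produced in the forward part, is not a forward output, and is read by both `mean_grad` and `exp_grad`.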

auto *cloned_op = BuildOpFrom(op, forward_value_map);
forward_program->block()->push_back(cloned_op);
});
VLOG(1) << "After Forward Construct.";
Contributor

Suggest VLOG(4).

@@ -170,6 +172,34 @@ def args_to_input_spec(self, args, kwargs):

return args_with_spec, kwargs_with_spec

@switch_to_static_graph
def newir_to_static_inputs_with_spec(self, input_with_spec, main_program):
Contributor

Suggested change
- def newir_to_static_inputs_with_spec(self, input_with_spec, main_program):
+ def pir_to_static_inputs_with_spec(self, input_with_spec, main_program):

def run_function(to_static=True):
import paddle

# 设置随机种子
Contributor

Chinese should not appear in the code.

@phlrain phlrain self-requested a review September 14, 2023 07:25
Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

LGTM

@2742195759 2742195759 merged commit 97442d3 into PaddlePaddle:develop Sep 14, 2023
risemeup1 added a commit that referenced this pull request Sep 14, 2023
risemeup1 added a commit that referenced this pull request Sep 14, 2023
2742195759 added a commit to 2742195759/Paddle that referenced this pull request Sep 15, 2023
* support build model in python

* fix ci bugs

* fix ci bugs

* fix compile bugs

* fix ci bugs

* add infermeta for data

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix bugs when run ir program mutiple times

* perfect code

* frontend demo debugging

* support program split and go into run program node.

* simple run the dy2static test in newir_api mode.

* remove frame.proto changes

* merge

* fix ir-run-program-node

* fix some code

* fix output error

* fix some errors

* fix

* fix

* fix

* fix conflict

* fix files

* fix some errors

* merge and solve conflict

---------

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
2742195759 added a commit that referenced this pull request Sep 20, 2023
* [NewIR] Support Ir run program node (#56791)

* support build model in python

* fix ci bugs

* fix ci bugs

* fix compile bugs

* fix ci bugs

* add infermeta for data

* fix ci bugs

* fix ci bugs

* fix ci bugs

* fix bugs when run ir program mutiple times

* perfect code

* frontend demo debugging

* support program split and go into run program node.

* simple run the dy2static test in newir_api mode.

* remove frame.proto changes

* merge

* fix ir-run-program-node

* fix some code

* fix output error

* fix some errors

* fix

* fix

* fix

* fix conflict

* fix files

* fix some errors

* merge and solve conflict

---------

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

* new pr

* fix

* fix

* fix segment error

* fix

* add dependences

* fix

* fix link error.

* fix some cmake problem

* fix

* fix

* fix dependecy

* fix

* fix

* fix circle dependence

* fix

* fix

* fix rocm

* fix

* add python library

* fix cmake

* merge

* fix

* fix

* fix conflict

---------

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
iosmers pushed a commit to iosmers/Paddle that referenced this pull request Sep 21, 2023
Frida-a pushed a commit to Frida-a/Paddle that referenced this pull request Oct 14, 2023
jiahy0825 pushed a commit to jiahy0825/Paddle that referenced this pull request Oct 16, 2023
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
7 participants