
[Eager Grad] Support eager grad interface #40170

Merged: 47 commits merged into PaddlePaddle:develop on Mar 17, 2022

Conversation

@veyron95 (Contributor) commented Mar 4, 2022

PR types

New features

PR changes

APIs

Describe

  1. Refactor the RunBackward(...) interface to support both backward(...) and grad(...) modes
  2. Add the C++ unit test file grad_test.cc to verify that the basic functionality of the grad(...) interface works on the C++ side
  3. Expose the run_partial_grad(...) interface in eager_functions.cc for Python users
  4. Support the fluid.dygraph.grad(...) interface in Eager mode (by calling the run_partial_grad(...) interface)
  5. Add Python unit test cases to verify that the execution path from the Python user interface to the C++ side works
  6. Add ClearTensorWrappers and IsClearTensorWrappers virtual functions to GradNodeBase (see the sketch after this list)
  7. Support the no_grad_vars and retain_graph arguments

Note: double grad is not supported yet; create_graph is under development.
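To make item 6 concrete, here is a minimal sketch of the virtual-function pattern it describes. This is illustrative only, not Paddle's actual code: GradNodeBase, ClearTensorWrappers, and IsClearTensorWrappers are named in the PR description, while Tensor, ExampleGradNode, and saved_inputs_ are hypothetical stand-ins.

// Minimal sketch, not Paddle's implementation: the base grad node declares
// the hooks, and each concrete node overrides them to release the tensors
// it captured for the backward pass (what retain_graph=False relies on).
#include <vector>

struct Tensor {};  // hypothetical stand-in for the real tensor type

class GradNodeBase {
 public:
  virtual ~GradNodeBase() = default;
  virtual void ClearTensorWrappers() {}
  virtual bool IsClearTensorWrappers() { return true; }
};

class ExampleGradNode : public GradNodeBase {  // hypothetical subclass
 public:
  void ClearTensorWrappers() override {
    saved_inputs_.clear();  // drop tensors saved during the forward pass
    cleared_ = true;
  }
  bool IsClearTensorWrappers() override { return cleared_; }

 private:
  std::vector<Tensor> saved_inputs_;  // tensors needed by backward
  bool cleared_ = false;
};

int main() {
  ExampleGradNode node;
  node.ClearTensorWrappers();  // triggered after backward when retain_graph=False
  return node.IsClearTensorWrappers() ? 0 : 1;
}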

@veyron95 changed the title from "[Eager Grad] Support eager grad interface, draft version" to "[WIP, Eager Grad] Support eager grad interface" on Mar 7, 2022
@veyron95 changed the title from "[WIP, Eager Grad] Support eager grad interface" to "[Eager Grad] Support eager grad interface" on Mar 11, 2022
@JiabinYang (Contributor) left a comment:

Some comments; also make sure your deps build logic got approved by Jim.

Resolved review threads (outdated) on: paddle/fluid/eager/accumulation/accumulation_node.cc, paddle/fluid/eager/auto_code_generator/eager_generator.cc (five threads), paddle/fluid/eager/backward.cc.
@@ -2176,10 +2193,11 @@ static std::string GenerateGradNodeHeaderContents(
       TENSOR_WRAPPER_MEMBER_TEMPLATE, struct_tensor_wrapper_name);

   const char* SET_TENSOR_WRAPPER_BODY_TEMPLATE =
-      "%s = egr::TensorWrapper(%s, %s /*full_reserved*/);";
+      "%s = egr::TensorWrapper(%s, %s /*full_reserved*/);\n"
+      " TensorWrappersSet.emplace_back(%s);";
Reviewer comment (Contributor):
  1. emplace_back() invokes the copy constructor for an lvalue and the move constructor for an rvalue. So we usually use "emplace_back" together with "std::move"; otherwise nothing actually gets moved (as in this case).

  2. Say you used "emplace_back(std::move(%s))" here; then you're in big trouble. After "std::move(x)", the variable "x" is left as a moved-from object that no longer owns its underlying memory. So if anyone uses "x" after your std::move, for instance "SetAttribute(x)", it's going to cause trouble.

  3. The idea could work with "std::vector<TensorWrapper*>". However, I strongly recommend the other approach (see comments above). A minimal sketch of points 1 and 2 follows below.
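A minimal standalone sketch of points 1 and 2, assuming nothing about Paddle's types (std::string stands in for the wrapped object; SetAttribute is only referenced in a comment):

// Illustrative only, not Paddle code: emplace_back copies an lvalue unless
// std::move is used, and a moved-from object must not be relied on afterwards.
#include <string>
#include <utility>
#include <vector>

int main() {
  std::vector<std::string> wrappers;

  std::string w = "some_tensor_wrapper";
  wrappers.emplace_back(w);             // lvalue: copy constructor; 'w' keeps its contents
  wrappers.emplace_back(std::move(w));  // rvalue: move constructor; 'w' is now moved-from

  // Point 2: after std::move(w), 'w' is in a valid but unspecified state.
  // Relying on its old contents here (e.g. a later SetAttribute(w)-style
  // call) would be a bug.
  return 0;
}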

@veyron95 (Author) replied:

Thank you very much; these comments helped me a lot. I no longer use a vector to hold all the tensor wrappers.

paddle/fluid/eager/backward.cc:
// all the next_nodes of the current node will be inserted into
// potential_stop_nodes
if (is_potential_stop_nodes) {
  potential_stop_nodes->emplace(next_node);
}
Reviewer comment (Contributor):

Unrecommended use of "emplace"; refer to the other comment on emplace_back and std::move.

@veyron95 (Author) replied:

Thanks, but std::unordered_set<GradNodeBase*>* potential_stop_nodes cannot call emplace_back.
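For context, a minimal standalone sketch (illustrative only, not Paddle code; the local names are hypothetical stand-ins) of why the move concern does not apply to this line:

// std::unordered_set has no emplace_back, and emplacing a raw pointer only
// copies the pointer value, which is trivially cheap, so std::move would
// gain nothing for a trivially copyable type like GradNodeBase*.
#include <unordered_set>

struct GradNodeBase {};  // stand-in for the real class

int main() {
  std::unordered_set<GradNodeBase*> stop_nodes;  // mirrors potential_stop_nodes
  GradNodeBase node;
  GradNodeBase* next_node = &node;

  // Copies the pointer (not the pointed-to node) into the set.
  stop_nodes.emplace(next_node);
  return 0;
}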

Additional review threads on paddle/fluid/eager/backward.cc were resolved.
jim19930609 previously approved these changes Mar 15, 2022
@chenwhql (Contributor) left a comment:

LGTM for PADDLE_ENFORCE

@XiaoguangHu01 (Contributor) left a comment:

LGTM

@veyron95 merged commit 4db8cf2 into PaddlePaddle:develop Mar 17, 2022
veyron95 added a commit that referenced this pull request Mar 17, 2022
liqitong-a pushed a commit to liqitong-a/Paddle that referenced this pull request Mar 17, 2022
* [Eager] Support eager grad interface, draft version

* Support eager grad interface with allow_unused and multi startup_op

* Fix code format

* Fix allow_unused case, return PyNone if tensor is not initialized

* Support output's stop_gradient related to create_graph

* Support grad exception case in eager mode, fix coverage CI

* Update ToPyObject, return PyNone if not initialized

* AccumulationNode add FLAGS_retain_grad_for_all_tensor

* Fix ci issue

* Fix CI issue

* fix, use core.eager.Tensor

* Add func SetBufferSlotRankZeros for GradTensorHolder

* Support retain_graph by using ClearTensorWrappers

* Support retain_graph by using ClearTensorWrappers

* Update retain_graph and no_grad_vars related test case

* Update code gen logic for ClearTensorWrappers

* Fix by override statement

* fix override func args

* Support retain_graph, update unit tests

* Updated ClearTensorWrappers logic

* fix grad python interface

* Use deep copy and update unit tests

* Polish code

* Polish code

* Fix CI issue, Deep copy only use when user set grad_tensors

* Fix CI, use Backward instead of RunBackward

* Fix CI, Declare kernel explicitly in test file

* Polish, remove vector of TensorWrapper

* Refactor the logic of grad/backward, polish codes

* Update code after merge upstream develop

* Polish after merge upstream develop

* Update to adapt new GradNodeBase superclass

* Fix error introduced during conflict resolution

* Update purify potential_startup_nodes logic

* Fix errors

* Polish code

* Remove useless args for ToPyObject

* Remove useless TensorWrappersSet

* Fix code-format, re-install pre-commit

* Fix pre-process logic for potential_startup_ops

* Update unit tests, use eager mode