Export env to python #7792

Merged on Apr 2, 2022 (71 commits)
Changes from 9 commits
Commits
29ccd30
the Env is never destroyed.
lixinqi Mar 13, 2022
8509484
export Env into python
lixinqi Mar 14, 2022
434b7af
more unittests
lixinqi Mar 14, 2022
86296cb
export unittest.TestCase in framework/unittest.py
lixinqi Mar 16, 2022
454f5e7
SwitchToShuttingDownPhase
lixinqi Mar 16, 2022
d1d9ad7
optional is_normal_exit
lixinqi Mar 16, 2022
a58348d
VirtualMachine::CloseVMThreads
lixinqi Mar 17, 2022
8bb83a1
merge master
lixinqi Mar 18, 2022
fe64379
Delete env_api.h
lixinqi Mar 18, 2022
9ca83c6
reshape_only_one_dim_infered
lixinqi Mar 21, 2022
9b5cecc
Merge branch 'master' into export_env_to_python
lixinqi Mar 22, 2022
72426fc
address pr comments
lixinqi Mar 22, 2022
da8b44d
Merge branch 'master' of github.com:Oneflow-Inc/oneflow into export_e…
chengtbf Mar 23, 2022
cf14a1e
rollback flow.env.all_device_placement
chengtbf Mar 23, 2022
e3297f4
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 23, 2022
6f5d7c6
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 23, 2022
5d0c648
Merge branch 'master' into export_env_to_python
lixinqi Mar 24, 2022
97cf982
no distributed running test_shutting_down.py
lixinqi Mar 24, 2022
a119bf0
auto format by CI
oneflow-ci-bot Mar 24, 2022
d2b36a4
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 24, 2022
d8862b9
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 24, 2022
6e22eb4
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 25, 2022
e58bee9
Merge branch 'master' of github.com:Oneflow-Inc/oneflow
lixinqi Mar 25, 2022
df91894
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 25, 2022
93191d7
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 25, 2022
da35647
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 25, 2022
a3d45e3
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 26, 2022
1ec6dc3
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 26, 2022
3d6c7cd
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 26, 2022
f7e2241
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 26, 2022
c1afee6
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 26, 2022
7c4cb89
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 26, 2022
7daab5a
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 27, 2022
07bb2a2
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 27, 2022
47738f7
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 27, 2022
87c748f
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 27, 2022
9d5ab85
Merge branch 'master' of github.com:Oneflow-Inc/oneflow
lixinqi Mar 28, 2022
80a4542
Merge branch 'master' into export_env_to_python
strint Mar 28, 2022
57fff70
Merge branch 'master' into export_env_to_python
lixinqi Mar 28, 2022
ce55514
Merge branch 'export_env_to_python' of github.com:Oneflow-Inc/oneflow…
lixinqi Mar 28, 2022
c679884
expand lifetime of module oneflow in test_shutting_down.py
lixinqi Mar 28, 2022
53d088a
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 28, 2022
c6ecc12
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 28, 2022
204aa54
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 28, 2022
3124ecf
merge master
lixinqi Mar 29, 2022
8c6c03e
Merge branch 'export_env_to_python' of github.com:Oneflow-Inc/oneflow…
lixinqi Mar 29, 2022
72c30fd
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 29, 2022
5d6212f
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 29, 2022
19c18d4
refine del depend on of
strint Mar 29, 2022
002bacf
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 29, 2022
5cfcadd
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 29, 2022
aef2edc
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 29, 2022
b2d87f2
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 30, 2022
f00f991
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 30, 2022
ae9d601
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 30, 2022
89f3ce2
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 30, 2022
ad18575
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 30, 2022
b67a60e
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 30, 2022
fb8b9fa
Merge branch 'master' into export_env_to_python
lixinqi Mar 31, 2022
ec2c402
Merge branch 'export_env_to_python' of github.com:Oneflow-Inc/oneflow…
lixinqi Mar 31, 2022
45fd613
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
b730ef0
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
fbd921c
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
a365516
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
244ee42
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
e99bac0
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
595d13f
Merge branch 'master' into export_env_to_python
mergify[bot] Mar 31, 2022
340f6f9
Merge branch 'master' into export_env_to_python
strint Apr 1, 2022
9a078cd
Merge branch 'master' into export_env_to_python
mergify[bot] Apr 1, 2022
61cc655
Merge branch 'master' into export_env_to_python
mergify[bot] Apr 1, 2022
e07072c
Merge branch 'master' into export_env_to_python
strint Apr 2, 2022
28 changes: 25 additions & 3 deletions oneflow/api/python/env/env.cpp
@@ -16,19 +16,41 @@ limitations under the License.
 #include <pybind11/pybind11.h>
 #include "oneflow/api/python/env/env.h"
 #include "oneflow/api/python/of_api_registry.h"
+#include "oneflow/core/common/global.h"
+#include "oneflow/core/vm/vm_util.h"
+#include "oneflow/core/vm/virtual_machine.h"
+#include "oneflow/core/framework/shut_down_util.h"

 namespace py = pybind11;

 namespace oneflow {

+Maybe<void> SwitchToShuttingDownPhase(EnvGlobalObjectsScope* env, bool is_normal_exit) {
+  if (is_normal_exit) {
+    JUST(vm::ClusterSync());
+    auto* vm = JUST(GlobalMaybe<VirtualMachine>());
+    JUST(vm->CloseVMThreads());
+  }
+  JUST(env->init_is_normal_exit(is_normal_exit));
+  SetShuttingDown(true);
+  return Maybe<void>::Ok();
Comment on lines +30 to +37
Contributor Author

The old logic lived in the Python layer: if the system exited abnormally, DeleteEnv was not executed at all. To match that behavior, the EnvGlobalObjectsScope destructor does not run its series of Global::Delete() calls when !is_normal_exit.
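
The destructor behavior described in the comment above can be sketched in isolation. This is a minimal illustration, not OneFlow's code: EnvScopeSketch, InitIsNormalExit, and DestroyWith are hypothetical names, std::optional stands in for oneflow's Optional, and a log entry stands in for the Global::Delete() sequence.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for EnvGlobalObjectsScope (assumption, not the real class).
class EnvScopeSketch {
 public:
  explicit EnvScopeSketch(std::vector<std::string>* log) : log_(log) {}

  // Mirrors the set-once rule of init_is_normal_exit: a second call is rejected.
  bool InitIsNormalExit(bool is_normal_exit) {
    if (is_normal_exit_.has_value()) { return false; }
    is_normal_exit_ = is_normal_exit;
    return true;
  }

  ~EnvScopeSketch() {
    // Abnormal exit: skip teardown entirely, like the old Python-side logic.
    if (is_normal_exit_.has_value() && !*is_normal_exit_) { return; }
    // Stand-in for the series of Global<T>::Delete() calls.
    log_->push_back("teardown");
  }

 private:
  std::vector<std::string>* log_;
  std::optional<bool> is_normal_exit_;
};

// Builds one scope, optionally sets the flag, destroys the scope,
// and returns what the destructor logged.
std::vector<std::string> DestroyWith(std::optional<bool> is_normal_exit) {
  std::vector<std::string> log;
  {
    EnvScopeSketch env(&log);
    if (is_normal_exit.has_value()) {
      const bool accepted = env.InitIsNormalExit(*is_normal_exit);
      assert(accepted);
      (void)accepted;
    }
  }  // env is destroyed here, before log is returned
  return log;
}
```

In this sketch, DestroyWith(false) leaves the log empty, while DestroyWith(true) and an unset flag both record one teardown entry, which matches the has_value() check in the destructor added by this PR.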

+}
+
 ONEFLOW_API_PYBIND11_MODULE("", m) {
   m.def("CurrentResource", &CurrentResource);
   m.def("EnvResource", &EnvResource);
   m.def("EnableEagerEnvironment", &EnableEagerEnvironment);

-  m.def("IsEnvInited", &IsEnvInited);
-  m.def("InitEnv", &InitEnv);
-  m.def("DestroyEnv", &DestroyEnv, py::call_guard<py::gil_scoped_release>());
+  py::class_<EnvGlobalObjectsScope, std::shared_ptr<EnvGlobalObjectsScope>>(m, "Env")
+      .def(py::init([](const std::string& env_proto_str) {
+        return CreateEnv(env_proto_str).GetPtrOrThrow();
+      }))
+      .def(
+          "SwitchToShuttingDownPhase",
+          [](EnvGlobalObjectsScope* env, bool is_normal_exit) {
+            SwitchToShuttingDownPhase(env, is_normal_exit).GetOrThrow();
Contributor

Here you can now simply write return SwitchToShuttingDownPhase(env, is_normal_exit); (the GetPtrOrThrow() on line 46 cannot be removed yet, because py::init has special requirements).

Contributor Author

OK.

+          },
+          py::call_guard<py::gil_scoped_release>());

   m.def("CurrentMachineId", &CurrentMachineId);

20 changes: 4 additions & 16 deletions oneflow/api/python/env/env.h
@@ -46,25 +46,13 @@ inline Maybe<void> EnableEagerEnvironment(bool enable_eager_execution) {
   return Maybe<void>::Ok();
 }

-inline Maybe<bool> IsEnvInited() { return Global<EnvGlobalObjectsScope>::Get() != nullptr; }
-
-inline Maybe<void> DestroyEnv() {
-  if (Global<EnvGlobalObjectsScope>::Get() == nullptr) { return Maybe<void>::Ok(); }
-  OF_ENV_BARRIER();
-  Global<EnvGlobalObjectsScope>::Delete();
-  return Maybe<void>::Ok();
-}
-
-inline Maybe<void> InitEnv(const std::string& env_proto_str) {
+inline Maybe<EnvGlobalObjectsScope> CreateEnv(const std::string& env_proto_str) {
   EnvProto env_proto;
   CHECK_OR_RETURN(TxtString2PbMessage(env_proto_str, &env_proto))
       << "failed to parse env_proto" << env_proto_str;
-  CHECK_ISNULL_OR_RETURN(Global<EnvGlobalObjectsScope>::Get());
-  // Global<T>::New is not allowed to be called here
-  // because glog is not constructed yet and has bad behavior.
-  Global<EnvGlobalObjectsScope>::SetAllocated(new EnvGlobalObjectsScope());
-  JUST(Global<EnvGlobalObjectsScope>::Get()->Init(env_proto));
-  return Maybe<void>::Ok();
+  auto env = std::make_shared<EnvGlobalObjectsScope>();
+  JUST(env->Init(env_proto));
+  return env;
 }

 inline Maybe<long long> CurrentMachineId() { return GlobalProcessCtx::Rank(); }
6 changes: 0 additions & 6 deletions oneflow/api/python/init.cpp
@@ -51,12 +51,6 @@ bool Int2IntListMapContaining(const Int2IntListMap& bigger, const Int2IntListMap
 } // namespace

 PYBIND11_MODULE(_oneflow_internal, m) {
-  m.def("MasterSendAbort", []() {
-    if (Global<EnvGlobalObjectsScope>::Get() != nullptr) {
-      return ClusterInstruction::MasterSendAbort();
-    }
-  });
-
   using IntList = std::vector<int64_t>;
   using Int2IntListMap = std::unordered_map<int64_t, std::shared_ptr<IntList>>;

1 change: 0 additions & 1 deletion oneflow/api/python/session/session.h
@@ -119,7 +119,6 @@ inline Maybe<void> CreateMultiClientSessionContext() {

 inline Maybe<void> InitMultiClientSessionContext(const std::string& config_proto_str) {
   CHECK_NOTNULL_OR_RETURN(Global<MultiClientSessionContext>::Get());
-  CHECK_NOTNULL_OR_RETURN(Global<EnvGlobalObjectsScope>::Get());
   CHECK_NOTNULL_OR_RETURN(Global<EnvDesc>::Get()) << "env not found";

   ConfigProto config_proto;
4 changes: 2 additions & 2 deletions oneflow/core/framework/random_generator_impl.cpp
@@ -20,7 +20,7 @@ limitations under the License.
 #include "oneflow/core/framework/instructions_builder.h"
 #include "oneflow/core/framework/tensor_util.h"
 #include "oneflow/core/functional/functional.h"
-#include "oneflow/core/job/env_global_objects_scope.h"
+#include "oneflow/core/vm/virtual_machine.h"
 #include "oneflow/core/register/ofblob.h"
 #include "oneflow/core/vm/vm_util.h"
 #ifdef WITH_CUDA
@@ -35,7 +35,7 @@ namespace one {
 namespace {

 Maybe<void> CPUSynchronize() {
-  if (Global<EnvGlobalObjectsScope>::Get() != nullptr) { return vm::CurrentRankSync(); }
+  if (Global<VirtualMachine>::Get() != nullptr) { return vm::CurrentRankSync(); }
   return Maybe<void>::Ok();
 }

108 changes: 0 additions & 108 deletions oneflow/core/intrusive/channel.h

This file was deleted.

116 changes: 0 additions & 116 deletions oneflow/core/intrusive/channel_test.cpp

This file was deleted.

1 change: 1 addition & 0 deletions oneflow/core/job/env_global_objects_scope.cpp
@@ -229,6 +229,7 @@ Maybe<void> EnvGlobalObjectsScope::Init(const EnvProto& env_proto) {
 }

 EnvGlobalObjectsScope::~EnvGlobalObjectsScope() {
+  if (is_normal_exit_.has_value() && !CHECK_JUST(is_normal_exit_)) { return; }
Contributor Author

Perhaps this should be named is_abnormal_exit.

   auto session_ctx = Global<MultiClientSessionContext>::Get();
   if (session_ctx != nullptr) {
     VLOG(1) << "Multi client session has not closed , env close it at env scope destruction.";

10 changes: 10 additions & 0 deletions oneflow/core/job/env_global_objects_scope.h
@@ -17,6 +17,7 @@ limitations under the License.
 #define ONEFLOW_CORE_JOB_CLUSTER_OBJECTS_SCOPE_H_
 #include "oneflow/core/common/util.h"
 #include "oneflow/core/common/maybe.h"
+#include "oneflow/core/common/optional.h"
 #include "oneflow/core/job/env_desc.h"
 #include "oneflow/core/framework/device.h"

@@ -31,6 +32,15 @@ class EnvGlobalObjectsScope final {
   ~EnvGlobalObjectsScope();

   Maybe<void> Init(const EnvProto& env_proto);
+
+  Maybe<void> init_is_normal_exit(bool is_normal_exit) {
+    CHECK_OR_RETURN(!is_normal_exit_.has_value());
+    is_normal_exit_ = is_normal_exit;
+    return Maybe<void>::Ok();
+  }
+
+ private:
+  Optional<bool> is_normal_exit_;
 };

 } // namespace oneflow