Conversation
python/mxnet/symbol.py
Outdated
if group2ctx is not None:
    if attr_dict is None:
        attr_dict = self.attr_dict()
    arg_ctx = [group2ctx.get(attr_dict[name]['__ctx_group__'], ctx)
Can this be moved to backend?
Sure.
python/mxnet/symbol.py
Outdated
    num_shared_exec_aux_states = 0
    shared_exec_aux_state_handles = ctypes.POINTER(NDArrayHandle)()
else:
    shared_exec_in_arg_handles = [nd.handle for nd in shared_exec.arg_arrays]
These data structures are available in backend I think?
I think you are right. GraphExecutor::data_entry_ has all the NDArrays of the full graph. In order to make use of that, I want to confirm that shared_exec's indexed graph is the same as the current executor's indexed graph, right?
src/executor/graph_executor.cc
Outdated
  }
}
g = nnvm::pass::InferShape(g, arg_shapes, "__shape__");
g = nnvm::pass::InferType(g, arg_dtypes, "__dtype__");
I'm new to MXNet; why can't we use arg_shape_map and arg_dtype_map directly?
I don't quite understand the question, but the shape attr in the graph is designed as a vector; its indices correspond to those of the graph's other data structures.
Sorry, I thought the shape attr is a map.
src/c_api/c_api_executor.cc
Outdated
NDArray** shared_exec_in_arg_ptrs =
    reinterpret_cast<NDArray**>(shared_exec_in_arg_handles);
for (mx_uint i = 0; i < num_shared_exec_in_args; ++i) {
  shared_exec_in_args.push_back(*shared_exec_in_arg_ptrs[i]);
Use the constructor instead of push_back.
src/executor/graph_executor.cc
Outdated
if (mutable_nodes.count(nid)) {  // aux_states
  if (has_shared_exec) {
    const NDArray& aux_nd = shared_exec_aux_states[aux_top];
    CHECK_EQ(inferred_shape, aux_nd.shape()) << "Inferred shape does not match shared_exec.aux_array's shape";
improve error msg
Force-pushed from 8bf8f28 to 66db99a
const std::vector<Context>& aux_state_ctxes,
const std::unordered_map<std::string, TShape>& arg_shape_map,
const std::unordered_map<std::string, int>& arg_dtype_map,
const std::vector<OpReqType>& grad_req_types,
We'll need a map for storage_type, too...
Force-pushed from b4e232a to 35bf4d0
python/mxnet/symbol.py
Outdated
"""Binds current symbol to get an executor, allocate all the arguments needed.
def simple_bind(self, ctx, grad_req='write', type_dict=None, group2ctx=None,
                param_names=None, shared_exec=None, shared_data_arrays=None, **kwargs):
    """This function is DEPRECATED.
This one is also deprecated??
We'll also need a storage_type_dict for the sparse feature. Maybe we can add that in our branch.
Oops, I may have accidentally copied the description of simple_bind_v1 and forgotten to delete the deprecation statement.
src/c_api/c_api_executor.cc
Outdated
arg_shape_map[provided_arg_shape_names[i]] =
    TShape(provided_arg_shape_data+provided_arg_shape_idx[i],
           provided_arg_shape_data+provided_arg_shape_idx[i+1]);
}
Map also has emplace() to avoid the copy assignment.
Good point.
src/c_api/c_api_executor.cc
Outdated
|
// create para name set for sharing data array memory
std::unordered_set<std::string> param_name_set;
for (mx_uint i = 0; i < num_param_names; ++i) {
use std::unordered_set<std::string> param_name_set(num_param_names)
to avoid resizing.
Done.
Could you resolve the conflicts and fix lint?
Force-pushed from 2867f6d to e78166d
@eric-haibin-lin Fixed.
Generally looks good. Please remove *_v1 in the front end and add more tests.
    provided_grad_req_types = [c_str(grad_req)]
elif isinstance(grad_req, list):
    if len(grad_req) == 0:
        raise RuntimeError('grad_req in simple_bind cannot be an empty list')
Remove this and check if len(grad_req) == num_inputs in backend.
Force-pushed from fe0baa5 to b9854e4
Is this good to merge?
@eric-haibin-lin This still needs tests for the executor group. After discussing with @piiswrong, we can do it this way: submit this PR to your repo so you can use it for sparse tensor development while I finish writing tests for the executor group. Let me know what you think.
@reminisce Are you going to submit another PR to dmlc/mxnet after your tests are done? What's the implication when the sparse branch merges with dmlc/mxnet?
@eric-haibin-lin If I submit this PR to your repo, I will submit the test PR to your repo too. You can merge with dmlc/master after your sparse tensor work is done with my changes on symbol binding. Does this work for you?
Yeah, that sounds good!
I submitted this PR to eric-haibin-lin#31 for sparse tensor development. Please do not merge this PR into dmlc/master.
Force-pushed from 8682567 to a07229b
Force-pushed from eef9f9e to 24765ec
Force-pushed from f0677b2 to 8f8889f
if (nd.is_none()) return default_ctx;
  return nd.ctx();
};
std::vector<Context> in_arg_ctxes(in_args.size());
Is this correct? I think the vector after transform will have 2*in_args.size() length.
Shouldn't you use reserve?
According to the documentation, std::transform does the following, so the output container's space should be resized in advance rather than merely reserved. If only reserved, the code does not crash, but the output vector's effective size does not change. Is my understanding correct?
template<class InputIt, class OutputIt, class UnaryOperation>
OutputIt transform(InputIt first1, InputIt last1, OutputIt d_first,
UnaryOperation unary_op)
{
while (first1 != last1) {
*d_first++ = unary_op(*first1++);
}
return d_first;
}
const std::string& arg_name = idx[nid].source->attrs.name;
if (mutable_nodes.count(nid)) {  // aux_states
  if (nullptr != shared_exec) {
    const NDArray& aux_nd = shared_exec->aux_state_map().at(arg_name);
what if arg_name doesn't exist in shared_exec->aux_state_map()? how was it handled before?
This situation should not happen. If it did, the map::at() function would throw an exception, which would be handled by the c_api function.
Before this, it was done in Python, where arg_name is expected to exist in shared_exec.arg_dict; if not, the dict would throw an exception.
https://github.com/dmlc/mxnet/blob/35d5e54c43aa6bdb189839d4b1a1978d85263fc7/python/mxnet/module/executor_group.py#L625
} else {  // in_args
  if (shared_arg_names.count(arg_name)) {  // model parameter
    if (nullptr != shared_exec) {
      const NDArray& in_arg_nd = shared_exec->in_arg_map().at(arg_name);
same
Same as above.
I'll merge this for now. We may need to improve some error messages if users get confused by them.
Yes, a better error message is needed. I am aware that it segfaults when shape inference fails in graph_executor.
Why would it segfault? @reminisce Could you fix that first?
The functions InferShape and InferType can fail if users provide insufficient information. If they fail, the program continues with 0-size ndarrays, which may cause segfaults. I will add checks for the shape and type inference results.
Force-pushed from f542fe0 to 66be337
* Add init functions for simple bind in graph_executor
* Add simple_bind c_api
* Add simple bind c-api
* Assign zeros to in_args, arg_grads, and aux_states
* Add simple_bind2 python interface
* Fix python interface bugs
* Interface changes
* Fix
* Fix core dump
* Add bind_ith_exec c_api
* Change simple_bind2
* Fix seg fault
* Finish simple_bind
* Change _bind_ith_exec
* Refactor simple_bind initialization flow for bind
* Consolidate bind and simple_bind graph init flow
* Fix bug
* Clean up
* Add comments
* Clean up
* Clean up
* Minor correction
* Rename APIs in graph executor
* Refactor
* Rebase
* Delete deprecated functions
* Move more front-end work to backend
* Bug fix
* Fix failed tests
* Minor fix
* Fix lint
* Fix lint
* Revert unnecessary changes
* Revert
* Revert
* Clean up
* Fix lint
* Fix bind_ith_exec calling simple_bind
* Fix bugs for _bind_ith_exec
* Add unit test
* Fix
* Small fix
* Add bucketing test
* Skip pylint
* Use cpu to train
Force-pushed from 66be337 to d3c1d5f
* Initial checkin: Add init functions for simple bind in graph_executor Add simple_bind c_api Add simple bind c-api Assign zeros to in_args, arg_grads, and aux_states Add simple_bind2 python interface Fix python interface bugs Interface changes Fix Fix core dump Add bind_ith_exec c_api Change simple_bind2 Fix seg fault Finish simple_bind Change _bind_ith_exec Refactor simple_bind initialization flow for bind Consolidate bind and simple_bind graph init flow Fix bug Clean up Add comments Clean up Clean up Minor correction Rename APIs in graph executor Refactor Rebase Delete deprecated functions Move more front-end work to backend Bug fix Fix failed tests Minor fix Fix lint Fix lint Revert unnecessary changes Revert Revert Clean up Fix lint Fix bind_ith_exec calling simple_bind Fix bugs for _bind_ith_exec
* Add unit test (apache#1)
* Add unit test
* Fix
* Small fix
* Fix lint
* Fix lint
* Fix bugs of missing ndarrays in shared_buffer
* Fix lint
* Simple bind (apache#3)
* Add bucketing test
* Skip pylint
* Use cpu to train
* Fix bug
* Remove merge message
* Fix lint
* Add logging to test_bucketing.py
* Reduce model size (apache#4)
* Add checks for shape/type inferences
* Add printing error messages for shape/type inference failure
This PR
Benchmark environment: p2.xlarge
Step 1. Create an executor group as shared_group.
Step 2. Call the DataExecutorGroup constructor 500 times, passing the executor group created in Step 1 as shared_group.
Timed Step 2 results:
New simple_bind: 18.3 ms per executor group creation
Old simple_bind: 19.4 ms per executor group creation
Benchmark script