[Runtime] add set_output_zero_copy #8497
Merged
Changes from 65 commits
72 commits
629ec50
optimize resize vertor
sunjiweiswift 5005373
tmp
sunjiweiswift a1b749f
DoMultiLevelTiling
sunjiweiswift f1fc313
modify size_t to int
sunjiweiswift 65a7a00
modify
sunjiweiswift 2368df9
modify level fill
sunjiweiswift e8ba850
Update utils.cc
sunjiweiswift a832739
format lower count
sunjiweiswift 258f382
Merge branch 'main' of https://github.com/sunjiweiswift/tvm
sunjiweiswift 2de6c99
delete blank lines
sunjiweiswift cb99388
delete blank lines
sunjiweiswift ece0e1d
Merge branch 'main' of https://github.com/sunjiweiswift/tvm
sunjiweiswift 9da6fa3
re-commit message
sunjiweiswift 718e58b
Merge pull request #1 from apache/main
sunjiweiswift 7377e43
Update graph_executor.h
sunjiweiswift 8853436
Merge pull request #2 from apache/main
sunjiweiswift 4a007ab
add setoutputzero
sunjiweiswift 8ca606f
add set output zero
sunjiweiswift 6afb609
Update graph_executor.cc
sunjiweiswift d71dece
Update graph_executor.h
sunjiweiswift 145219c
delete const_cast
sunjiweiswift e45c77b
add common function chechDltensor
sunjiweiswift b7a27c5
Update graph_executor.h
sunjiweiswift bf6ed08
Update graph_executor.cc
sunjiweiswift 80fc91f
add output_ sort
sunjiweiswift ab5f957
Update graph_executor.cc
sunjiweiswift 07e80ad
add a.nodeid == b.nodeid
sunjiweiswift e67b839
add unit test for set output zero
sunjiweiswift 052fa56
add include <algorithm>
sunjiweiswift 847634e
modify Setoutput zero copy
sunjiweiswift b2d9471
modify by clang-format
sunjiweiswift 5d0461a
add unit test for set output zero
sunjiweiswift 4ebf2bd
rrealy ut go back
sunjiweiswift c221b51
rrealy ut go back
sunjiweiswift 92294d3
modify input->output
sunjiweiswift dd54915
delete sort output input
sunjiweiswift 66ef5fe
modify build_module_test.cc
sunjiweiswift 7918c7b
re-pr
sunjiweiswift c7e00cb
empty commit
sunjiweiswift 2558aee
empty commit
sunjiweiswift bf85d3e
empty commit
sunjiweiswift df24fc3
modify input to ouput
sunjiweiswift c1bf14c
modify zero ouput copy disorder issus
sunjiweiswift c666527
Merge remote-tracking branch 'upstream/main'
sunjiweiswift 85b4fc3
Merge remote-tracking branch 'upstream/main'
sunjiweiswift 81143b9
modify nid->eid to record output, add var to record the dltensor both…
sunjiweiswift 6f7b068
character too long >= 100
sunjiweiswift 0d25674
modify zero copy UT add set input zero copy
sunjiweiswift 6fc5047
modify zero copy UT add set input zero copy
sunjiweiswift 969c80f
modify zero copy UT add set input zero copy
sunjiweiswift 889106d
Merge branch 'main' of https://github.com/sunjiweiswift/tvm
sunjiweiswift 5f858cc
empty commit
sunjiweiswift 1762cb5
trigger CI
sunjiweiswift 0575cb8
Merge pull request #4 from apache/main
sunjiweiswift 2640e76
trigger CI
sunjiweiswift a10562f
trigger CI
sunjiweiswift 07128aa
empty commit
sunjiweiswift c0e89f5
empty commit
sunjiweiswift 3e46c0e
trigger CI
6b3a126
trigger CI
sunjiweiswift 37b69b1
trigger CI
sunjiweiswift d66c4e1
Merge pull request #5 from apache/main
sunjiweiswift e622619
trigger CI
sunjiweiswift 8f9287f
trigger CI
sunjiweiswift 1644d91
resolve conflicts
sunjiweiswift 1c4f9e3
Merge pull request #6 from apache/main
sunjiweiswift 13a1355
modify C style
sunjiweiswift cb09eab
add runtime test
sunjiweiswift 3205590
add runtime test
sunjiweiswift aab0ef7
add runtime test
sunjiweiswift 8c0dfb6
realy build generatr the json
sunjiweiswift 2603263
realy build generatr the json
sunjiweiswift
I really don't like testing this way. Hard-coding the expected output (e.g., assembly, JSON, etc.) may make future maintenance difficult. IMHO, it should be sufficient to build two modules and set one of them to zero copy, so that the only difference between the two modules is the execution latency and their outputs are the same.
It would also be good to have a Python test to demonstrate how this could be used in Python; otherwise no one will know about this feature at all, as there's no documentation either.
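To make the suggested test strategy concrete, here is a minimal sketch in plain Python. It does not use TVM: `ToyExecutor` and its methods are hypothetical stand-ins invented for illustration, not the graph executor API. It only shows the shape of the check: run one executor through the copying path, run a second one that writes into caller-owned storage, and compare the two results instead of hard-coding an expected output.

```python
from array import array


class ToyExecutor:
    """Hypothetical stand-in for a graph executor; it doubles each input element."""

    def __init__(self, size):
        self._input = array("f", [0.0] * size)
        self._output = array("f", [0.0] * size)  # executor-owned output storage

    def set_input(self, values):
        if len(values) != len(self._input):
            raise ValueError("input has the wrong length")
        for i, value in enumerate(values):
            self._input[i] = value

    def set_output_zero_copy(self, buffer):
        # Hypothetical analogue of the feature under review: results are
        # written directly into caller-owned storage from now on.
        if len(buffer) != len(self._input):
            raise ValueError("output buffer has the wrong length")
        self._output = buffer

    def run(self):
        for i, value in enumerate(self._input):
            self._output[i] = 2.0 * value

    def get_output(self):
        # Copying path: hand back a fresh array rather than internal storage.
        return array("f", self._output)


data = [1.0, 2.5, -3.0]

# Module 1: ordinary copying path.
copying = ToyExecutor(len(data))
copying.set_input(data)
copying.run()
expected = copying.get_output()

# Module 2: same computation, but the output lands in a caller-owned buffer.
caller_buffer = array("f", [0.0] * len(data))
zero_copy = ToyExecutor(len(data))
zero_copy.set_input(data)
zero_copy.set_output_zero_copy(caller_buffer)
zero_copy.run()

# The two modules should differ only in where the result is stored.
outputs_match = list(caller_buffer) == list(expected)
```

A test built this way stays valid if the graph JSON or the generated code changes, because it asserts only that both paths agree.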
The graph JSON is not the expected output. It is the serialized input graph used to construct the graph executor.
I think set_output_zero is only used when other frameworks call libtvm_runtime.so (the TVM graph executor) and allocate the input and output memory in advance. It is not a Python API.
OK, I see. In this case it would be better to go through the Relay build process, which also generates the JSON.
Why can't set_output_zero be a Python API, even if it is only used when calling libtvm_runtime.so for now? Since it has corresponding handling in GraphExecutor::GetFunction, I suppose it's doable to support a Python interface.
Fixed: the JSON is now generated by the Relay build.
Python has no pointers, so generally there is no such thing as zero copy in Python. I think that is also why set input zero copy is not a Python API.
I also don't understand why there is no Python API. It is totally reasonable to pass pre-allocated ndarray storage for get_output.
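As a rough illustration of the pre-allocated storage idea, the sketch below uses only the Python standard library. `get_output_into` is a hypothetical helper written for this example, not a TVM function, and `array` stands in for an ndarray. It shows two things: a result can be copied into storage the caller allocated in advance, and a `memoryview` can share that storage without any copy at all.

```python
from array import array


def get_output_into(result, out):
    """Hypothetical helper: copy `result` into caller-provided storage `out`."""
    if len(out) != len(result):
        raise ValueError("output buffer has the wrong length")
    # Slice assignment between memoryviews copies the raw bytes in one step;
    # both sides must have the same format and length.
    memoryview(out)[:] = memoryview(result)
    return out


result = array("f", [1.0, 2.0, 3.0])   # stands in for an executor's output
preallocated = array("f", [0.0] * 3)    # storage owned by the caller

returned = get_output_into(result, preallocated)
same_object = returned is preallocated  # no new array was allocated

# A memoryview shares memory with the underlying array, so a write through
# the view is visible in the array without any copy.
view = memoryview(preallocated)
view[0] = 7.0
```

Whether a real binding can avoid the copy entirely depends on how the runtime exposes its buffers; the sketch only shows that the Python side can hold and share pre-allocated memory.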
OK, I will submit a separate Python API PR later.