
[PyTorch]Add PyTorchTVM: compile torchscript to tvm and export as pytorch_op #8777

Merged
merged 23 commits into apache:main on Nov 5, 2021

Conversation

Meteorix
Contributor

To increase the TVM accessibility for PyTorch users, we add PyTorchTVM module to support the following workflow:

  1. convert a torchscript module to tvm graph
  2. build and tune tvm graph
  3. export well-tuned tvm graph as a pytorch op
  4. torch jit trace the tvm pytorch op with other pytorch modules, then save/load/serve as normal pytorch model

The example usage is in apps/pt_class/tests/test_pt_script.py. We hope to discuss the user API further with the community. Please help review @Laurawly @junrushao1994 @tqchen, thanks!
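
For readers skimming the thread, a minimal sketch of steps 1-2 follows, using the standard Relay PyTorch frontend (these are existing TVM APIs). The export-as-PyTorch-op part (steps 3-4) is what the new tvm.contrib.torch code in this PR adds; its exact calls are shown in apps/pt_class/tests/test_pt_script.py rather than reproduced here.

# Sketch of steps 1-2 with the existing Relay frontend; target and input
# shape are arbitrary example values.
import torch
import torchvision
import tvm
from tvm import relay

# 1. Trace the PyTorch model to TorchScript.
model = torchvision.models.resnet50().eval()
example = torch.rand(1, 3, 224, 224)
script_module = torch.jit.trace(model, example)

# 2. Convert the TorchScript module to a Relay (TVM) graph and build it.
mod, params = relay.frontend.from_pytorch(script_module, [("x", (1, 3, 224, 224))])
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# 3./4. The PR's tvm.contrib.torch module then wraps the built artifact as a
# PyTorch op that can be jit-traced together with other nn.Modules and
# saved/loaded/served like a normal TorchScript model (see the test script).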

Credit: the original author is @kongroo.

@junrushao
Member

CC @masahi @alexwong @comaniac @yzhliu

@comaniac
Contributor

It would be good for a new feature of this scope to start with an RFC.

@junrushao
Member

Yeah, it is a super exciting feature for TVM, so it would get more visibility in the community if we had an RFC.

@tqchen added the status: need RFC label Aug 18, 2021
@Meteorix
Contributor Author

Thanks, I will write an RFC this week.

@jcf94
Contributor

jcf94 commented Aug 25, 2021

Looks great!
We once implemented a similar approach that puts an optimized TVM graph runtime back into TensorFlow and PyTorch through custom ops. For various reasons we were not able to split that code out from another project. Glad to have this feature in the main branch!

cc @minminsun

@leandron removed the status: need RFC label Sep 3, 2021
@areusch areusch removed their request for review September 14, 2021 20:42
@jroesch
Member

jroesch commented Sep 22, 2021

@Meteorix any updates on when this will be ready for review? happy to help shepherd these changes and work with you to get them merged.

@masahi masahi self-assigned this Sep 22, 2021
@junrushao
Member

@jroesch we just had a long discussion in the RFC, and the RFC was merged weeks ago. @Meteorix has been a bit busy in recent weeks but will follow up soon.

@Meteorix
Contributor Author

@Meteorix any updates on when this will be ready for review? happy to help shepherd these changes and work with you to get them merged.

Yes, it's ready for review!

@masahi masahi changed the title [PyTorch][WIP]Add PyTorchTVM: compile torchscript to tvm and export as pytorch_op [PyTorch]Add PyTorchTVM: compile torchscript to tvm and export as pytorch_op Sep 22, 2021
cmake_minimum_required(VERSION 3.2)
project(tf_tvmdsoop C CXX)

set(TFTVM_COMPILE_FLAGS -std=c++14)
Member (inline review comment):

Update 'TF' or tf references

python3 -c "import tvm; print(tvm.runtime.enabled('gpu'))" | grep -e 1
if [ "$?" -eq 0 ]; then
echo "Build PT_TVMCLASS with gpu support and execute tests"
CMAKE_OPTIONS="-DUSE_CUDA=/data00/liuxin.ai/cuda_111 -DPython3_EXECUTABLE=python3 -DTVM_ROOT=${TVM_ROOT}"
Member (inline review comment):

Update /data00/liuxin.ai/cuda_111



model = resnet50().half().cuda()
x = torch.rand([1, 3, 244, 244]).half().cuda()
Member (inline review comment):

224?

@kongroo
Contributor

kongroo commented Oct 8, 2021

I've fixed some namespace and style issues. Could you please help review this PR? @junrushao1994 @jcf94 @masahi @jroesch

And I have some questions to discuss:

  1. The forward function is not thread-safe. Should we use a mutex to make it thread-safe?
  2. We load the TVM module from files (mod.so, graph.json, params). But if we pass the relative path of the .so file, it may cause unexpected results. Consider this case: we have export_dir1/mod.so and export_dir2/mod.so. We chdir into export_dir1 and load ./mod.so, then chdir into export_dir2 and try to load ./mod.so, but export_dir2/mod.so will not be loaded! One possible solution is to translate the file path to an absolute path before dlopen in src/runtime/dso_library.cc (see the sketch after this list). What's your opinion?
  3. We store TVM graph modules in a map tvm_modules_ and use the input tensors' shapes as the key. But this requires all input tensors to have a fixed shape. In order to support dynamic shapes, we may need to iterate over all the keys of tvm_modules_ to find a matching one. Is it necessary to support dynamic shapes? If so, how can we do it efficiently?
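
To make question 2 concrete, below is a small application-level Python illustration of the absolute-path idea. It is not the proposed dso_library.cc change, and load_tvm_module_path is a made-up helper name.

# Resolve the library path before any chdir, so "./mod.so" always refers to
# the file the caller intended rather than whatever the current directory
# happens to contain at load time.
import os

def load_tvm_module_path(relative_path):
    # os.path.abspath expands the path against the current working directory
    # at call time, so paths resolved in export_dir1 and export_dir2 differ.
    return os.path.abspath(relative_path)

# os.chdir("export_dir1"); path1 = load_tvm_module_path("./mod.so")
# os.chdir("../export_dir2"); path2 = load_tvm_module_path("./mod.so")
# path1 != path2, so the two mod.so files remain distinguishable to the loader.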

@masahi masahi mentioned this pull request Oct 22, 2021
@masahi
Member

masahi commented Oct 22, 2021

Sorry I forgot about this PR, will take another look soon. cc @junrushao1994 @jroesch

@masahi
Member (left a review comment)

Code mostly looks good. I have a couple of questions around target/device.

To answer @kongroo's question:

  1. For multi-threaded applications, users should create multiple instances of the TVM / PT module (one per thread) from the same dll (see the sketch after this list).
  2. Not sure if that is a TVM problem or an OS issue (possibly related: https://stackoverflow.com/questions/16525016/how-to-dynamic-load-the-library-with-same-name-but-in-different-directory-in-lin), but always enforcing an absolute path sounds good to me (either in dso_library.cc or at the application level).
  3. Currently, our performance on dynamic input shapes is terrible. Moreover, we need to deal with the same problem in other contexts too, such as the CUTLASS BYOC integration (#9261) or AutoTIR. So I suggest deferring this problem for now.
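
As a small illustration of point 1 (not code from this PR), one way to follow that advice in a multi-threaded server is to keep a lazily created module instance per worker thread. ThreadLocalRunner and its loader argument below are assumptions made up for the example.

# One runtime module instance per thread, all loaded from the same library,
# so the non-thread-safe forward() is never shared across threads.
import threading

class ThreadLocalRunner:
    def __init__(self, lib_path, loader):
        self._lib_path = lib_path
        self._loader = loader          # callable that loads and wraps the module
        self._local = threading.local()

    def forward(self, *inputs):
        # Lazily create this thread's private instance on first use.
        if not hasattr(self._local, "module"):
            self._local.module = self._loader(self._lib_path)
        return self._local.module(*inputs)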

Review threads (resolved):
cmake/modules/contrib/PT_TVMDSOOP.cmake
python/tvm/contrib/torch/__init__.py
python/tvm/contrib/torch/pytorch_tvm.py
python/tvm/contrib/torch/module.py
@masahi
Member (left a review comment)

Great!

@masahi
Member

masahi commented Nov 2, 2021

@kongroo Please make sure to pass the CI.

@jroesch @junrushao1994 @tqchen @comaniac Please take a look if you want to review. Otherwise I'm going to merge this week; I think we can bring this to the v0.8 release.

@masahi
Member

masahi commented Nov 2, 2021

@kongroo Looks like you've hit an unfortunate flaky test error, please kick another job.

@Meteorix Meteorix requested a review from icemelon as a code owner November 3, 2021 04:26
@kongroo
Contributor

kongroo commented Nov 4, 2021

@kongroo Looks like you've hit an unfortunate flaky test error, please kick another job.

Finally CI passed...

@masahi masahi merged commit e7024fb into apache:main Nov 5, 2021
@masahi
Member

masahi commented Nov 5, 2021

Thanks @Meteorix @kongroo this is merged!

mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
…orch_op (apache#8777)

* add pt_op

* add compile api

* perf: support set_output_zero_copy

* fix: cpu device_id mismatch

* fix: pt_class test script

* refactor: unify namespace to tvm.contrib.torch

* add ASF header

* build: set pt tvmdsoop default off

* build: remove unset_log_macros.h

* refactor: change header order

* refactor: fix python code format

* style: resolve pylint issues

* style: add blank line

* style: fix pylint invalid_name

* trigger CI

* test: add more test scripts

* style: add empty lines

* test: update test for trace tvm module

* style: fix linting issues

* style: remove single quote

* style: disable pylint invalid-name

* trigger CI

* trigger CI

Co-authored-by: kongroo <imjcqt@gmail.com>
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021 (same commits as above)
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022 (same commits as above)
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022 (same commits as above)