Sync with apache/incubator-tvm 6/15/2020 #116

trevor-m · 2020-06-09T22:15:54Z

Sync to match upstream apache/incubator-tvm on 6/15/2020.

Changes made on top of cherry-picks:

- Small patch to the TRT integration to account for changes in strided_slice and reshape.
- Increased the CI container stack limit from 8 MB to 16 MB to fix the failing TFLite tests test_forward_qnn_mobilenet_v2_net and test_forward_mediapipe_hand_landmark (stack overflow in eliminate common subexpr).
- Skip test_tensor_array_write_read, test_tensor_array_concat, test_tensor_array_scatter, test_tensor_array_gather, and test_tensor_array_split when the TF version is > 1.15.
- Disable tensorflow.test_forward_sdd because the stack limit of 100 MB is exceeded by WellFormedChecker.
- Removed all module.is_empty() code. We added this to neo-ai/tvm for the nnvm-trt integration, which is no longer needed.
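The version gate for the tensor array tests can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `parse_version` and `should_skip_tf1_only_test` are hypothetical helper names, and the sketch assumes the check only needs the major and minor components of the TF version string.

```python
def parse_version(version):
    """Turn a version string such as '2.1.0' into a tuple of ints, e.g. (2, 1, 0).

    Parsing stops at the first component that does not start with a digit,
    and trailing non-digit suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def should_skip_tf1_only_test(tf_version, max_supported=(1, 15)):
    """Return True when a test that needs TF <= 1.15 should be skipped.

    Only (major, minor) are compared, so a patch release such as 1.15.2
    still counts as 1.15 and is not skipped.
    """
    return parse_version(tf_version)[:2] > max_supported
```

In a test file, the result could be passed to a decorator such as `unittest.skipIf`, with the installed TF version string as the argument.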

The objects that the raw pointers point to can be deallocated and new objects can be allocated at the same address, all while these pointers are still in the cache. This can lead to unexpected behavior, for example a calculated bound that conflicts with previously cached values. Caching PrimExpr will prevent the objects from being deallocated while the cache is active.
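The lifetime issue described above can be illustrated with a small Python analogy. This is only a sketch: the actual fix lives in TVM's C++ code, and `Expr`, `AddressKeyedCache`, `OwningCache`, and `is_kept_alive` are hypothetical names invented for the example. It assumes CPython, where `id()` is the object's address and an unreferenced object is freed promptly.

```python
import gc
import weakref


class Expr:
    """Stand-in for an expression node (hypothetical, not TVM's PrimExpr)."""

    def __init__(self, name):
        self.name = name


class AddressKeyedCache:
    """Remembers only the address (id) of each key, like a raw-pointer cache."""

    def __init__(self):
        self._bounds = {}

    def put(self, expr, bound):
        self._bounds[id(expr)] = bound

    def get(self, expr):
        # A new object allocated at a recycled address would hit a stale entry.
        return self._bounds.get(id(expr))


class OwningCache:
    """Stores the key object itself, so it stays alive while it is cached."""

    def __init__(self):
        self._entries = {}

    def put(self, expr, bound):
        self._entries[id(expr)] = (expr, bound)

    def get(self, expr):
        entry = self._entries.get(id(expr))
        # The identity check guards against returning another object's bound.
        if entry is not None and entry[0] is expr:
            return entry[1]
        return None


def is_kept_alive(cache):
    """Insert a fresh Expr, drop our reference, and report whether it survives."""
    expr = Expr("x")
    probe = weakref.ref(expr)
    cache.put(expr, (0, 10))
    del expr
    gc.collect()
    return probe() is not None
```

With the owning cache the key cannot be deallocated while its entry exists, so its address cannot be reused by an unrelated object. With the address-keyed cache the key is freed as soon as the caller drops it, which is the situation the commit message warns about.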

* [WEB] Remove the old web runtime
* [WEB][RUNTIME] TVM WebAssembly Runtime

This PR introduces a brand new TVM web runtime based on the WASM standard API. Main highlights:

- The new runtime is rewritten in TypeScript.
- The new runtime now directly interfaces with WebAssembly's standard API instead of relying on emscripten's API. This change makes the JS runtime more portable to runtime variants. For example, we could also try to make it interface with TVM's Rust runtime implementation.
- The system library can be provided through WASI.
- We also build a hack to enable Emscripten to generate a WASI-like bundle for the runtime environment on the Web.
- The wasm generation now uses mainline LLVM.
- Dynamic linking (dlopen) is not used due to limitations of wasm; instead we rely on the recent RPC refactor to directly restart a new session for each wasm binary sent to the RPC.

* Address review comments
* Skip tensorcore test

* TFlite e2e FP32 Object detection model
* Fix test
* [Relay-TFLite] Quantized activations
* Flexbuffer parsing
* Lint
* Relaxing checks.
* Github reviews
* comments

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-212.us-west-2.compute.internal>

…pache#5535)

* Changes to cpp_rpc to make it work on Android (+ Hexagon offloading)

- Implement getNextString to break up std::string into words. stringstream just doesn't work on Android.
- string::find_last_of doesn't look for the last substring, but the last character from a given string.
- Use SIGTERM to terminate processes (this isn't necessary, but using SIGKILL is not a good practice).
- Convert "./rpc" to a full path. When a module is uploaded and offloaded to Hexagon, the dlopen on Hexagon needs an absolute path (or a path without directories).

* Only set the absolute path on non-Windows platforms

Windows has different macros for the maximum path length.
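The `find_last_of` pitfall mentioned above is easy to show with a small sketch. The change itself is in C++; the Python function below is a hypothetical re-implementation written only to contrast "last character from a set" with "last substring", and it returns -1 where C++ would return `std::string::npos`.

```python
def find_last_of(text, chars):
    """Mimic C++ std::string::find_last_of.

    Returns the index of the last character of `text` that appears anywhere
    in `chars`, or -1 if no such character exists.
    """
    for index in range(len(text) - 1, -1, -1):
        if text[index] in chars:
            return index
    return -1


path = "lib/rpc/rpc_server"

# Last position of any of the characters 'r', 'p', 'c': the final 'r'.
last_char_index = find_last_of(path, "rpc")

# Last position where the whole substring "rpc" starts (what C++ rfind does).
last_substring_index = path.rfind("rpc")
```

Here the two results differ, which is why code that wants the last occurrence of a substring needs a substring search rather than `find_last_of`.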

* [RPC] Improve RPCServer AsyncIO support.

When the RPCServer is in async IO mode, it can directly serve an async function that returns its value via a callback in the future. This mode is particularly useful in the web environment, where blocking is not an option.

This PR introduces async support to the RPCSession, allowing AsyncIO-driven servers to serve async functions. These functions are still presented as synchronous versions on the client side.

A follow-up PR will refactor the web runtime to make use of this feature.

* Address comments

…he#5526)

* Add tvm-sys
* Use as_mut_ptr
* Address CR feedback
* Update rust/tvm-sys/src/datatype.rs

Co-authored-by: Nick Hynes <nhynes@berkeley.edu>

* Final CR comments
* Fix find and replace error in frontend

Co-authored-by: Nick Hynes <nhynes@berkeley.edu>

This PR introduces WebGPU support to TVM. The WebGPU runtime is built directly in JavaScript (as WebGPU uses JS as its first-class API) and is exposed back to TVM's runtime via PackedFuncs.

One important note is that `ctx.sync` is now async. This is because WebGPU is a purely async API and we cannot block in the web environment. So the current best way to use the JS API is to wrap things in an async function. When copying a GPU array to the CPU, `await ctx.sync()` needs to be called to wait for the copy to complete.

We use an AsyncIO RPC server to serve the async functions to the clients.
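The "queue a copy, then await the sync" pattern can be sketched with Python's `asyncio` as an analogy. The real API is the JavaScript runtime added by the upstream PR; `FakeContext`, `copy_to_cpu`, and `read_back` below are hypothetical names, and the sketch assumes only that a copy does not complete until `sync()` has been awaited.

```python
import asyncio


class FakeContext:
    """Hypothetical stand-in for a device context whose sync() is async."""

    def __init__(self):
        self._pending = []

    def copy_to_cpu(self, gpu_values, cpu_buffer):
        """Queue a copy; cpu_buffer is not filled until sync() is awaited."""

        async def do_copy():
            # Yield to the event loop once, as a real device operation would.
            await asyncio.sleep(0)
            cpu_buffer.extend(gpu_values)

        self._pending.append(do_copy())

    async def sync(self):
        """Wait for every queued copy to finish."""
        pending, self._pending = self._pending, []
        for job in pending:
            await job


async def read_back(gpu_values):
    """Copy values to the CPU and return the buffer before and after sync."""
    ctx = FakeContext()
    cpu_buffer = []
    ctx.copy_to_cpu(gpu_values, cpu_buffer)
    before = list(cpu_buffer)  # snapshot taken before the copy has run
    await ctx.sync()
    return before, cpu_buffer
```

Calling `asyncio.run(read_back([1, 2, 3]))` drives the coroutine from synchronous code, which mirrors wrapping the JS calls in an async function.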

trevor-m · 2020-06-16T16:04:04Z

@zhiics CI is passing now - please see description for list of modifications I had to make. I have also reenabled the sphinx task for docs.

kevinthesun

LGTM

siju-samuel and others added 30 commits June 8, 2020 23:33

[TFLITE]Select op support for tflite frontend (apache#5486)

349b1eb

* [TFLITE]Select/Where op support for tflite frontend * Review comment fixed * Review comment fixed

[FRONTEND][TFLite] Fully connected op conversion made in sync with T…

a5cfce7

…FLite (apache#5510) * [FRONTEND][TFLite] Fully connected op conversion made in sync with TFLite * [1] Test case added * [2] Review comments handled * [3] Prints removed

[TOPI][Winograd] Optimization of Conv2d Winograd algorithm on Tensor …

6b2323e

…Core (apache#5485)

fix a few bugs with shape inference and types in the onnx importer (a…

cfb41e6

…pache#5534)

[Frontend][TFLite] ADD_N operator (apache#5474)

ce4d49a

[RELAY][ONNX]ReduceLogSumExp Operator support (apache#5453)

132017d

* [RELAY]LogSumExp Op Support * [ONNX]LogSumExp Op Support

[RPC][BUGFIX] Fix remote device sync (apache#5538)

476623a

[Refactor][std::string --> String] IRModule is updated with String (a…

1c8b943

…pache#5523) * [std::string --> String] IRModule is updated with String * [1] Packedfunction updated * [2] Lint error fixed * [3] Remove std::string variant

[RUNTIME] Store nullptr PackedFunc as nullptr for better error propag…

5040831

…ation (apache#5540)

Add Onnx Pad v11 (apache#5539)

612b828

fix restructured text (apache#5541)

a420710

[CRT]fix to reduce RAM size during loading model (apache#5507)

9754024

* [CRT]fix to reduce RAM size during loading model * Release graph_json memory immediately after reading

Load platform specific lib for tvmdsoop instead of only so (apache#5542)

72ade90

[TE] Fix MakeLoopNest for warp memory (apache#5382)

74a687d

[TIR][Printer] text format printer considering future parsing use (ap…

7630339

…ache#5483)

[Optimization] Warp level reduction support for CUDA (apache#5498)

57e9178

- Added the warp level reduction support - Upgraded shfl intrinsics to the sync version. - This is the building block for scheduling softmax like operations. Signed-off-by: Wei Pan <weip@nvidia.com>

A clone of test/python/unittest/test_runtime_micro.py, however (apach…

c1cb6de

…e#5546) modified to run specifically on ARM cortex-M hardware, which currently is just the STM32F746 discovery board. Signed-off-by: Tom Gall <tom.gall@linaro.org>

[CI] Install wasmtime for WebAssembly tests (apache#5494)

37b3c97

Apparently, ONNX Conv with no 'pads' defaults to zero padding (apache…

fb7c648

…#5548)

[TOPI][RELAY][TENSORFLOW]Math ops added (apache#5502)

8e21d89

* [TOPI][RELAY][TENSORFLOW]Math ops added * Extra newline removed * CI fix * Review comments fixed * Review comments fixed

[RUNTIME] Hexagon driver for offloading kernels to simulator (apache#…

76a3069

…5492) * [RUNTIME] Hexagon driver for offloading kernels to simulator * Add sim_dev as external project when building with Hexagon/sim support * Change target CPU for sim_dev to v60

[LINT] clang-format the h,cc,m files. (apache#5557)

7003426

This PR prepares for our migration to use the clang-format as part of the linter system.

[BYOC, MergeComposite] Add additional check before re-using the cache…

b346536

…d match (apache#5552) * Add additional check before re-using the cached match in merge composite * clean up ExtractPattern calls

Trevor Morris and others added 22 commits June 15, 2020 19:31

Increase stack limit for failing tflite tests. Skip TF tests which re…

2283275

…quire TF 1.x

[PYTORCH]aten::norm support added (apache#5776)

6f63123

[TENSORFLOW]Conv3d Transpose OP added (apache#5775)

79721f8

* [TENSORFLOW]Conv3d Transpose OP added * Testcase updated, tf cpu supports only ndhwc

[TF] Support symbolic inputs of Fill (apache#5762)

15709c2

* [TF] Support symbolic inputs of Fill * Rebase and simplify. Value has been converted to constant if it is tf.Constant

[COMMUNITY] @wpan11nv -> Reviewer (apache#5790)

5522ad6

Edit onnx parser to infer values in post order (apache#5755)

534eccf

* edit onnx parser to infer values in post order to speed up onnx imports with many calls to infer_value * fix pylint

[TIR][REFACTOR] Cleanup unused classes (apache#5789)

ae745ea

Fix tf parser (apache#5794)

a9aa8ac

support aten::type_as in the pytorch frontend (apache#5787)

e21351c

* support aten::type_as in the pytorch frontend * use _convert_data_type to convert torch type to tvm type and add more types in the type_as test

[TIR][REFACTIR] Update TIR nodes std::string->String. (apache#5793)

9eb29b6

This PR updates the remaining TIR node's member to use String instead of std::string.

[TEST] Temporary disable fp16 type_as test for PyTorch Frontend (apac…

ca14048

…he#5799)

[ONNX] Skip multiply with 1.0f constant for GEMM import (apache#5800)

29e2ec7

* [ONNX] Skip ADD inside Gemm op when vector is zero * [ONNX] Skip multiply with 1.0f constant for GEMM import

[TIR][REFACTOR] Add tir prefix to type keys (apache#5802)

34a581f

[QUANTIZE] Add config switch for nn.dense layer type. (apache#5801)

f250700

[topi] fix sparse dense schedule on cuda (apache#5803)

33fcf79

Allow RPCWrappedFunc to rewrite runtime::String as std::string (apach…

8e18755

…e#5796)

[topi] fix strategy for sparse dense cuda (apache#5782)

2ca5680

[CI] Move cpu-only frontend tests to a CPU stage (apache#5807)

eecc5d2

[MXNET]conv3d and conv3d_transpose addedx (apache#5814)

89160b9

Pin hand landmark network to version 0.7.4. (apache#5813)

ffb4004

* Versions above 0.7.4 are broken due to changes in the quantization operations in the model, which are currently not supported by TVM. Fixes apache#5774.

[CI] Limit number of threads in all jobs (apache#5815)

bc5a78d

Update dmlc_tvm_commit_id.txt

13290ab

trevor-m changed the title ~~Sync with apache/incubator-tvm 6/12/2020~~ Sync with apache/incubator-tvm 6/15/2020 Jun 15, 2020

Disable tensorflow.test_forward_sdd because stack limit of 100mb is e…

dbb760c

…xceeded by WellFormedChecker

zhiics approved these changes Jun 16, 2020

trevor-m merged commit c3c1472 into neo-ai:dev Jun 16, 2020

kevinthesun reviewed Jun 16, 2020

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jun 18, 2020

Revert "Sync with apache/incubator-tvm 6/15/2020 (neo-ai#116)"

081eb4a

This reverts commit c3c1472.

trevor-m pushed a commit that referenced this pull request Jun 18, 2020

Revert "Sync with apache/incubator-tvm 6/15/2020 (#116)"

4e01034

This reverts commit c3c1472.
