Bug about native compilation on NVIDIA Jetson AGX #132

Closed
chiehpower opened this issue Jul 14, 2020 · 13 comments · Fixed by #138
Labels: documentation, platform: aarch64, question

Comments

chiehpower commented Jul 14, 2020

🐛 Bug

After installing Bazel from scratch on the AGX device, I built the project directly with Bazel and got the error below.

$ bazel build //:libtrtorch --distdir third_party/distdir/aarch64-linux-gnu         

Starting local Bazel server and connecting to it...
INFO: Repository trtorch_py_deps instantiated at:
  no stack (--record_rule_instantiation_callstack not enabled)
Repository rule pip_import defined at:
  /home/nvidia/.cache/bazel/_bazel_nvidia/d7326de2ca76e35cc08b88f9bba7ab43/external/rules_python/python/pip.bzl:51:29: in <toplevel>
ERROR: An error occurred during the fetch of repository 'trtorch_py_deps':
   pip_import failed: Collecting torch==1.5.0 (from -r /home/nvidia/ssd256/github/TRTorch/py/requirements.txt (line 1))
 (  Could not find a version that satisfies the requirement torch==1.5.0 (from -r /home/nvidia/ssd256/github/TRTorch/py/requirements.txt (line 1)) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
No matching distribution found for torch==1.5.0 (from -r /home/nvidia/ssd256/github/TRTorch/py/requirements.txt (line 1))
)
ERROR: no such package '@trtorch_py_deps//': pip_import failed: Collecting torch==1.5.0 (from -r /home/nvidia/ssd256/github/TRTorch/py/requirements.txt (line 1))
 (  Could not find a version that satisfies the requirement torch==1.5.0 (from -r /home/nvidia/ssd256/github/TRTorch/py/requirements.txt (line 1)) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
No matching distribution found for torch==1.5.0 (from -r /home/nvidia/ssd256/github/TRTorch/py/requirements.txt (line 1))
)
INFO: Elapsed time: 8.428s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)

If I run python3 setup.py install instead, I get the following error:

running install
building libtrtorch
INFO: Build options --compilation_mode, --cxxopt, --define, and 1 more have changed, discarding analysis cache.
INFO: Repository tensorrt instantiated at:
  no stack (--record_rule_instantiation_callstack not enabled)
Repository rule http_archive defined at:
  /home/nvidia/.cache/bazel/_bazel_nvidia/d7326de2ca76e35cc08b88f9bba7ab43/external/bazel_tools/tools/build_defs/repo/http.bzl:336:31: in <toplevel>
WARNING: Download from https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/7.1/tars/TensorRT-7.1.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz failed: class java.io.IOException GET returned 403 Forbidden
ERROR: An error occurred during the fetch of repository 'tensorrt':
   java.io.IOException: Error downloading [https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/7.1/tars/TensorRT-7.1.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz] to /home/nvidia/.cache/bazel/_bazel_nvidia/d7326de2ca76e35cc08b88f9bba7ab43/external/tensorrt/TensorRT-7.1.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz: GET returned 403 Forbidden
INFO: Repository libtorch_pre_cxx11_abi instantiated at:
  no stack (--record_rule_instantiation_callstack not enabled)
Repository rule http_archive defined at:
  /home/nvidia/.cache/bazel/_bazel_nvidia/d7326de2ca76e35cc08b88f9bba7ab43/external/bazel_tools/tools/build_defs/repo/http.bzl:336:31: in <toplevel>
ERROR: /home/nvidia/ssd256/github/TRTorch/core/BUILD:10:11: //core:core depends on @tensorrt//:nvinfer in repository @tensorrt which failed to fetch. no such package '@tensorrt//': java.io.IOException: Error downloading [https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/7.1/tars/TensorRT-7.1.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz] to /home/nvidia/.cache/bazel/_bazel_nvidia/d7326de2ca76e35cc08b88f9bba7ab43/external/tensorrt/TensorRT-7.1.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz: GET returned 403 Forbidden
ERROR: Analysis of target '//cpp/api/lib:libtrtorch.so' failed; build aborted: Analysis failed
INFO: Elapsed time: 18.044s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 62 targets configured)

Any idea what is causing this?

To Reproduce

Steps to reproduce the behavior:

  1. Install bazel from here
  2. Use this command:
bazel build //:libtrtorch --distdir third_party/distdir/aarch64-linux-gnu         

Environment

Build information about the TRTorch compiler can be found by turning on debug messages

  • PyTorch Version: 1.15.0
  • JetPack Version: 4.4
  • How you installed PyTorch: from here
  • Python version: 3.6
  • CUDA version: 10.2
  • GPU models and configuration: Jetson AGX device
  • TensorRT version: 7.1.0.16 (the default on JetPack 4.4)
  • bazel version: 3.4.0

Thank you

BR,
Chieh

@chiehpower
Author

After modifying the WORKSPACE file (commenting out some entries and uncommenting others), I got new error messages.

Here is my WORKSPACE:

workspace(name = "TRTorch")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

git_repository(
    name = "rules_python",
    remote = "https://github.com/bazelbuild/rules_python.git",
    commit = "4fcc24fd8a850bdab2ef2e078b1de337eea751a6",
    shallow_since = "1589292086 -0400"
)

load("@rules_python//python:repositories.bzl", "py_repositories")
py_repositories()

load("@rules_python//python:pip.bzl", "pip_repositories", "pip3_import")
pip_repositories()

http_archive(
    name = "rules_pkg",
    url = "https://github.com/bazelbuild/rules_pkg/releases/download/0.2.4/rules_pkg-0.2.4.tar.gz",
    sha256 = "4ba8f4ab0ff85f2484287ab06c0d871dcb31cc54d439457d28fd4ae14b18450a",
)

load("@rules_pkg//:deps.bzl", "rules_pkg_dependencies")
rules_pkg_dependencies()

git_repository(
    name = "googletest",
    remote = "https://github.com/google/googletest",
    commit = "703bd9caab50b139428cea1aaff9974ebee5742e",
    shallow_since = "1570114335 -0400"
)

# CUDA should be installed on the system locally
new_local_repository(
    name = "cuda",
    path = "/usr/local/cuda-10.2/",
    build_file = "@//third_party/cuda:BUILD",
)

new_local_repository(
    name = "cublas",
    path = "/usr",
    build_file = "@//third_party/cublas:BUILD",
)

#############################################################################################################
# Tarballs and fetched dependencies (default - use in cases when building from precompiled bin and tarballs)
#############################################################################################################

http_archive(
    name = "libtorch",
    build_file = "@//third_party/libtorch:BUILD",
    strip_prefix = "libtorch",
    urls = ["https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.5.1.zip"],
    sha256 = "cf0691493d05062fe3239cf76773bae4c5124f4b039050dbdd291c652af3ab2a"
)

http_archive(
    name = "libtorch_pre_cxx11_abi",
    build_file = "@//third_party/libtorch:BUILD",
    strip_prefix = "libtorch",
    sha256 = "818977576572eadaf62c80434a25afe44dbaa32ebda3a0919e389dcbe74f8656",
    urls = ["https://download.pytorch.org/libtorch/cu102/libtorch-shared-with-deps-1.5.1.zip"],
)

# Download these tarballs manually from the NVIDIA website
# Either place them in the distdir directory in third_party and use the --distdir flag
# or modify the urls to "file:///<PATH TO TARBALL>/<TARBALL NAME>.tar.gz

#http_archive(
#    name = "cudnn",
#    urls = ["https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.0.1.13/10.2_20200626/cudnn-10.2-linux-x64-v8.0.1.13.tgz"],
#    build_file = "@//third_party/cudnn/archive:BUILD",
#    sha256 = "0c106ec84f199a0fbcf1199010166986da732f9b0907768c9ac5ea5b120772db",
#    strip_prefix = "cuda"
#)

#http_archive(
#    name = "tensorrt",
#    urls = ["https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/7.1/tars/TensorRT-7.1.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz"],
#    build_file = "@//third_party/tensorrt/archive:BUILD",
#    sha256 = "9205bed204e2ae7aafd2e01cce0f21309e281e18d5bfd7172ef8541771539d41",
#    strip_prefix = "TensorRT-7.1.3.4"
#)

####################################################################################
# Locally installed dependencies (use in cases of custom dependencies or aarch64)
####################################################################################

# NOTE: In the case you are using just the pre-cxx11-abi path or just the cxx11 abi path
# with your local libtorch, just point deps at the same path to satisfy bazel.

# NOTE: NVIDIA's aarch64 PyTorch (python) wheel file uses the CXX11 ABI unlike PyTorch's standard
# x86_64 python distribution. If using NVIDIA's version just point to the root of the package
# for both versions here and do not use --config=pre-cxx11-abi

#new_local_repository(
#    name = "libtorch",
#    path = "/usr/local/lib/python3.6/dist-packages/torch",
#    build_file = "third_party/libtorch/BUILD"
#)

#new_local_repository(
#    name = "libtorch_pre_cxx11_abi",
#    path = "/usr/local/lib/python3.6/dist-packages/torch",
#    build_file = "third_party/libtorch/BUILD"
#)

new_local_repository(
    name = "cudnn",
    path = "/usr/",
    build_file = "@//third_party/cudnn/local:BUILD"
)

new_local_repository(
   name = "tensorrt",
   path = "/usr/",
   build_file = "@//third_party/tensorrt/local:BUILD"
)

#########################################################################
# Testing Dependencies (optional - comment out on aarch64)
#########################################################################
#pip3_import(
#    name = "trtorch_py_deps",
#    requirements = "//py:requirements.txt"
#)

#load("@trtorch_py_deps//:requirements.bzl", "pip_install")
#pip_install()

#pip3_import(
#   name = "py_test_deps",
#   requirements = "//tests/py:requirements.txt"
#)

#load("@py_test_deps//:requirements.bzl", "pip_install")
#pip_install()

Output:

/home/nvidia/ssd256/github/TRTorch/py $ python3 setup.py install   
                                                                             
running install
building libtrtorch
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (32 packages loaded, 1947 targets configured).
INFO: Found 1 target...
ERROR: /home/nvidia/ssd256/github/TRTorch/cpp/api/lib/BUILD:3:10: Linking of rule '//cpp/api/lib:libtrtorch.so' failed (Exit 1) gcc failed: error executing command /usr/bin/gcc @bazel-out/aarch64-opt/bin/cpp/api/lib/libtrtorch.so-2.params

Use --sandbox_debug to see verbose messages from the sandbox
/usr/bin/ld.gold: warning: skipping incompatible bazel-out/aarch64-opt/bin/_solib_aarch64/_U@libtorch_Upre_Ucxx11_Uabi_S_S_Ctorch___Uexternal_Slibtorch_Upre_Ucxx11_Uabi_Slib/libtorch.so while searching for torch
/usr/bin/ld.gold: error: cannot find -ltorch
/usr/bin/ld.gold: warning: skipping incompatible bazel-out/aarch64-opt/bin/_solib_aarch64/_U@libtorch_Upre_Ucxx11_Uabi_S_S_Ctorch___Uexternal_Slibtorch_Upre_Ucxx11_Uabi_Slib/libtorch_cuda.so while searching for torch_cuda
/usr/bin/ld.gold: error: cannot find -ltorch_cuda
/usr/bin/ld.gold: warning: skipping incompatible bazel-out/aarch64-opt/bin/_solib_aarch64/_U@libtorch_Upre_Ucxx11_Uabi_S_S_Ctorch___Uexternal_Slibtorch_Upre_Ucxx11_Uabi_Slib/libtorch_cpu.so while searching for torch_cpu
/usr/bin/ld.gold: error: cannot find -ltorch_cpu
/usr/bin/ld.gold: warning: skipping incompatible bazel-out/aarch64-opt/bin/_solib_aarch64/_U@libtorch_Upre_Ucxx11_Uabi_S_S_Ctorch___Uexternal_Slibtorch_Upre_Ucxx11_Uabi_Slib/libtorch_global_deps.so while searching for torch_global_deps
/usr/bin/ld.gold: error: cannot find -ltorch_global_deps
/usr/bin/ld.gold: warning: skipping incompatible bazel-out/aarch64-opt/bin/_solib_aarch64/_U@libtorch_Upre_Ucxx11_Uabi_S_S_Cc10_Ucuda___Uexternal_Slibtorch_Upre_Ucxx11_Uabi_Slib/libc10_cuda.so while searching for c10_cuda
/usr/bin/ld.gold: error: cannot find -lc10_cuda
/usr/bin/ld.gold: warning: skipping incompatible bazel-out/aarch64-opt/bin/_solib_aarch64/_U@libtorch_Upre_Ucxx11_Uabi_S_S_Cc10___Uexternal_Slibtorch_Upre_Ucxx11_Uabi_Slib/libc10.so while searching for c10
/usr/bin/ld.gold: error: cannot find -lc10
collect2: error: ld returned 1 exit status
Target //cpp/api/lib:libtrtorch.so failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 285.202s, Critical Path: 55.46s
INFO: 51 processes: 51 linux-sandbox.
FAILED: Build did NOT complete successfully


@narendasan
Collaborator

Sorry about this, we are still in the process of writing the documentation on how to build on aarch64. You should use all local repository sources for dependencies. Right now you are trying to pull x86_64 libraries from pytorch.org. Comment out all http_archive sources as well as the test deps like you did and use all new_local_repository sources with PyTorch 1.5.0 python package for Jetson installed (https://forums.developer.nvidia.com/t/pytorch-for-jetson-nano-version-1-5-0-now-available/72048) and configure the paths for both libtorch and libtorch_pre_cxx11_abi to point to the root of the python package directory. The default should work if you install pytorch with sudo pip install.
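
The "skipping incompatible" warnings from ld.gold are what the linker prints when a library was built for a different architecture, which fits x86_64 libtorch archives from pytorch.org landing on an aarch64 board. As a minimal sketch, the Python below reads the e_machine field of an ELF header to report which architecture a given .so targets. The helper name elf_machine is made up for this illustration, and only the two machine codes relevant to this thread are mapped.

```python
import struct

# e_machine values from the ELF specification (only the two relevant here).
ELF_MACHINES = {62: "x86_64", 183: "aarch64"}


def elf_machine(header):
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown ({})".format(machine))
```

To use it, open a library in binary mode and pass the first 20 bytes, for example elf_machine(open(path, "rb").read(20)) with path pointing at a libtorch.so. A result of x86_64 on the Jetson would confirm the mismatch. The standard file utility reports the same information.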

@narendasan narendasan added the documentation Improvements or additions to documentation label Jul 14, 2020
@narendasan narendasan added the platform: aarch64 Bugs regarding the x86_64 builds of TRTorch label Jul 14, 2020
@narendasan narendasan added the question Further information is requested label Jul 14, 2020
@chiehpower
Author

chiehpower commented Jul 15, 2020

Hi @narendasan ,

Thanks for your reply!

First test

Following your advice, I commented out every http_archive entry, including this rules_pkg one, and the build failed.

http_archive(
    name = "rules_pkg",
    url = "https://github.com/bazelbuild/rules_pkg/releases/download/0.2.4/rules_pkg-0.2.4.tar.gz",
    sha256 = "4ba8f4ab0ff85f2484287ab06c0d871dcb31cc54d439457d28fd4ae14b18450a",
)

Error message:

$ python3 setup.py install                                        
running install
building libtrtorch
Starting local Bazel server and connecting to it...
ERROR: Failed to load Starlark extension '@rules_pkg//:deps.bzl'.
Cycle in the workspace file detected. This indicates that a repository is used prior to being defined.
The following chain of repository dependencies lead to the missing definition.
 - @rules_pkg
This could either mean you have to add the '@rules_pkg' repository with a statement like `http_archive` in your WORKSPACE file (note that transitive dependencies are not added automatically), or move an existing definition earlier in your WORKSPACE file.
ERROR: cycles detected during target parsing
INFO: Elapsed time: 4.307s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
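
The error above says @rules_pkg is used before it is defined: the http_archive for rules_pkg was commented out while the load("@rules_pkg//:deps.bzl", ...) line stayed active. A rough way to spot this kind of mismatch before running Bazel is sketched below. The helper undefined_loads is hypothetical and is not part of Bazel. It is a naive line-by-line scan that treats everything after a # as a comment and assumes each name = "..." attribute sits on its own line, as in the WORKSPACE posted earlier.

```python
import re

# Repositories Bazel provides without a WORKSPACE definition.
BUILTIN_REPOS = {"bazel_tools"}


def undefined_loads(workspace_text):
    """Return repository names that are load()-ed before any active rule names them."""
    defined = set(BUILTIN_REPOS)
    problems = []
    for line in workspace_text.splitlines():
        # Drop comments, so commented-out rules do not count as definitions.
        code = line.split("#", 1)[0]
        name = re.search(r'\bname\s*=\s*"([^"]+)"', code)
        if name:
            defined.add(name.group(1))
        loaded = re.search(r'\bload\(\s*"@([^/"]+)//', code)
        if loaded and loaded.group(1) not in defined:
            problems.append(loaded.group(1))
    return problems
```

Calling undefined_loads(open("WORKSPACE").read()) on the file from this first test would be expected to list rules_pkg. An empty list only means this simple scan found nothing, not that the file is valid.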

Second test

I removed the old checkout and cloned the repository again to make sure I was on the latest version.
In the WORKSPACE, I commented out every http_archive entry except rules_pkg and uncommented every new_local_repository entry.
My WORKSPACE file is attached in case you want to check it.
WORKSPACE

Command:

sudo bazel build //:libtrtorch --distdir third_party/distdir/aarch64-linux-gnu

Output:

Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: non-existent distdir /home/nvidia/ssd256/github/TRTorch/third_party/distdir/aarch64-linux-gnu
INFO: non-existent distdir /home/nvidia/ssd256/github/TRTorch/third_party/distdir/aarch64-linux-gnu
INFO: non-existent distdir /home/nvidia/ssd256/github/TRTorch/third_party/distdir/aarch64-linux-gnu
INFO: Analyzed target //:libtrtorch (39 packages loaded, 2472 targets configured).
INFO: Found 1 target...
ERROR: /home/nvidia/ssd256/github/TRTorch/cpp/trtorchc/BUILD:10:10: C++ compilation of rule '//cpp/trtorchc:trtorchc' failed (Exit 1) gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 63 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
cpp/trtorchc/main.cpp: In function 'bool checkRtol(const at::Tensor&, std::vector<at::Tensor>, float)':
cpp/trtorchc/main.cpp:23:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Max Difference: ") + std::to_string(diff.abs().max().item<float>()));
              ^~~~~~~
cpp/trtorchc/main.cpp:23:36: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Max Difference: ") + std::to_string(diff.abs().max().item<float>()));
                                    ^~~~~~~
cpp/trtorchc/main.cpp:24:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Acceptable Threshold: ") + std::to_string(threshold));
              ^~~~~~~
cpp/trtorchc/main.cpp:24:36: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Acceptable Threshold: ") + std::to_string(threshold));
                                    ^~~~~~~
cpp/trtorchc/main.cpp: In function 'std::vector<long int> parseSingleDim(std::__cxx11::string)':
cpp/trtorchc/main.cpp:54:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kERROR, "Shapes need dimensions delimited by comma in parentheses, \"(N,..,C,H,W)\"\n e.g \"(3,3,200,200)\"");
              ^~~~~~~
cpp/trtorchc/main.cpp:54:36: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kERROR, "Shapes need dimensions delimited by comma in parentheses, \"(N,..,C,H,W)\"\n e.g \"(3,3,200,200)\"");
                                    ^~~~~~~
cpp/trtorchc/main.cpp: In function 'trtorch::ExtraInfo::InputRange parseDynamicDim(std::__cxx11::string)':
cpp/trtorchc/main.cpp:78:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Dynamic shapes need three sets of dimensions delimited by semi-colons, \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                  ^~~~~~~
cpp/trtorchc/main.cpp:78:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Dynamic shapes need three sets of dimensions delimited by semi-colons, \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                                        ^~~~~~~
cpp/trtorchc/main.cpp: In function 'std::__cxx11::string get_cwd()':
cpp/trtorchc/main.cpp:91:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Unable to get current directory");
                  ^~~~~~~
cpp/trtorchc/main.cpp:91:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Unable to get current directory");
                                        ^~~~~~~
cpp/trtorchc/main.cpp: In function 'std::__cxx11::string real_path(std::__cxx11::string)':
cpp/trtorchc/main.cpp:103:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, std::string("Unable to find file ") + abs_path);
                  ^~~~~~~
cpp/trtorchc/main.cpp:103:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, std::string("Unable to find file ") + abs_path);
                                        ^~~~~~~
cpp/trtorchc/main.cpp: In function 'int main(int, char**)':
cpp/trtorchc/main.cpp:117:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_is_colored_output_on(true);
              ^~~~~~~
cpp/trtorchc/main.cpp:118:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kWARNING);
              ^~~~~~~
cpp/trtorchc/main.cpp:118:57: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kWARNING);
                                                         ^~~~~~~
cpp/trtorchc/main.cpp:119:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_logging_prefix("");
              ^~~~~~~
cpp/trtorchc/main.cpp:175:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kDEBUG);
                  ^~~~~~~
cpp/trtorchc/main.cpp:175:61: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kDEBUG);
                                                             ^~~~~~~
cpp/trtorchc/main.cpp:177:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kINFO);
                  ^~~~~~~
cpp/trtorchc/main.cpp:177:61: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kINFO);
                                                             ^~~~~~~
cpp/trtorchc/main.cpp:179:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kERROR);
                  ^~~~~~~
cpp/trtorchc/main.cpp:179:61: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kERROR);
                                                             ^~~~~~~
cpp/trtorchc/main.cpp:190:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Dimensions should be specified in one of these types \"(N,..,C,H,W)\" \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"(3,3,300,300)\" \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                      ^~~~~~~
cpp/trtorchc/main.cpp:190:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Dimensions should be specified in one of these types \"(N,..,C,H,W)\" \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"(3,3,300,300)\" \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:215:32: error: 'trtorch::ptq' has not been declared
     auto calibrator = trtorch::ptq::make_int8_cache_calibrator(calibration_cache_file_path);
                                ^~~
cpp/trtorchc/main.cpp:229:26: error: 'trtorch::logging' has not been declared
                 trtorch::logging::log(trtorch::logging::Level::kERROR, "If targeting INT8 default operating precision with trtorchc, a calibration cache file must be provided");
                          ^~~~~~~
cpp/trtorchc/main.cpp:229:48: error: 'trtorch::logging' has not been declared
                 trtorch::logging::log(trtorch::logging::Level::kERROR, "If targeting INT8 default operating precision with trtorchc, a calibration cache file must be provided");
                                                ^~~~~~~
cpp/trtorchc/main.cpp:234:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid default operating precision, options are [ float | float32 | f32 | half | float16 | f16 | int8 | i8 ]");
                      ^~~~~~~
cpp/trtorchc/main.cpp:234:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid default operating precision, options are [ float | float32 | f32 | half | float16 | f16 | int8 | i8 ]");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:248:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid device type, options are [ gpu | dla ]");
                      ^~~~~~~
cpp/trtorchc/main.cpp:248:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid device type, options are [ gpu | dla ]");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:264:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid engine capability, options are [ default | safe_gpu | safe_dla ]");
                      ^~~~~~~
cpp/trtorchc/main.cpp:264:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid engine capability, options are [ default | safe_gpu | safe_dla ]");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:295:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Error loading the model (path may be incorrect)");
                  ^~~~~~~
cpp/trtorchc/main.cpp:295:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Error loading the model (path may be incorrect)");
                                        ^~~~~~~
cpp/trtorchc/main.cpp:301:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Module is not currently supported by TRTorch");
                  ^~~~~~~
cpp/trtorchc/main.cpp:301:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Module is not currently supported by TRTorch");
                                        ^~~~~~~
cpp/trtorchc/main.cpp:355:30: error: 'trtorch::logging' has not been declared
                     trtorch::logging::log(trtorch::logging::Level::kWARNING, std::string("Maximum numerical deviation for output exceeds set threshold (") + threshold_ss.str() + std::string(")"));
                              ^~~~~~~
cpp/trtorchc/main.cpp:355:52: error: 'trtorch::logging' has not been declared
                     trtorch::logging::log(trtorch::logging::Level::kWARNING, std::string("Maximum numerical deviation for output exceeds set threshold (") + threshold_ss.str() + std::string(")"));
                                                    ^~~~~~~
cpp/trtorchc/main.cpp:359:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kWARNING, "Due to change in operating data type, numerical precision is not checked");
                      ^~~~~~~
cpp/trtorchc/main.cpp:359:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kWARNING, "Due to change in operating data type, numerical precision is not checked");
                                            ^~~~~~~
Target //:libtrtorch failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 190.913s, Critical Path: 39.86s
INFO: 68 processes: 68 processwrapper-sandbox.
FAILED: Build did NOT complete successfully

The behavior is clearly different from my earlier attempts, but the build still fails.

@narendasan
Collaborator

I guess I wasn't specific enough, my bad. Yes, you do need to keep the rules_pkg source. What I meant was to comment out the http_archive sources for TensorRT, cuDNN and LibTorch and use the new_local_repository versions. Once you do that, you no longer need the --distdir flag.
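
Both libtorch and libtorch_pre_cxx11_abi need a path pointing at the root of the installed PyTorch Python package. A small sketch for finding that directory without hard-coding it is below. The helper package_root is a name invented here. It relies on importlib.util.find_spec, which locates a top-level package without importing it.

```python
import importlib.util
import os


def package_root(name):
    """Return the directory of an installed top-level package, or None if absent."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # For a regular package, origin is the path to its __init__.py.
    return os.path.dirname(spec.origin)
```

Running print(package_root("torch")) on the Jetson gives the value to use as path = "..." in both new_local_repository entries. With a sudo pip install on JetPack's Python 3.6 this should match the /usr/local/lib/python3.6/dist-packages/torch default in the WORKSPACE posted earlier.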

@chiehpower
Author

Hi @narendasan ,

Thanks for your advice and reply!

I ran two tests.

Test to install wheel file

location: ./TRTorch/py

Command:

$ sudo python3 setup.py install                                    

Output:

[sudo] password for nvidia: 
running install
building libtrtorch
Starting local Bazel server and connecting to it...
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (32 packages loaded, 1947 targets configured).
INFO: Found 1 target...
Target //cpp/api/lib:libtrtorch.so up-to-date:
  bazel-bin/cpp/api/lib/libtrtorch.so
INFO: Elapsed time: 187.041s, Critical Path: 54.48s
INFO: 52 processes: 52 processwrapper-sandbox.
INFO: Build completed successfully, 509 total actions
creating version file
copying library into module
running build
running build_py
copying trtorch/_version.py -> build/lib.linux-aarch64-3.6/trtorch
running egg_info
writing trtorch.egg-info/PKG-INFO
writing dependency_links to trtorch.egg-info/dependency_links.txt
writing requirements to trtorch.egg-info/requires.txt
writing top-level names to trtorch.egg-info/top_level.txt
/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py:304: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'trtorch.egg-info/SOURCES.txt'
writing manifest file 'trtorch.egg-info/SOURCES.txt'
copying trtorch/lib/libtrtorch.so -> build/lib.linux-aarch64-3.6/trtorch/lib
running build_ext
running install_lib
creating /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/logging.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/__init__.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_compiler.py -> /usr/local/lib/python3.6/dist-packages/trtorch
creating /usr/local/lib/python3.6/dist-packages/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/lib/libtrtorch.so -> /usr/local/lib/python3.6/dist-packages/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_version.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_extra_info.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_types.py -> /usr/local/lib/python3.6/dist-packages/trtorch
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/logging.py to logging.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/__init__.py to __init__.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_compiler.py to _compiler.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_version.py to _version.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_extra_info.py to _extra_info.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_types.py to _types.cpython-36.pyc
running install_egg_info
Copying trtorch.egg-info to /usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info
running install_scripts

There is no error message, but I cannot find a trtorch folder in /usr/local/lib/python3.6/dist-packages/. Also, if I run import trtorch in python3, the module fails to load.

$ python3                                                     
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torchvision
>>> import trtorch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/trtorch/__init__.py", line 11, in <module>
    from trtorch._compiler import *
  File "/usr/local/lib/python3.6/dist-packages/trtorch/_compiler.py", line 5, in <module>
    import trtorch._C
ImportError: libtrtorch.so: cannot open shared object file: No such file or directory

Test the command by bazel

location: ./TRTorch

Command:

$ bazel build //:libtrtorch --verbose_failures                                

Output:

INFO: Analyzed target //:libtrtorch (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /home/nvidia/ssd256/github/TRTorch/cpp/trtorchc/BUILD:10:10: C++ compilation of rule '//cpp/trtorchc:trtorchc' failed (Exit 1): gcc failed: error executing command 
  (cd /home/nvidia/.cache/bazel/_bazel_nvidia/d7326de2ca76e35cc08b88f9bba7ab43/sandbox/linux-sandbox/82/execroot/TRTorch && \
  exec env - \
    PATH=/home/nvidia/cmake-3.13.0/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/src/tensorrt/bin/ \
    PWD=/proc/self/cwd \
  /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF bazel-out/aarch64-fastbuild/bin/cpp/trtorchc/_objs/trtorchc/main.pic.d '-frandom-seed=bazel-out/aarch64-fastbuild/bin/cpp/trtorchc/_objs/trtorchc/main.pic.o' -fPIC -iquote . -iquote bazel-out/aarch64-fastbuild/bin -iquote external/tensorrt -iquote bazel-out/aarch64-fastbuild/bin/external/tensorrt -iquote external/cuda -iquote bazel-out/aarch64-fastbuild/bin/external/cuda -iquote external/cudnn -iquote bazel-out/aarch64-fastbuild/bin/external/cudnn -iquote external/libtorch -iquote bazel-out/aarch64-fastbuild/bin/external/libtorch -iquote external/bazel_tools -iquote bazel-out/aarch64-fastbuild/bin/external/bazel_tools -Ibazel-out/aarch64-fastbuild/bin/cpp/api/_virtual_includes/trtorch -Ibazel-out/aarch64-fastbuild/bin/external/libtorch/_virtual_includes/ATen -Ibazel-out/aarch64-fastbuild/bin/external/libtorch/_virtual_includes/c10_cuda -Ibazel-out/aarch64-fastbuild/bin/external/libtorch/_virtual_includes/c10 -Ibazel-out/aarch64-fastbuild/bin/external/libtorch/_virtual_includes/caffe2 -isystem external/tensorrt/include/aarch64-linux-gnu -isystem bazel-out/aarch64-fastbuild/bin/external/tensorrt/include/aarch64-linux-gnu -isystem external/cuda/include -isystem bazel-out/aarch64-fastbuild/bin/external/cuda/include -isystem external/cudnn/include -isystem bazel-out/aarch64-fastbuild/bin/external/cudnn/include -isystem external/libtorch/include -isystem bazel-out/aarch64-fastbuild/bin/external/libtorch/include -isystem external/libtorch/include/torch/csrc/api/include -isystem bazel-out/aarch64-fastbuild/bin/external/libtorch/include/torch/csrc/api/include '-fdiagnostics-color=always' '-std=c++14' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c cpp/trtorchc/main.cpp -o 
bazel-out/aarch64-fastbuild/bin/cpp/trtorchc/_objs/trtorchc/main.pic.o)
Execution platform: @local_config_platform//:host

Use --sandbox_debug to see verbose messages from the sandbox
cpp/trtorchc/main.cpp: In function 'bool checkRtol(const at::Tensor&, std::vector<at::Tensor>, float)':
cpp/trtorchc/main.cpp:23:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Max Difference: ") + std::to_string(diff.abs().max().item<float>()));
              ^~~~~~~
cpp/trtorchc/main.cpp:23:36: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Max Difference: ") + std::to_string(diff.abs().max().item<float>()));
                                    ^~~~~~~
cpp/trtorchc/main.cpp:24:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Acceptable Threshold: ") + std::to_string(threshold));
              ^~~~~~~
cpp/trtorchc/main.cpp:24:36: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kDEBUG, std::string("Acceptable Threshold: ") + std::to_string(threshold));
                                    ^~~~~~~
cpp/trtorchc/main.cpp: In function 'std::vector<long int> parseSingleDim(std::__cxx11::string)':
cpp/trtorchc/main.cpp:54:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kERROR, "Shapes need dimensions delimited by comma in parentheses, \"(N,..,C,H,W)\"\n e.g \"(3,3,200,200)\"");
              ^~~~~~~
cpp/trtorchc/main.cpp:54:36: error: 'trtorch::logging' has not been declared
     trtorch::logging::log(trtorch::logging::Level::kERROR, "Shapes need dimensions delimited by comma in parentheses, \"(N,..,C,H,W)\"\n e.g \"(3,3,200,200)\"");
                                    ^~~~~~~
cpp/trtorchc/main.cpp: In function 'trtorch::ExtraInfo::InputRange parseDynamicDim(std::__cxx11::string)':
cpp/trtorchc/main.cpp:78:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Dynamic shapes need three sets of dimensions delimited by semi-colons, \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                  ^~~~~~~
cpp/trtorchc/main.cpp:78:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Dynamic shapes need three sets of dimensions delimited by semi-colons, \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                                        ^~~~~~~
cpp/trtorchc/main.cpp: In function 'std::__cxx11::string get_cwd()':
cpp/trtorchc/main.cpp:91:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Unable to get current directory");
                  ^~~~~~~
cpp/trtorchc/main.cpp:91:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Unable to get current directory");
                                        ^~~~~~~
cpp/trtorchc/main.cpp: In function 'std::__cxx11::string real_path(std::__cxx11::string)':
cpp/trtorchc/main.cpp:103:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, std::string("Unable to find file ") + abs_path);
                  ^~~~~~~
cpp/trtorchc/main.cpp:103:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, std::string("Unable to find file ") + abs_path);
                                        ^~~~~~~
cpp/trtorchc/main.cpp: In function 'int main(int, char**)':
cpp/trtorchc/main.cpp:117:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_is_colored_output_on(true);
              ^~~~~~~
cpp/trtorchc/main.cpp:118:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kWARNING);
              ^~~~~~~
cpp/trtorchc/main.cpp:118:57: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kWARNING);
                                                         ^~~~~~~
cpp/trtorchc/main.cpp:119:14: error: 'trtorch::logging' has not been declared
     trtorch::logging::set_logging_prefix("");
              ^~~~~~~
cpp/trtorchc/main.cpp:175:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kDEBUG);
                  ^~~~~~~
cpp/trtorchc/main.cpp:175:61: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kDEBUG);
                                                             ^~~~~~~
cpp/trtorchc/main.cpp:177:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kINFO);
                  ^~~~~~~
cpp/trtorchc/main.cpp:177:61: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kINFO);
                                                             ^~~~~~~
cpp/trtorchc/main.cpp:179:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kERROR);
                  ^~~~~~~
cpp/trtorchc/main.cpp:179:61: error: 'trtorch::logging' has not been declared
         trtorch::logging::set_reportable_log_level(trtorch::logging::Level::kERROR);
                                                             ^~~~~~~
cpp/trtorchc/main.cpp:190:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Dimensions should be specified in one of these types \"(N,..,C,H,W)\" \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"(3,3,300,300)\" \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                      ^~~~~~~
cpp/trtorchc/main.cpp:190:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Dimensions should be specified in one of these types \"(N,..,C,H,W)\" \"[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]\"\n e.g \"(3,3,300,300)\" \"[(3,3,100,100);(3,3,200,200);(3,3,300,300)]\"");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:215:32: error: 'trtorch::ptq' has not been declared
     auto calibrator = trtorch::ptq::make_int8_cache_calibrator(calibration_cache_file_path);
                                ^~~
cpp/trtorchc/main.cpp:229:26: error: 'trtorch::logging' has not been declared
                 trtorch::logging::log(trtorch::logging::Level::kERROR, "If targeting INT8 default operating precision with trtorchc, a calibration cache file must be provided");
                          ^~~~~~~
cpp/trtorchc/main.cpp:229:48: error: 'trtorch::logging' has not been declared
                 trtorch::logging::log(trtorch::logging::Level::kERROR, "If targeting INT8 default operating precision with trtorchc, a calibration cache file must be provided");
                                                ^~~~~~~
cpp/trtorchc/main.cpp:234:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid default operating precision, options are [ float | float32 | f32 | half | float16 | f16 | int8 | i8 ]");
                      ^~~~~~~
cpp/trtorchc/main.cpp:234:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid default operating precision, options are [ float | float32 | f32 | half | float16 | f16 | int8 | i8 ]");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:248:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid device type, options are [ gpu | dla ]");
                      ^~~~~~~
cpp/trtorchc/main.cpp:248:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid device type, options are [ gpu | dla ]");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:264:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid engine capability, options are [ default | safe_gpu | safe_dla ]");
                      ^~~~~~~
cpp/trtorchc/main.cpp:264:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kERROR, "Invalid engine capability, options are [ default | safe_gpu | safe_dla ]");
                                            ^~~~~~~
cpp/trtorchc/main.cpp:295:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Error loading the model (path may be incorrect)");
                  ^~~~~~~
cpp/trtorchc/main.cpp:295:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Error loading the model (path may be incorrect)");
                                        ^~~~~~~
cpp/trtorchc/main.cpp:301:18: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Module is not currently supported by TRTorch");
                  ^~~~~~~
cpp/trtorchc/main.cpp:301:40: error: 'trtorch::logging' has not been declared
         trtorch::logging::log(trtorch::logging::Level::kERROR, "Module is not currently supported by TRTorch");
                                        ^~~~~~~
cpp/trtorchc/main.cpp:355:30: error: 'trtorch::logging' has not been declared
                     trtorch::logging::log(trtorch::logging::Level::kWARNING, std::string("Maximum numerical deviation for output exceeds set threshold (") + threshold_ss.str() + std::string(")"));
                              ^~~~~~~
cpp/trtorchc/main.cpp:355:52: error: 'trtorch::logging' has not been declared
                     trtorch::logging::log(trtorch::logging::Level::kWARNING, std::string("Maximum numerical deviation for output exceeds set threshold (") + threshold_ss.str() + std::string(")"));
                                                    ^~~~~~~
cpp/trtorchc/main.cpp:359:22: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kWARNING, "Due to change in operating data type, numerical precision is not checked");
                      ^~~~~~~
cpp/trtorchc/main.cpp:359:44: error: 'trtorch::logging' has not been declared
             trtorch::logging::log(trtorch::logging::Level::kWARNING, "Due to change in operating data type, numerical precision is not checked");
                                            ^~~~~~~
Target //:libtrtorch failed to build
INFO: Elapsed time: 27.041s, Critical Path: 26.67s
INFO: 1 process: 1 linux-sandbox.
FAILED: Build did NOT complete successfully

Any idea what is causing these errors?
Thank you!

BR,
Chieh

@narendasan
Collaborator

For error 1, a fix landed today that addresses it: the rpath was not getting added to the compile command for the Python bindings. I will investigate your second error and see if I can replicate it.
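
To make the rpath point concrete: the link command that appears later in this thread passes -Wl,-rpath,$ORIGIN/lib, and the dynamic loader replaces $ORIGIN with the directory containing the shared object being loaded. A minimal sketch of that expansion, assuming the install path shown in the logs (the helper name expand_origin is made up for illustration and is not part of TRTorch):

```python
import posixpath


def expand_origin(shared_object_path, rpath_entry):
    """Expand $ORIGIN in an rpath entry the way the dynamic loader would.

    $ORIGIN stands for the directory that contains the shared object
    carrying the rpath (here: the Python extension module).
    """
    origin = posixpath.dirname(shared_object_path)
    expanded = rpath_entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
    return posixpath.normpath(expanded)


# Path of the extension module as installed in this thread's logs.
ext = "/usr/local/lib/python3.6/dist-packages/trtorch/_C.cpython-36m-aarch64-linux-gnu.so"
print(expand_origin(ext, "$ORIGIN/lib"))
```

With that rpath in place, the loader looks for libtrtorch.so in the trtorch/lib directory next to the extension module, which is where setup.py copies it. Without it, the library is only found if it happens to be on the default search path.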

narendasan added a commit that referenced this issue Jul 16, 2020
Addresses issue #132

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
@narendasan narendasan reopened this Jul 16, 2020
@narendasan
Collaborator

I traced the 2nd issue to a recent API update. The latest master should resolve this.

@chiehpower
Author

For error 1. there was a fix today that addresses that issue, the rpath was not getting added to the compile command for the Python bindings. I will investigate your second error and see if i can replicate it.

Hi @narendasan ,

I just tested again: I removed the previous repository and cloned it again, using the same workspace file as before.

Then I went to /py and ran sudo python3 setup.py install

Output:

$ sudo python3 setup.py install                                    
running install
building libtrtorch
Starting local Bazel server and connecting to it...
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (32 packages loaded, 1947 targets configured).
INFO: Found 1 target...
Target //cpp/api/lib:libtrtorch.so up-to-date:
  bazel-bin/cpp/api/lib/libtrtorch.so
INFO: Elapsed time: 11.342s, Critical Path: 1.05s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
creating version file
copying library into module
running build
running build_py
creating build
creating build/lib.linux-aarch64-3.6
creating build/lib.linux-aarch64-3.6/trtorch
copying trtorch/logging.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/__init__.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_compiler.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_version.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_extra_info.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_types.py -> build/lib.linux-aarch64-3.6/trtorch
running egg_info
creating trtorch.egg-info
writing trtorch.egg-info/PKG-INFO
writing dependency_links to trtorch.egg-info/dependency_links.txt
writing requirements to trtorch.egg-info/requires.txt
writing top-level names to trtorch.egg-info/top_level.txt
writing manifest file 'trtorch.egg-info/SOURCES.txt'
/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py:304: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'trtorch.egg-info/SOURCES.txt'
writing manifest file 'trtorch.egg-info/SOURCES.txt'
creating build/lib.linux-aarch64-3.6/trtorch/lib
copying trtorch/lib/libtrtorch.so -> build/lib.linux-aarch64-3.6/trtorch/lib
running build_ext
building 'trtorch._C' extension
creating build/temp.linux-aarch64-3.6
creating build/temp.linux-aarch64-3.6/trtorch
creating build/temp.linux-aarch64-3.6/trtorch/csrc
aarch64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -UNDEBUG -I/home/nvidia/ssd256/github/TRTorch/py/../ -I/home/nvidia/ssd256/github/TRTorch/py/../bazel-TRTorch/external/tensorrt/include -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.6/dist-packages/torch/include/TH -I/usr/local/lib/python3.6/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c trtorch/csrc/trtorch_py.cpp -o build/temp.linux-aarch64-3.6/trtorch/csrc/trtorch_py.o -Wno-deprecated -Wno-deprecated-declarations -D_GLIBCXX_USE_CXX11_ABI=0 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
<command-line>:0:0: warning: "_GLIBCXX_USE_CXX11_ABI" redefined
<command-line>:0:0: note: this is the location of the previous definition
aarch64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-aarch64-3.6/trtorch/csrc/trtorch_py.o -L/home/nvidia/ssd256/github/TRTorch/py/trtorch/lib/ -L/usr/local/lib/python3.6/dist-packages/torch/lib -L/usr/local/cuda/lib64 -ltrtorch -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -Wno-deprecated -Wno-deprecated-declarations -Wl,--no-as-needed -ltrtorch -Wl,-rpath,$ORIGIN/lib -D_GLIBCXX_USE_CXX11_ABI=0
running install_lib
copying build/lib.linux-aarch64-3.6/trtorch/logging.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/__init__.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_compiler.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/lib/libtrtorch.so -> /usr/local/lib/python3.6/dist-packages/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_version.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_extra_info.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_types.py -> /usr/local/lib/python3.6/dist-packages/trtorch
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/logging.py to logging.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/__init__.py to __init__.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_compiler.py to _compiler.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_version.py to _version.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_extra_info.py to _extra_info.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_types.py to _types.cpython-36.pyc
running install_egg_info
removing '/usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info' (and everything under it)
Copying trtorch.egg-info to /usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info
running install_scripts

I then moved to another directory (not the py folder) and tried again in python3:

$ python3                                                              
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import trtorch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/trtorch/__init__.py", line 11, in <module>
    from trtorch._compiler import *
  File "/usr/local/lib/python3.6/dist-packages/trtorch/_compiler.py", line 5, in <module>
    import trtorch._C
ImportError: /usr/local/lib/python3.6/dist-packages/trtorch/lib/libtrtorch.so: undefined symbol: _ZN2at11show_configEv
>>> exit()

Which step did I get wrong?

Thank you for your update.
The 2nd issue is resolved on my side! (The Bazel build now succeeds without any errors.)

$  bazel build //:libtrtorch --verbose_failures               
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:libtrtorch (39 packages loaded, 2472 targets configured).
INFO: Found 1 target...
Target //:libtrtorch up-to-date:
  bazel-bin/libtrtorch.tar.gz
INFO: Elapsed time: 169.444s, Critical Path: 43.70s
INFO: 72 processes: 72 linux-sandbox.
INFO: Build completed successfully, 76 total actions

However, the python3 import issue still exists.

BR,
Chieh
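
For readers following along: the undefined symbol in the traceback above, _ZN2at11show_configEv, is a mangled C++ name. Here is a minimal sketch of decoding it by hand. It covers only the simple nested-name form with no arguments, and the helper name demangle_simple is made up for illustration. A real tool such as c++filt handles the full grammar.

```python
def demangle_simple(symbol):
    """Demangle a tiny subset of Itanium C++ ABI names.

    Handles only _ZN<len><name>...E followed by 'v' (no parameters).
    Anything else raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...E) are handled in this sketch")
    pos = 3
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        # Each component is a decimal length followed by that many characters.
        start = pos
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if start == pos:
            raise ValueError("unsupported construct at offset %d" % pos)
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    if pos >= len(symbol):
        raise ValueError("missing terminating 'E'")
    if symbol[pos + 1:] != "v":
        raise ValueError("only functions taking no arguments are handled")
    return "::".join(parts) + "()"


print(demangle_simple("_ZN2at11show_configEv"))  # at::show_config()
```

The decoded name, at::show_config(), sits in the at (ATen) namespace that ships with PyTorch. So the loader is failing to resolve a PyTorch symbol that libtrtorch.so expects, not something missing from TRTorch itself. The name also carries no B5cxx11 tag. Assuming this function returns std::string, GCC would typically add that tag under the new ABI, so its absence fits a mismatch between how libtrtorch.so and the installed PyTorch were built. This is an inference from the symbol alone.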

@narendasan
Collaborator

I think the issue with the python build is an ABI incompatibility issue. Try using python3 setup.py install --use-cxx11-abi
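
As a hedged illustration of what to look for: the compile line for trtorch_py.cpp earlier in this thread contains both -D_GLIBCXX_USE_CXX11_ABI=0 and -D_GLIBCXX_USE_CXX11_ABI=1, and GCC processes -D and -U options in command-line order, so the later one takes effect (hence the "redefined" warning). The sketch below finds the effective value in a command string. The function name effective_define is hypothetical, and the command is an abbreviated copy of the one in the log.

```python
import shlex


def effective_define(command, macro):
    """Return the value a macro ends up with after all -D/-U flags.

    Flags are applied left to right, so a later -D or -U overrides an
    earlier one. Returns None if the macro ends up undefined.
    """
    value = None
    for token in shlex.split(command):
        if token.startswith("-D" + macro + "="):
            value = token.split("=", 1)[1]
        elif token == "-D" + macro:
            value = "1"  # -DNAME with no value defines NAME as 1
        elif token == "-U" + macro:
            value = None
    return value


# Abbreviated from the aarch64-linux-gnu-gcc line in the build log.
cmd = (
    "aarch64-linux-gnu-gcc -DNDEBUG -fPIC -UNDEBUG "
    "-c trtorch/csrc/trtorch_py.cpp "
    "-D_GLIBCXX_USE_CXX11_ABI=0 -DTORCH_EXTENSION_NAME=_C "
    "-D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14"
)
print(effective_define(cmd, "_GLIBCXX_USE_CXX11_ABI"))  # 1
```

If the bindings are effectively compiled with one ABI setting while libtrtorch.so was built with the other, undefined-symbol errors at import time are a plausible outcome, which is what --use-cxx11-abi is meant to address.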

@chiehpower
Author

I think the issue with the python build is an ABI incompatibility issue. Try using python3 setup.py install --use-cxx11-abi

Hi @narendasan ,

Here is what I tested.

Command:

sudo python3 setup.py install --use-cxx11-abi              

Output:

running install
using CXX11 ABI build
building libtrtorch
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //cpp/api/lib:libtrtorch.so up-to-date:
  bazel-bin/cpp/api/lib/libtrtorch.so
INFO: Elapsed time: 0.517s, Critical Path: 0.02s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
creating version file
copying library into module
running build
running build_py
copying trtorch/_version.py -> build/lib.linux-aarch64-3.6/trtorch
running egg_info
writing trtorch.egg-info/PKG-INFO
writing dependency_links to trtorch.egg-info/dependency_links.txt
writing requirements to trtorch.egg-info/requires.txt
writing top-level names to trtorch.egg-info/top_level.txt
/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py:304: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'trtorch.egg-info/SOURCES.txt'
writing manifest file 'trtorch.egg-info/SOURCES.txt'
copying trtorch/lib/libtrtorch.so -> build/lib.linux-aarch64-3.6/trtorch/lib
running build_ext
running install_lib
copying build/lib.linux-aarch64-3.6/trtorch/lib/libtrtorch.so -> /usr/local/lib/python3.6/dist-packages/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/_version.py -> /usr/local/lib/python3.6/dist-packages/trtorch
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_version.py to _version.cpython-36.pyc
running install_egg_info
removing '/usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info' (and everything under it)
Copying trtorch.egg-info to /usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info
running install_scripts

Test in python3

$ python3                                                                                                 
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import trtorch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvidia/ssd256/github/TRTorch/py/trtorch/__init__.py", line 10, in <module>
    from trtorch._version import __version__
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 674, in exec_module
  File "<frozen importlib._bootstrap_external>", line 780, in get_code
  File "<frozen importlib._bootstrap_external>", line 832, in get_data
PermissionError: [Errno 13] Permission denied: '/home/nvidia/ssd256/github/TRTorch/py/trtorch/_version.py'

If I test with sudo python3, I get this:

$ sudo python3                                                                                             
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import trtorch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvidia/ssd256/github/TRTorch/py/trtorch/__init__.py", line 11, in <module>
    from trtorch._compiler import *
  File "/home/nvidia/ssd256/github/TRTorch/py/trtorch/_compiler.py", line 5, in <module>
    import trtorch._C
ModuleNotFoundError: No module named 'trtorch._C'

But normally I don't run python3 with sudo.

@narendasan
Collaborator

narendasan commented Jul 17, 2020

Make sure you have uninstalled all previous copies of trtorch before doing setup.py.

I would do something like:

pip3 uninstall trtorch # make sure both user and system packages are uninstalled
sudo python3 setup.py clean
python3 setup.py install --use-cxx11-abi 

If that doesn't work, try building a pip package and installing that.

python3 setup.py bdist_wheel --use-cxx11-abi 
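
One more thing worth checking alongside the clean reinstall: the last two tracebacks above resolve trtorch from the source checkout (.../TRTorch/py/trtorch) rather than from dist-packages, which suggests python3 was started inside the py folder, so the local directory shadows the installed package. A small sketch for seeing which file an import would pick up, without importing it (module_origin is an illustrative helper, not part of TRTorch):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level import of `name` would load, or None.

    find_spec only locates the module; it does not execute it, so this
    works even when the import itself would fail.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin can be None for namespace packages (directories without __init__.py).
    return spec.origin


# Demonstrated with a standard-library package; substitute "trtorch" on the device.
print(module_origin("json"))
```

If the reported path for trtorch points into the git checkout, change to another directory (or remove the stale copy) before testing the import again.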

@chiehpower
Author

Make sure you have uninstalled all previous copies of trtorch before doing setup.py.

I would do something like:

pip3 uninstall trtorch # make sure both user and system packages are uninstalled
sudo python3 setup.py clean
python3 setup.py install --use-cxx11-abi 

If that doesnt work try making a pip package and installing that.

python3 setup.py bdist_wheel

Hi @narendasan ,

I tested as follows.

First

$ sudo -H pip3 uninstall trtorch                                            
Uninstalling trtorch-0.0.2:
  /usr/local/lib/python3.6/dist-packages/trtorch
  /usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info
Proceed (y/n)? y
  Successfully uninstalled trtorch-0.0.2
$ sudo python3 setup.py clean
running clean
Removing build
Removing trtorch/__pycache__
Removing trtorch/lib
Removing trtorch.egg-info
$ sudo python3 setup.py bdist_wheel                                 
running bdist_wheel
building libtrtorch
INFO: Build options --cxxopt, --define, and --linkopt have changed, discarding analysis cache.
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (1 packages loaded, 1947 targets configured).
INFO: Found 1 target...
Target //cpp/api/lib:libtrtorch.so up-to-date:
  bazel-bin/cpp/api/lib/libtrtorch.so
INFO: Elapsed time: 193.873s, Critical Path: 51.32s
INFO: 52 processes: 52 processwrapper-sandbox.
INFO: Build completed successfully, 56 total actions
creating version file
copying library into module
/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py:304: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
creating build
creating build/lib.linux-aarch64-3.6
creating build/lib.linux-aarch64-3.6/trtorch
copying trtorch/logging.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/__init__.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_compiler.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_version.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_extra_info.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_types.py -> build/lib.linux-aarch64-3.6/trtorch
running egg_info
creating trtorch.egg-info
writing trtorch.egg-info/PKG-INFO
writing dependency_links to trtorch.egg-info/dependency_links.txt
writing requirements to trtorch.egg-info/requires.txt
writing top-level names to trtorch.egg-info/top_level.txt
writing manifest file 'trtorch.egg-info/SOURCES.txt'
reading manifest file 'trtorch.egg-info/SOURCES.txt'
writing manifest file 'trtorch.egg-info/SOURCES.txt'
creating build/lib.linux-aarch64-3.6/trtorch/lib
copying trtorch/lib/libtrtorch.so -> build/lib.linux-aarch64-3.6/trtorch/lib
running build_ext
building 'trtorch._C' extension
creating build/temp.linux-aarch64-3.6
creating build/temp.linux-aarch64-3.6/trtorch
creating build/temp.linux-aarch64-3.6/trtorch/csrc
aarch64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -UNDEBUG -I/home/nvidia/ssd256/github/TRTorch/py/../ -I/home/nvidia/ssd256/github/TRTorch/py/../bazel-TRTorch/external/tensorrt/include -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.6/dist-packages/torch/include/TH -I/usr/local/lib/python3.6/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c trtorch/csrc/trtorch_py.cpp -o build/temp.linux-aarch64-3.6/trtorch/csrc/trtorch_py.o -Wno-deprecated -Wno-deprecated-declarations -D_GLIBCXX_USE_CXX11_ABI=0 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
<command-line>:0:0: warning: "_GLIBCXX_USE_CXX11_ABI" redefined
<command-line>:0:0: note: this is the location of the previous definition
aarch64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-aarch64-3.6/trtorch/csrc/trtorch_py.o -L/home/nvidia/ssd256/github/TRTorch/py/trtorch/lib/ -L/usr/local/lib/python3.6/dist-packages/torch/lib -L/usr/local/cuda/lib64 -ltrtorch -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -Wno-deprecated -Wno-deprecated-declarations -Wl,--no-as-needed -ltrtorch -Wl,-rpath,$ORIGIN/lib -D_GLIBCXX_USE_CXX11_ABI=0
installing to build/bdist.linux-aarch64/wheel
running install
building libtrtorch
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //cpp/api/lib:libtrtorch.so up-to-date:
  bazel-bin/cpp/api/lib/libtrtorch.so
INFO: Elapsed time: 0.415s, Critical Path: 0.01s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
creating version file
copying library into module
running install_lib
creating build/bdist.linux-aarch64
creating build/bdist.linux-aarch64/wheel
creating build/bdist.linux-aarch64/wheel/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/logging.py -> build/bdist.linux-aarch64/wheel/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/__init__.py -> build/bdist.linux-aarch64/wheel/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_compiler.py -> build/bdist.linux-aarch64/wheel/trtorch
creating build/bdist.linux-aarch64/wheel/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/lib/libtrtorch.so -> build/bdist.linux-aarch64/wheel/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -> build/bdist.linux-aarch64/wheel/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_version.py -> build/bdist.linux-aarch64/wheel/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_extra_info.py -> build/bdist.linux-aarch64/wheel/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_types.py -> build/bdist.linux-aarch64/wheel/trtorch
running install_egg_info
Copying trtorch.egg-info to build/bdist.linux-aarch64/wheel/trtorch-0.0.2-py3.6.egg-info
running install_scripts
adding license file "LICENSE" (matched pattern "LICEN[CS]E*")
creating build/bdist.linux-aarch64/wheel/trtorch-0.0.2.dist-info/WHEEL
creating 'dist/trtorch-0.0.2-cp36-cp36m-linux_aarch64.whl' and adding 'build/bdist.linux-aarch64/wheel' to it
adding 'trtorch/_C.cpython-36m-aarch64-linux-gnu.so'
adding 'trtorch/__init__.py'
adding 'trtorch/_compiler.py'
adding 'trtorch/_extra_info.py'
adding 'trtorch/_types.py'
adding 'trtorch/_version.py'
adding 'trtorch/logging.py'
adding 'trtorch/lib/libtrtorch.so'
adding 'trtorch-0.0.2.dist-info/LICENSE'
adding 'trtorch-0.0.2.dist-info/METADATA'
adding 'trtorch-0.0.2.dist-info/WHEEL'
adding 'trtorch-0.0.2.dist-info/top_level.txt'
adding 'trtorch-0.0.2.dist-info/RECORD'
removing build/bdist.linux-aarch64/wheel
$ pip3 install trtorch-0.0.2-cp36-cp36m-linux_aarch64.whl                                                          
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Defaulting to user installation because normal site-packages is not writeable
Processing ./trtorch-0.0.2-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: torch==1.5.0 in /usr/local/lib/python3.6/dist-packages (from trtorch==0.0.2) (1.5.0)
Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from torch==1.5.0->trtorch==0.0.2) (0.18.2)
Requirement already satisfied: numpy in /usr/lib/python3/dist-packages (from torch==1.5.0->trtorch==0.0.2) (1.13.3)
Installing collected packages: trtorch
Successfully installed trtorch-0.0.2

Test in python3 (I ran this from a directory other than the py folder):

Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import trtorch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvidia/.local/lib/python3.6/site-packages/trtorch/__init__.py", line 11, in <module>
    from trtorch._compiler import *
  File "/home/nvidia/.local/lib/python3.6/site-packages/trtorch/_compiler.py", line 5, in <module>
    import trtorch._C
ImportError: /home/nvidia/.local/lib/python3.6/site-packages/trtorch/lib/libtrtorch.so: undefined symbol: _ZN2at11show_configEv

The errors I usually see are related to trtorch._C.
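
A possible reading of the undefined symbol above: with GCC's libstdc++, functions that return `std::string` under the CXX11 ABI normally carry a `B5cxx11` tag in their mangled name, and `_ZN2at11show_configEv` has none. That would be consistent with `libtrtorch.so` having been built against the pre-CXX11 ABI while the installed libtorch exports only the tagged variant, though the log alone does not prove it. A minimal sketch of the check (`has_cxx11_abi_tag` is a hypothetical helper, and a substring test is only a heuristic):

```python
def has_cxx11_abi_tag(mangled_symbol):
    """Heuristic: report whether an Itanium-mangled name carries the cxx11 ABI tag.

    GCC encodes the [abi:cxx11] tag as the substring "B5cxx11". This does not
    demangle the symbol; it only looks for the tag.
    """
    return "B5cxx11" in mangled_symbol


if __name__ == "__main__":
    missing = "_ZN2at11show_configEv"  # symbol reported by the ImportError
    print(missing, "->", has_cxx11_abi_tag(missing))
```

To see which ABI the installed PyTorch was built with, `torch.compiled_with_cxx11_abi()` can be called in the same interpreter, assuming the installed version provides it.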


Second

$ pip3 uninstall trtorch                                                                                                     
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Found existing installation: trtorch 0.0.2
Uninstalling trtorch-0.0.2:
  Would remove:
    /home/nvidia/.local/lib/python3.6/site-packages/trtorch-0.0.2.dist-info/*
    /home/nvidia/.local/lib/python3.6/site-packages/trtorch/*
Proceed (y/n)? y
  Successfully uninstalled trtorch-0.0.2
$ sudo python3 setup.py clean                                                                             
running clean
Removing build
Removing dist
Removing trtorch/lib
Removing trtorch.egg-info
$ sudo python3 setup.py install --use-cxx11-abi 
running install
using CXX11 ABI build
building libtrtorch
INFO: Build options --cxxopt, --define, and --linkopt have changed, discarding analysis cache.
INFO: Analyzed target //cpp/api/lib:libtrtorch.so (0 packages loaded, 1947 targets configured).
INFO: Found 1 target...
Target //cpp/api/lib:libtrtorch.so up-to-date:
  bazel-bin/cpp/api/lib/libtrtorch.so
INFO: Elapsed time: 175.798s, Critical Path: 54.30s
INFO: 52 processes: 52 processwrapper-sandbox.
INFO: Build completed successfully, 56 total actions
creating version file
copying library into module
running build
running build_py
creating build
creating build/lib.linux-aarch64-3.6
creating build/lib.linux-aarch64-3.6/trtorch
copying trtorch/logging.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/__init__.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_compiler.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_version.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_extra_info.py -> build/lib.linux-aarch64-3.6/trtorch
copying trtorch/_types.py -> build/lib.linux-aarch64-3.6/trtorch
running egg_info
creating trtorch.egg-info
writing trtorch.egg-info/PKG-INFO
writing dependency_links to trtorch.egg-info/dependency_links.txt
writing requirements to trtorch.egg-info/requires.txt
writing top-level names to trtorch.egg-info/top_level.txt
writing manifest file 'trtorch.egg-info/SOURCES.txt'
/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py:304: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'trtorch.egg-info/SOURCES.txt'
writing manifest file 'trtorch.egg-info/SOURCES.txt'
creating build/lib.linux-aarch64-3.6/trtorch/lib
copying trtorch/lib/libtrtorch.so -> build/lib.linux-aarch64-3.6/trtorch/lib
running build_ext
building 'trtorch._C' extension
creating build/temp.linux-aarch64-3.6
creating build/temp.linux-aarch64-3.6/trtorch
creating build/temp.linux-aarch64-3.6/trtorch/csrc
aarch64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -UNDEBUG -I/home/nvidia/ssd256/github/TRTorch/py/../ -I/home/nvidia/ssd256/github/TRTorch/py/../bazel-TRTorch/external/tensorrt/include -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.6/dist-packages/torch/include/TH -I/usr/local/lib/python3.6/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c trtorch/csrc/trtorch_py.cpp -o build/temp.linux-aarch64-3.6/trtorch/csrc/trtorch_py.o -Wno-deprecated -Wno-deprecated-declarations -D_GLIBCXX_USE_CXX11_ABI=1 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
aarch64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-aarch64-3.6/trtorch/csrc/trtorch_py.o -L/home/nvidia/ssd256/github/TRTorch/py/trtorch/lib/ -L/usr/local/lib/python3.6/dist-packages/torch/lib -L/usr/local/cuda/lib64 -ltrtorch -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -Wno-deprecated -Wno-deprecated-declarations -Wl,--no-as-needed -ltrtorch -Wl,-rpath,$ORIGIN/lib -D_GLIBCXX_USE_CXX11_ABI=1
running install_lib
creating /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/logging.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/__init__.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_compiler.py -> /usr/local/lib/python3.6/dist-packages/trtorch
creating /usr/local/lib/python3.6/dist-packages/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/lib/libtrtorch.so -> /usr/local/lib/python3.6/dist-packages/trtorch/lib
copying build/lib.linux-aarch64-3.6/trtorch/_C.cpython-36m-aarch64-linux-gnu.so -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_version.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_extra_info.py -> /usr/local/lib/python3.6/dist-packages/trtorch
copying build/lib.linux-aarch64-3.6/trtorch/_types.py -> /usr/local/lib/python3.6/dist-packages/trtorch
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/logging.py to logging.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/__init__.py to __init__.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_compiler.py to _compiler.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_version.py to _version.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_extra_info.py to _extra_info.cpython-36.pyc
byte-compiling /usr/local/lib/python3.6/dist-packages/trtorch/_types.py to _types.cpython-36.pyc
running install_egg_info
Copying trtorch.egg-info to /usr/local/lib/python3.6/dist-packages/trtorch-0.0.2-py3.6.egg-info
running install_scripts
$ python3                                                                                                                    
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import trtorch
>>> trtorch.__version__
'0.0.2'
>>> exit()

Finally, it seems to work. (I am not sure why installing from the wheel does not work.)
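
One visible difference between the two runs is that the wheel was built with `bdist_wheel` without `--use-cxx11-abi`, and its compile line carries both `-D_GLIBCXX_USE_CXX11_ABI=0` and `-D_GLIBCXX_USE_CXX11_ABI=1`, while the `install --use-cxx11-abi` run uses `=1` throughout. Whether that fully explains the wheel failure is an assumption, but the flags are easy to compare. A small sketch that extracts them from a logged command line (`cxx11_abi_defines` is a hypothetical helper; the sample strings are shortened excerpts of the commands logged above):

```python
import re


def cxx11_abi_defines(command_line):
    """Return every _GLIBCXX_USE_CXX11_ABI value defined on a compiler command line, in order."""
    return [int(value) for value in
            re.findall(r"-D_GLIBCXX_USE_CXX11_ABI=(\d)", command_line)]


if __name__ == "__main__":
    # Shortened excerpts of the compile commands from the two runs.
    wheel_compile = ("aarch64-linux-gnu-gcc -c trtorch/csrc/trtorch_py.cpp "
                     "-D_GLIBCXX_USE_CXX11_ABI=0 -DTORCH_EXTENSION_NAME=_C "
                     "-D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14")
    install_compile = ("aarch64-linux-gnu-gcc -c trtorch/csrc/trtorch_py.cpp "
                       "-D_GLIBCXX_USE_CXX11_ABI=1 -DTORCH_EXTENSION_NAME=_C "
                       "-D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14")
    for label, cmd in (("wheel", wheel_compile), ("install", install_compile)):
        values = cxx11_abi_defines(cmd)
        status = "conflicting" if len(set(values)) > 1 else "consistent"
        print(label, values, status)
```

Conflicting values on a single line match the `"_GLIBCXX_USE_CXX11_ABI" redefined` warning seen in the wheel build.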

Thank you for your help!
I will keep testing TRTorch.

BR,
Chieh

@chiehpower
Author

This issue can be reopened at any time if there is further discussion.
