Adding the new device API, fixing a nested dict issue in the existing compile phase, adding new lowering pass for bn #288

Merged
merged 8 commits into from
Jan 25, 2021

Conversation

narendasan
Collaborator

Description

This PR adds a new device sub-struct to the CompileSpec structure, reflecting the new DLA changes.
It also fixes a nested dictionary issue in the compile phase of the backend API.
Finally, it adds a lowering pass that addresses new dimension checks for the batchnorm operator.
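
As a rough illustration of the device sub-struct described above, the sketch below groups device-related keys from a flat compile spec under a nested `device` dictionary. The helper name `nest_device_fields`, the list of keys treated as device settings, and the string value `"gpu"` are assumptions made for this example. They are not taken from the TRTorch source.

```python
# Hypothetical helper, not part of TRTorch. The key names are assumptions
# based on the fields discussed in this PR.
DEVICE_KEYS = ("device_type", "gpu_id", "dla_core", "allow_gpu_fallback")


def nest_device_fields(flat_spec):
    """Return a copy of flat_spec with device keys grouped under "device"."""
    nested = {k: v for k, v in flat_spec.items() if k not in DEVICE_KEYS}
    device = {k: flat_spec[k] for k in DEVICE_KEYS if k in flat_spec}
    if device:
        nested["device"] = device
    return nested


# Assumed flat layout (device settings at the top level of the spec).
old_style = {
    "refit": False,
    "debug": False,
    "strict_types": False,
    "allow_gpu_fallback": True,
    "device_type": "gpu",
}
new_style = nest_device_fields(old_style)
```

Here `new_style` keeps the non-device options at the top level and carries the device settings in a single `device` sub-dictionary.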

This PR also bumps bazel and TensorRT versions in preparation for the next release.

Fixes #286

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

to_backend

Also fixes nested dictionary bug reported in #286

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
batchnorm

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
BREAKING CHANGE: Version of bazel has been bumped to 4.0.0
Version of TensorRT has been bumped to 7.2.2.3

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>

docs: Update docs for to_backend API for new device API and new PyTorch
API

Changes the docs to show the new device dictionary API and how to use
the new to backend api (changed from PyTorch 1.6.0)

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
@github-actions github-actions bot added the labels component: api [Python], component: build system, component: core, component: lowering, component: tests, and documentation on Jan 22, 2021
@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

@github-actions github-actions bot left a comment

Code conforms to Python style guidelines

Contributor

@andi4191 andi4191 left a comment

It seems to be a combination of 3 PRs
a) TensorRT and bazel version bumps
b) Batch normalization dimension check changes
c) Device struct changes

I would have recommended splitting the batch normalization changes out of this PR.

py/trtorch/csrc/tensorrt_classes.h
@@ -59,7 +69,7 @@ struct Device : torch::CustomClassHolder {
allow_gpu_fallback(false) // allow_gpu_fallback
{}

- ADD_FIELD_GET_SET(device_type, DeviceType);
+ ADD_ENUM_GET_SET(device_type, DeviceType, 1);
Contributor

What is "1" here?
Can we use an enumeration instead?

Collaborator Author

This macro creates a getter and setter pair that returns integers so that the function is torchbind compatible. The 1 is the max allowable value, so that you don't get invalid ones.
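
The behaviour described in this reply can be sketched as follows. This is a Python analogue written only for illustration. The real macro is C++, and the function name `add_enum_get_set`, the `Device` class, and the error type below are all assumptions rather than the project's code.

```python
# Python analogue of the described behaviour; every name here is illustrative.
def add_enum_get_set(cls, field, max_val):
    """Attach integer-based get_<field>/set_<field> methods to cls."""

    def getter(self):
        # Always hand back a plain integer.
        return int(getattr(self, field))

    def setter(self, val):
        # Reject anything outside 0..max_val so invalid values never land.
        if not 0 <= val <= max_val:
            raise ValueError(f"invalid value {val} for {field}")
        setattr(self, field, val)

    setattr(cls, f"get_{field}", getter)
    setattr(cls, f"set_{field}", setter)


class Device:
    def __init__(self):
        self.device_type = 0


# The literal 1 plays the role of the max allowable value.
add_enum_get_set(Device, "device_type", 1)

dev = Device()
dev.set_device_type(1)
```

Calling `dev.set_device_type(2)` in this sketch raises `ValueError` and leaves the stored value unchanged.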

ADD_FIELD_GET_SET(refit, bool);
ADD_FIELD_GET_SET(debug, bool);
ADD_FIELD_GET_SET(strict_types, bool);
- ADD_ENUM_GET_SET(capability, EngineCapability, 3);
+ ADD_ENUM_GET_SET(capability, EngineCapability, 2);
Contributor

How about using enumerations instead of hardcoded numbers?

Collaborator Author

We could do something like a static cast of the enums to int. Sounds like a good idea.
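
One way to picture this suggestion is to derive the bound from the enum instead of writing a literal. The sketch below is a Python analogue in which `int(...)` stands in for the C++ static cast. The `EngineCapability` member names and values here are assumptions for illustration, and `set_capability` is a hypothetical function.

```python
from enum import IntEnum


# Hypothetical enum mirroring the capability field; members are assumptions.
class EngineCapability(IntEnum):
    DEFAULT = 0
    SAFE_GPU = 1
    SAFE_DLA = 2


# Derive the upper bound from the enum rather than hardcoding a number.
MAX_CAPABILITY = int(max(EngineCapability))


def set_capability(val):
    """Validate an integer against the enum-derived bound and convert it."""
    if not 0 <= val <= MAX_CAPABILITY:
        raise ValueError(f"invalid capability {val}")
    return EngineCapability(val)
```

With this shape, adding or removing an enum member updates the bound automatically, so the check cannot drift out of sync with the enum.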

@narendasan
Collaborator Author

The batchnorm and device checks are both in service of fixing to_backend, so I think they are related. You couldn't use the to_backend API without both of these changes together, so I think it's alright that they are in the same PR. The version bump is minor, so I am not sure it's a big enough deal to review separately.

@@ -19,8 +19,11 @@ def setUp(self):
"refit": False,
"debug": False,
"strict_types": False,
"allow_gpu_fallback": True,
"device_type": "gpu",
"device": {
Collaborator Author

@andi4191 This struct looks correct to you, right?

Contributor

@andi4191 andi4191 Jan 22, 2021

I think you are missing dla_core.
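
A small check along these lines could catch an omission like this one before the spec reaches the backend. The function `missing_device_keys` and the expected key set are assumptions for this sketch. Only `dla_core`, `device_type`, and `allow_gpu_fallback` appear in this conversation, and `gpu_id` is included purely as an illustrative guess.

```python
# Assumed set of device keys; see the note above about which are guesses.
EXPECTED_DEVICE_KEYS = {"device_type", "gpu_id", "dla_core", "allow_gpu_fallback"}


def missing_device_keys(spec):
    """Return the expected device keys absent from spec["device"], sorted."""
    device = spec.get("device", {})
    return sorted(EXPECTED_DEVICE_KEYS - set(device))


spec = {
    "refit": False,
    "debug": False,
    "strict_types": False,
    "device": {
        "device_type": "gpu",
        "gpu_id": 0,
        "allow_gpu_fallback": True,
    },
}

missing = missing_device_keys(spec)
```

For the spec above, `missing` names the one absent key, and adding `"dla_core": 0` to the `device` dictionary makes the result empty.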

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

@github-actions github-actions bot left a comment

Code conforms to Python style guidelines

@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

@github-actions github-actions bot left a comment

Code conforms to Python style guidelines

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
@github-actions github-actions bot left a comment

Code conforms to Python style guidelines

@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

@narendasan narendasan requested a review from andi4191 January 23, 2021 01:12
@github-actions github-actions bot left a comment

Code conforms to Python style guidelines

@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

Contributor

@andi4191 andi4191 left a comment

LGTM

@narendasan narendasan merged commit 20022d4 into master Jan 25, 2021
@narendasan narendasan deleted the to_backend_device branch January 25, 2021 19:24
Labels
  • component: api [Python] (Issues re: Python API)
  • component: build system (Issues re: Build system)
  • component: core (Issues re: The core compiler)
  • component: lowering (Issues re: The lowering / preprocessing passes)
  • component: tests (Issues re: Tests)
  • documentation (Improvements or additions to documentation)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

🐛 [Bug] An wrong function call for torch._C._jit_to_tensorrt on spec in tutorial code.
2 participants