-
Notifications
You must be signed in to change notification settings - Fork 357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding the new device API, fixing the a nested dict issue in the existing compile phase, adding new lowering pass for bn #288
Conversation
to_backend Also fixes nested dictionary bug reported in #286 Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
batchnorm Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
BREAKING CHANGE: Version of bazel has been bumped to 4.0.0 Version of TensorRT has been bumped to 7.2.2.3 Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com> docs: Update docs for to_backend API for new device API and new PyTorch API Changes the docs to show the new device dictionary API and how to use the new to backend api (changed from PyTorch 1.6.0) Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to C++ style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to Python style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be a combination of 3 PRs
a) TensorRT and bazel version bumps
b) Batch normalization dimension check changes
c) Device struct changes
I would have recommend splitting the Batch normalization changes from this PR.
py/trtorch/csrc/tensorrt_classes.h
Outdated
@@ -59,7 +69,7 @@ struct Device : torch::CustomClassHolder { | |||
allow_gpu_fallback(false) // allow_gpu_fallback | |||
{} | |||
|
|||
ADD_FIELD_GET_SET(device_type, DeviceType); | |||
ADD_ENUM_GET_SET(device_type, DeviceType, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "1" here?
Can we use enumeration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This macro creates a getter and setter pair that returns integers so that the function is torchbind compatible. the 1 is the max allowable value so that you dont get invalid ones.
py/trtorch/csrc/tensorrt_classes.h
Outdated
ADD_FIELD_GET_SET(refit, bool); | ||
ADD_FIELD_GET_SET(debug, bool); | ||
ADD_FIELD_GET_SET(strict_types, bool); | ||
ADD_ENUM_GET_SET(capability, EngineCapability, 3); | ||
ADD_ENUM_GET_SET(capability, EngineCapability, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using enumerations instead of hardcoded numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do like a static cast to int of the enums, sounds like a good idea
The batchnorm and device checks are both in service of fixing to_backend so I think they are related. Like you couldnt use the to_backend api without both of these changes together so I think its alright that they are in the same PR. The version bump is minor, so i am not sure if its a big enough deal to review separately. |
@@ -19,8 +19,11 @@ def setUp(self): | |||
"refit": False, | |||
"debug": False, | |||
"strict_types": False, | |||
"allow_gpu_fallback": True, | |||
"device_type": "gpu", | |||
"device": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andi4191 This struct looks correct to you right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are missing dla_core.
Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to C++ style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to Python style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to C++ style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to Python style guidelines
Signed-off-by: Naren Dasan <naren@narendasan.com> Signed-off-by: Naren Dasan <narens@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to Python style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to C++ style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to Python style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code conforms to C++ style guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
This PR adds a new device sub struct to the
CompileSpec
structure reflecting the new DLA changes.It also addresses a nested dictionary issue in the compile phase of the backend api.
Finally adds a lowering pass that addresses new dimension checks for the batchnorm operator.
This PR also bumps bazel and TensorRT versions in preparation for the next release.
Fixes #286
Type of change
Please delete options that are not relevant and/or add your own.
Checklist: