feat: engine caching #2995
There are some changes that do not conform to Python style guidelines:
--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py 2024-07-16 20:09:24.911867+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py 2024-07-16 20:11:22.623270+00:00
@@ -100,11 +100,13 @@
gm = post_lowering(gm, sample_inputs)
logger.debug("Lowered Input graph:\n " + str(gm.graph))
- torchtrt_inputs = prepare_inputs(torch_inputs, disable_memory_format_check=True)
+ torchtrt_inputs = prepare_inputs(
+ torch_inputs, disable_memory_format_check=True
+ )
trt_compiled = compile_module(
gm,
torchtrt_inputs,
settings=settings,
)
cb8d30b to 59ba4a2
Do you have tests for this?
Added some comments. Functionality looks good. One question: does the hash change if the module has the same graph but different weights?
Nope, all weights are set to 0 before hashing, so if the architectures are the same they will be considered isomorphic.
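The zero-before-hashing idea described in the reply above can be illustrated with a small standard-library sketch. This is not the code from this PR: the dictionary-based model representation and the names `zero_weights` and `architecture_hash` are hypothetical, chosen only to show why two modules with the same graph but different weights would end up with the same cache key.

```python
import hashlib
import json


def zero_weights(weights):
    """Return a copy of `weights` with every number replaced by 0.0.

    Names and nested list shapes are preserved, so the shape of each
    parameter still contributes to the hash.
    """
    def zero(value):
        if isinstance(value, list):
            return [zero(item) for item in value]
        return 0.0

    return {name: zero(value) for name, value in weights.items()}


def architecture_hash(model):
    """Hash the graph plus zeroed weights (hypothetical helper).

    `model` is a toy stand-in: {"graph": [...], "weights": {...}}.
    """
    stripped = {
        "graph": model["graph"],
        "weights": zero_weights(model["weights"]),
    }
    payload = json.dumps(stripped, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Same graph, different weight values.
model_a = {
    "graph": ["linear(2, 2)", "relu"],
    "weights": {"linear.weight": [[0.1, 0.2], [0.3, 0.4]]},
}
model_b = {
    "graph": ["linear(2, 2)", "relu"],
    "weights": {"linear.weight": [[9.0, 8.0], [7.0, 6.0]]},
}
# Different graph (extra node), same weights as model_a.
model_c = {
    "graph": ["linear(2, 2)", "relu", "sigmoid"],
    "weights": {"linear.weight": [[0.1, 0.2], [0.3, 0.4]]},
}

same_key = architecture_hash(model_a) == architecture_hash(model_b)
different_key = architecture_hash(model_a) != architecture_hash(model_c)
```

Under these assumptions `same_key` and `different_key` are both true. Because the cached artifact is keyed on structure only, the real weights have to be put back after a cache hit, which is why refit comes up in the discussion.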
748b4c6 to a86260e
I don't really get where we refit for functionality in the code now if we store weight-stripped (or zeroed) graphs. Also, if we set all weights to 0, what happens in data-dependent cases? Can we detect these cases?
The current code calls refit after
The current engine caching happens after partition.
2562e2c to 6533d5c
6533d5c to 53c5fb4
53c5fb4 to e91e766
revert backend changes
update dynamo path
add save_engine_cache and load_engine_cache args
support customizing engine cache class
refactor and add LRU to clear cache
fix bug
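To make the "LRU to clear cache" and "customizing engine cache class" items above more concrete, here is a minimal in-memory sketch. It is an illustration only, not the class added in this PR: the name `LRUEngineCache`, its `save`/`load` methods, and the byte budget are all assumptions.

```python
from collections import OrderedDict
from typing import Optional


class LRUEngineCache:
    """Hypothetical size-bounded cache of serialized engines, keyed by hash."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        # Oldest (least recently used) entries sit at the front.
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def save(self, key: str, blob: bytes) -> None:
        """Store `blob`, evicting least recently used entries to make room."""
        if len(blob) > self.max_bytes:
            return  # Larger than the whole budget: skip caching it.
        if key in self._entries:
            # Replacing an entry: release its old size first.
            self.used_bytes -= len(self._entries.pop(key))
        while self.used_bytes + len(blob) > self.max_bytes:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        self._entries[key] = blob
        self.used_bytes += len(blob)

    def load(self, key: str) -> Optional[bytes]:
        """Return the cached blob, or None on a miss; a hit refreshes recency."""
        blob = self._entries.get(key)
        if blob is not None:
            self._entries.move_to_end(key)
        return blob


cache = LRUEngineCache(max_bytes=10)
cache.save("a", b"12345")
cache.save("b", b"12345")
cache.load("a")           # "a" becomes the most recently used entry
cache.save("c", b"123")   # over budget, so "b" (least recently used) is evicted
```

A user-supplied cache class in this style could swap the in-memory dictionary for files on disk while keeping the same `save`/`load` shape. That is one plausible reading of what a customizable engine cache class would allow.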
ccddbc6 to fc525e6
LGTM!
Description
Engine caching feature. For more details, see #2957.
Type of change
Checklist: