feat: engine caching #2995

Merged
merged 14 commits into from
Aug 29, 2024

zewenli98
Collaborator

Description

Adds the engine caching feature. For more details, see #2957.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

@zewenli98 zewenli98 requested review from narendasan and peri044 July 10, 2024 21:50
@github-actions github-actions bot added the component: conversion, component: api [Python], component: dynamo, and component: torch_compile labels Jul 10, 2024

@github-actions github-actions bot left a comment

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py	2024-07-16 20:09:24.911867+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py	2024-07-16 20:11:22.623270+00:00
@@ -100,11 +100,13 @@

            gm = post_lowering(gm, sample_inputs)

            logger.debug("Lowered Input graph:\n " + str(gm.graph))

-            torchtrt_inputs = prepare_inputs(torch_inputs, disable_memory_format_check=True)
+            torchtrt_inputs = prepare_inputs(
+                torch_inputs, disable_memory_format_check=True
+            )
            trt_compiled = compile_module(
                gm,
                torchtrt_inputs,
                settings=settings,
            )

@zewenli98 zewenli98 changed the title from "[WIP] feat: engine caching" to "feat: engine caching" Jul 23, 2024
@zewenli98 zewenli98 force-pushed the engine_cache branch 2 times, most recently from cb8d30b to 59ba4a2 Compare August 14, 2024 09:46
Collaborator

@narendasan narendasan left a comment

Do you have tests for this?

py/torch_tensorrt/dynamo/_settings.py (outdated)
py/torch_tensorrt/dynamo/_compiler.py (outdated)
py/torch_tensorrt/dynamo/_compiler.py (outdated)
py/torch_tensorrt/dynamo/_compiler.py (outdated)
py/torch_tensorrt/dynamo/_compiler.py (outdated)
py/torch_tensorrt/dynamo/_engine_caching.py (outdated)
py/torch_tensorrt/dynamo/_engine_caching.py (outdated)
py/torch_tensorrt/dynamo/_engine_caching.py (outdated)
py/torch_tensorrt/dynamo/_engine_caching.py
py/torch_tensorrt/dynamo/_engine_caching.py (outdated)
Collaborator

@peri044 peri044 left a comment

Added some comments. Functionality looks good. One question: does the hash change if the module has the same graph but different weights?

py/torch_tensorrt/dynamo/_compiler.py
py/torch_tensorrt/dynamo/_engine_caching.py (outdated)
py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py (outdated)
py/torch_tensorrt/dynamo/_compiler.py (outdated)
@zewenli98
Collaborator Author

> Added some comments. Functionality looks good. One question: does the hash change if the module has the same graph but different weights?

Nope, all weights are set to 0 before hashing, so if the architectures are the same, they will be considered isomorphic.
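
To illustrate the idea (this is only a sketch, not the hashing code in this PR), the snippet below hashes a toy model description after replacing every weight value with 0. The `Layer` dataclass and the `zero_weights` and `architecture_hash` helpers are hypothetical stand-ins for a lowered graph module and its hashing routine; only `hashlib` and `dataclasses` from the standard library are real APIs here.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Layer:
    """Hypothetical stand-in for one node of a lowered graph."""
    op: str
    weight_shape: Tuple[int, ...]
    weights: Tuple[float, ...]


def zero_weights(layers: List[Layer]) -> List[Layer]:
    # Keep the op and the weight shape, but overwrite every value with 0.0.
    return [replace(l, weights=tuple(0.0 for _ in l.weights)) for l in layers]


def architecture_hash(layers: List[Layer]) -> str:
    # Hash a canonical text form of the zeroed model, so only the
    # structure (ops and shapes) contributes to the digest.
    canonical = repr(
        [(l.op, l.weight_shape, l.weights) for l in zero_weights(layers)]
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same architecture, different weights.
model_a = [Layer("linear", (2, 2), (1.0, 2.0, 3.0, 4.0)), Layer("relu", (), ())]
model_b = [Layer("linear", (2, 2), (9.0, 8.0, 7.0, 6.0)), Layer("relu", (), ())]
# Different architecture (the weight shape differs).
model_c = [Layer("linear", (2, 3), (1.0,) * 6), Layer("relu", (), ())]

same = architecture_hash(model_a) == architecture_hash(model_b)
different = architecture_hash(model_a) != architecture_hash(model_c)
```

In this sketch `model_a` and `model_b` share a hash, which is why a cached engine would need its real weights restored before use.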

@narendasan
Collaborator

> > Added some comments. Functionality looks good. One question: does the hash change if the module has the same graph but different weights?
>
> Nope, all weights are set to 0 before hashing, so if the architectures are the same, they will be considered isomorphic.

I don't really get where in the code we now refit to restore functionality if we store weight-stripped (or zeroed) graphs. Also, if we set all weights to 0, what happens in data-dependent cases? Can we detect these cases?

@zewenli98
Collaborator Author

> I don't really get where in the code we now refit to restore functionality if we store weight-stripped (or zeroed) graphs.

The current code calls refit after interpreter.run(), but I plan to move it into interpreter.run().
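
As a rough illustration of that ordering (a sketch under assumptions, not the code in this PR), the snippet below looks up a weight-stripped engine by its hash, builds it only on a miss, and then refits the real weights. `EngineCache`, `refit`, and `compile_with_cache` are hypothetical names, and the "engine" is a plain byte string standing in for a serialized TensorRT engine.

```python
from typing import Callable, Dict, List, Tuple


class EngineCache:
    """Hypothetical in-memory cache: architecture hash -> weight-stripped engine."""

    def __init__(self) -> None:
        self._engines: Dict[str, bytes] = {}
        self.builds = 0  # counts how many times an engine was actually built

    def get_or_build(self, key: str, build: Callable[[], bytes]) -> bytes:
        # Build only on a cache miss; otherwise reuse the stored engine.
        if key not in self._engines:
            self._engines[key] = build()
            self.builds += 1
        return self._engines[key]


def refit(engine: bytes, weights: List[float]) -> Tuple[bytes, Tuple[float, ...]]:
    # Placeholder for a real refit step: pair the engine with the weights.
    return (engine, tuple(weights))


def compile_with_cache(
    cache: EngineCache, key: str, weights: List[float]
) -> Tuple[bytes, Tuple[float, ...]]:
    # 1) Fetch (or build) the weight-stripped engine, 2) refit the real weights.
    engine = cache.get_or_build(key, lambda: b"weight-stripped-engine")
    return refit(engine, weights)


cache = EngineCache()
first = compile_with_cache(cache, "arch-hash", [1.0, 2.0])
second = compile_with_cache(cache, "arch-hash", [3.0, 4.0])
```

Here both calls share one build, while each result carries its own weights; whether the refit step lives inside or after the interpreter run does not change this flow.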

> Also, if we set all weights to 0, what happens in data-dependent cases? Can we detect these cases?

Engine caching currently happens after partitioning.
If an input is data-dependent, does it generate different TRT engines for different actual inputs?

@github-actions github-actions bot added the component: tests label Aug 28, 2024
Collaborator

@narendasan narendasan left a comment

LGTM!

@zewenli98 zewenli98 merged commit 95cc532 into main Aug 29, 2024
67 checks passed
Labels
cla signed, component: api [Python], component: conversion, component: dynamo, component: tests, component: torch_compile
4 participants