feat: data parallel inference examples #2805

Merged
merged 1 commit into from
May 17, 2024

Conversation

bowang007
Collaborator

Description

This PR adds a simple example of using the Accelerate library for data-parallel inference.

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR in so that relevant reviewers are notified

@github-actions github-actions bot left a comment


There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/examples/distributed_inference/data_parallel_gpt2.py	2024-05-02 00:29:27.054073+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/distributed_inference/data_parallel_gpt2.py	2024-05-02 00:31:18.785078+00:00
@@ -13,12 +13,26 @@

distributed_state = PartialState()

model = GPT2LMHeadModel.from_pretrained("gpt2").eval().to(distributed_state.device)

-model.forward = torch.compile(model.forward, backend="torch_tensorrt", options={"truncate_long_and_double": True, "enabled_precisions": {torch.float16}, "debug": True}, dynamic=False,)
+model.forward = torch.compile(
+    model.forward,
+    backend="torch_tensorrt",
+    options={
+        "truncate_long_and_double": True,
+        "enabled_precisions": {torch.float16},
+        "debug": True,
+    },
+    dynamic=False,
+)

with distributed_state.split_between_processes([input_id1, input_id2]) as prompt:
    cur_input = torch.clone(prompt[0]).to(distributed_state.device)

-    gen_tokens = model.generate(cur_input, do_sample=True, temperature=0.9, max_length=100,)
+    gen_tokens = model.generate(
+        cur_input,
+        do_sample=True,
+        temperature=0.9,
+        max_length=100,
+    )
    gen_text = tokenizer.batch_decode(gen_tokens)[0]
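
For readers unfamiliar with `split_between_processes`, the data-parallel idea in the script above is that each process receives its own contiguous share of the prompt list and runs generation on that share only. The following is a minimal plain-Python sketch of that partitioning, not Accelerate's implementation; the helper name `split_for_rank` and the string placeholders standing in for `input_id1` and `input_id2` are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_for_rank(items: Sequence[T], num_processes: int, rank: int) -> List[T]:
    """Return the contiguous slice of `items` that process `rank` would handle.

    Hypothetical helper for illustration only: items are divided as evenly
    as possible, and the first `len(items) % num_processes` ranks each take
    one extra item.
    """
    if not 0 <= rank < num_processes:
        raise ValueError("rank must be in [0, num_processes)")
    base, extra = divmod(len(items), num_processes)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return list(items[start:end])


# Two prompts on two processes: each rank gets exactly one prompt,
# which is why the script above reads prompt[0] inside the context manager.
prompts = ["input_id1", "input_id2"]  # placeholders for the tokenized inputs
shards = [split_for_rank(prompts, num_processes=2, rank=r) for r in range(2)]
print(shards)  # [['input_id1'], ['input_id2']]
```

With more prompts than processes, a rank's share holds several items, so a script that only reads `prompt[0]` would silently skip the rest.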

@bowang007 bowang007 changed the title from "feat: data parallel inference sample" to "feat: data parallel inference examples" May 2, 2024
Collaborator

@narendasan narendasan left a comment


  1. This needs a requirements.txt.
  2. Annotate the script with a description of what's happening, as in https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/torch_compile_advanced_usage.py
  3. Add a reference to index.rst so that it gets rendered in the docs:
    tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion

@github-actions github-actions bot added the documentation label May 7, 2024
Collaborator

@narendasan narendasan left a comment


LGTM

@bowang007 bowang007 force-pushed the multi_gpu_support branch from 4bc05b7 to dfbf6ea Compare May 16, 2024 23:07
@bowang007 bowang007 force-pushed the multi_gpu_support branch from dfbf6ea to 7b4b504 Compare May 16, 2024 23:08
@bowang007 bowang007 merged commit db24b3b into main May 17, 2024
35 of 36 checks passed
Contributor

HolyWu commented May 17, 2024

@bowang007 The merge conflicts were not fully cleaned up, so db24b3b still contains the conflict markers <<<<<<< HEAD, ======= and >>>>>>> dfbf6ea84 (feat: data parallel inference sample) in docsrc/index.rst.
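
A small check along the following lines could catch leftover conflict markers before a merge. This is a minimal sketch, not part of this repository's tooling; the function name `find_conflict_markers` and the sample text are hypothetical. Because a line of equals signs is also a valid reStructuredText heading underline, the sketch only reports `=======` when it appears between an opening `<<<<<<<` and a closing `>>>>>>>`.

```python
from typing import List, Tuple


def find_conflict_markers(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for leftover Git conflict markers.

    Hypothetical helper for illustration only. A "=======" line is reported
    only inside an open conflict, so RST heading underlines are not flagged.
    """
    hits: List[Tuple[int, str]] = []
    in_conflict = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.startswith("<<<<<<<"):
            in_conflict = True
            hits.append((lineno, line))
        elif line.startswith(">>>>>>>"):
            in_conflict = False
            hits.append((lineno, line))
        elif in_conflict and line.rstrip() == "=======":
            hits.append((lineno, line))
    return hits


# Made-up sample resembling an index.rst with an unresolved conflict.
sample = (
    "Tutorials\n"
    "=========\n"
    "<<<<<<< HEAD\n"
    "example_a\n"
    "=======\n"
    "example_b\n"
    ">>>>>>> dfbf6ea84 (feat: data parallel inference sample)\n"
)
for lineno, line in find_conflict_markers(sample):
    print(f"{lineno}: {line}")
```

To check a real file, the same function could be applied to the contents of docsrc/index.rst, failing the check whenever the returned list is non-empty.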

bowang007 added a commit that referenced this pull request May 17, 2024
peri044 pushed a commit that referenced this pull request May 21, 2024
laikhtewari pushed a commit that referenced this pull request May 24, 2024
Labels
cla signed documentation Improvements or additions to documentation
4 participants