Add inference kv cache support for transformer TE path #6627

Merged: 10 commits, Jun 6, 2023

Commits on Jun 2, 2023

  1. Add kv cache support for transformer TE path

    Signed-off-by: Yen-Shi Wang <yenshiw@nvidia.com>
    Yen-Shi Wang committed Jun 2, 2023
    5cb0bb5
  2. [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci
    pre-commit-ci[bot] authored and Yen-Shi Wang committed Jun 2, 2023
    6b817eb
  3. Mark get_data_parallel_group as WAR

    Signed-off-by: Yen-Shi Wang <yenshiw@nvidia.com>
    Yen-Shi Wang committed Jun 2, 2023
    a770914
  4. [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci
    pre-commit-ci[bot] authored and Yen-Shi Wang committed Jun 2, 2023
    512e6ee
  5. Initialize process group for FP8 training

    Signed-off-by: Tim Moon <tmoon@nvidia.com>
    timmoon10 authored and Yen-Shi Wang committed Jun 2, 2023
    cb4ee8d
  6. Update Megatron GPT eval script for non-FP8 path

    Signed-off-by: Yen-Shi Wang <yenshiw@nvidia.com>
    Yen-Shi Wang committed Jun 2, 2023
    c047d8c

Commits on Jun 3, 2023

  1. Merge branch 'main' into dev-yenshiw-te-fp8-inference

    Signed-off-by: Yen-Shi Wang <6960565+yen-shi@users.noreply.github.com>
    yen-shi authored Jun 3, 2023
    2134a9a
  2. 9e5518e

Commits on Jun 4, 2023

  1. cc26e5b

Commits on Jun 6, 2023

  1. e2a52be