Ds-inference Int8 support through ZeroQuant technology #2217

Merged: 25 commits on Aug 30, 2022

Commits on Aug 8, 2022

  1. Fix the layer-past for GPT based models

    Reza Yazdani committed Aug 8, 2022
    cf2fe01

Commits on Aug 13, 2022

  1. c2cf304

Commits on Aug 15, 2022

  1. fixing some issue with loading checkpoint and bias-add

    Reza Yazdani committed Aug 15, 2022
    d98f1f9
  2. ebc82bb
  3. 43a7023
  4. Empty-Commit

    Reza Yazdani committed Aug 15, 2022
    00aa188
  5. 9bed645

Commits on Aug 18, 2022

  1. fix several issues after merging with master

    Reza Yazdani committed Aug 18, 2022
    84e0d03

Commits on Aug 19, 2022

  1. several fixes for generating the INT8 sharded checkpoint

    Reza Yazdani committed Aug 19, 2022
    f6cb028
  2. d47bea6

Commits on Aug 20, 2022

  1. move quantizer declaration before inference branch

    Reza Yazdani committed Aug 20, 2022
    cb72d9c

Commits on Aug 24, 2022

  1. 32b9322
  2. 57779ef
  3. Merge branch 'ds-inference/ZeroQuant-Int8' of github.com:microsoft/DeepSpeed into ds-inference/ZeroQuant-Int8

    Reza Yazdani committed Aug 24, 2022
    f4e48e6

Commits on Aug 25, 2022

  1. reducing the CPU memory usage when loading checkpoint (this solves the issue when there is not enough CPU memory to load large models)

    Reza Yazdani committed Aug 25, 2022
    dbcb6ec
  2. some minor modification to the ckpt names

    Reza Yazdani committed Aug 25, 2022
    cd80ecc

Commits on Aug 26, 2022

  1. remove masking and some configuration changes

    Reza Yazdani committed Aug 26, 2022
    82a37d6
  2. remove dead code

    Reza Yazdani committed Aug 26, 2022
    9d12656
  3. 4ae356e

Commits on Aug 28, 2022

  1. d7ff364
  2. fix some issue with int8 ckpt-loading

    Reza Yazdani committed Aug 28, 2022
    b17a3b5

Commits on Aug 29, 2022

  1. a541e52

Commits on Aug 30, 2022

  1. 2845bad
  2. c77f5e0
  3. change the mp_size to tp_size at inference config & add some doc-string at init_inference

    Reza Yazdani committed Aug 30, 2022
    f3f4b1d