Fp16 nchw for cudnn-fp16 backend (support GTX 16xx GPUs) #849

Merged 20 commits on May 13, 2019

Commits on Nov 11, 2018

  1. Merge pull request #3 from LeelaChessZero/master

    use bestmove_is_sent_ for Search::IsSearchActive() (LeelaChessZero#502)
    ankan-ban committed Nov 11, 2018
    e3ad2c0

Commits on Nov 21, 2018

  1. Merge pull request #4 from LeelaChessZero/master

    get latest
    ankan-ban committed Nov 21, 2018
    b2e5114

Commits on Dec 16, 2018

  1. Merge pull request #7 from LeelaChessZero/master

    get latest
    ankan-ban committed Dec 16, 2018
    beed96e

Commits on Dec 21, 2018

  1. Merge pull request #8 from LeelaChessZero/master

    get latest
    ankan-ban committed Dec 21, 2018
    80ac4a1

Commits on Jan 15, 2019

  1. Merge pull request #10 from LeelaChessZero/master

    get latest
    ankan-ban committed Jan 15, 2019
    0f7bc50

Commits on Feb 13, 2019

  1. Merge pull request #11 from LeelaChessZero/master

    Get latest
    ankan-ban committed Feb 13, 2019
    e4737e3
  2. misc changes to cudnn backend

    - replace all cudaMemcpyAsync calls used for loading weights with cudaMemcpy, as the source buffer (in CPU memory) could be deleted before the async version of the function actually does the copy.
    - minor naming/style changes.
    - add a comment explaining what the policy map layer does and how the layout conversion from CHW to HWC works.
    ankan-ban committed Feb 13, 2019
    49eb8e8
  3. fix typo in comment

    ankan-ban committed Feb 13, 2019
    acfd7c1
  4. clang-format

    ankan-ban committed Feb 13, 2019
    33f3d57
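
The CHW to HWC layout conversion mentioned in the "misc changes to cudnn backend" commit above can be sketched as follows. This is a minimal CPU-side illustration, not the code from this PR: `ChwToHwc` and its parameters are hypothetical names, and the real backend would do this on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): reorders a single tensor from
// CHW (channel is the slowest-varying dimension) to HWC (channel is the
// fastest-varying dimension).
std::vector<float> ChwToHwc(const std::vector<float>& chw, std::size_t C,
                            std::size_t H, std::size_t W) {
  assert(chw.size() == C * H * W);
  std::vector<float> hwc(chw.size());
  for (std::size_t c = 0; c < C; ++c) {
    for (std::size_t h = 0; h < H; ++h) {
      for (std::size_t w = 0; w < W; ++w) {
        // Source index in CHW order.
        const std::size_t src = (c * H + h) * W + w;
        // Destination index in HWC order.
        const std::size_t dst = (h * W + w) * C + c;
        hwc[dst] = chw[src];
      }
    }
  }
  return hwc;
}
```

With a batch dimension in front, the same index arithmetic turns NCHW into NHWC, which are the two layouts this PR chooses between for fp16.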

Commits on Feb 14, 2019

  1. address review comment

    ankan-ban committed Feb 14, 2019
    1976777

Commits on Feb 19, 2019

  1. Merge pull request #13 from LeelaChessZero/master

    get latest
    ankan-ban committed Feb 19, 2019
    8f46984

Commits on May 4, 2019

  1. Merge pull request #14 from LeelaChessZero/master

    get latest
    ankan-ban committed May 4, 2019
    b8dd014

Commits on May 11, 2019

  1. support cudnn-fp16 backend on GPUs without tensor cores

    - try NCHW layout and the Winograd algorithm for convolutions (same as what we use for fp32).
    - it's slower than NHWC/fp16 on GPUs with tensor cores, but should give some speedup on GP100 and TU11x GPUs.
    ankan-ban committed May 11, 2019
    7211cda
  2. dd8c0ae
  3. fix another build break

    - not sure why Visual C works fine!
    ankan-ban committed May 11, 2019
    a73cfe8

Commits on May 12, 2019

  1. add check for cards with no tensor cores

    - GP100 (SM 6.0)
    - GTX 16xx GPUs (unfortunately they report the same SM 7.5 version, so a string compare is needed)
    ankan-ban committed May 12, 2019
    017c07c
  2. clang format

    ankan-ban committed May 12, 2019
    445c7c6
  3. add backend-opt to force nhwc on or off

    default is auto-select (-1).
    ankan-ban committed May 12, 2019
    d8049a6
  4. fix typo

    ankan-ban committed May 12, 2019
    9daed1a
  5. address review comment

    Use bool option instead of int and use IsDefault mechanism to check if the option was forced or not.
    ankan-ban committed May 12, 2019
    124eef1
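
The May 12 commits above combine two ideas: detecting cards without tensor cores, and letting a backend option override the automatic layout choice. A rough sketch of that logic is shown below. It is an illustration under stated assumptions, not the code merged here: `HasTensorCores`, `SelectLayout`, `NhwcOption` and `Fp16Layout` are hypothetical names, the `"GTX 16"` substring match is an assumed form of the string compare, and treating every pre-SM 7.0 GPU as lacking tensor cores is a simplification of the GP100 check.

```cpp
#include <string>

enum class Fp16Layout { kNhwc, kNchw };

// Hypothetical stand-in for a bool option plus an "is default" flag,
// mirroring the IsDefault mechanism mentioned in the last commit.
struct NhwcOption {
  bool is_default;  // true when the user did not set the option
  bool value;       // only meaningful when is_default is false
};

// Assumed detection rule: pre-Volta GPUs (such as GP100, SM 6.0) have no
// tensor cores, and GTX 16xx cards report SM 7.5 without having them, so
// the device name has to be inspected as well.
bool HasTensorCores(int sm_major, int sm_minor, const std::string& name) {
  if (sm_major < 7) return false;
  if (sm_major == 7 && sm_minor == 5 &&
      name.find("GTX 16") != std::string::npos) {
    return false;
  }
  return true;
}

// An explicitly set option wins; otherwise NHWC is used only when tensor
// cores are available, and NCHW is the fallback.
Fp16Layout SelectLayout(bool has_tensor_cores, const NhwcOption& opt) {
  if (!opt.is_default) {
    return opt.value ? Fp16Layout::kNhwc : Fp16Layout::kNchw;
  }
  return has_tensor_cores ? Fp16Layout::kNhwc : Fp16Layout::kNchw;
}
```

In a real backend the SM version and device name would come from the CUDA device properties; here they are plain arguments so the selection logic can be read and tested without a GPU.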