This repository has been archived by the owner on Mar 12, 2021. It is now read-only.

ND-reverse #416

Merged 12 commits on Sep 24, 2019

Commits on Sep 23, 2019

  1. da8960b
  2. removed a redundant variable

    kraftpunk97 authored and maleadt committed Sep 23, 2019
    6149161
  3. Simplified kernel

    kraftpunk97 authored and maleadt committed Sep 23, 2019
    21d44f8
  4. Whitespace fixes.

    maleadt committed Sep 23, 2019
    27bbc33
  5. More compact testing.

    maleadt committed Sep 23, 2019
    0fcbf0c
  6. Clean-up reverse kernels.

    maleadt committed Sep 23, 2019
    e9d4abb
  7. Drop specific typing.

    maleadt committed Sep 23, 2019
    008760f
  8. Unify kernels.

    maleadt committed Sep 23, 2019
    0893ba3
  9. b7a7556
  10. 0ad5491
  11. Unify in-place and out-of-place kernels.

    Using shared memory for out-of-place is better since that enables memory coalescing on the output write.
    maleadt committed Sep 23, 2019
    ba586b9
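
The coalescing argument in the "Unify in-place and out-of-place kernels" commit can be illustrated with a small CPU emulation. This is only a sketch of the general shared-memory reversal pattern, not the kernel from this PR: the function name `reverse_tiled`, the `block_size` parameter, and the per-block list standing in for shared memory are all assumptions made for illustration. Each emulated block reads its tile in forward order, reverses it inside the staging buffer, and then writes the tile in forward order to the mirrored position in the output, so both the global read and the global write touch consecutive addresses.

```python
def reverse_tiled(src, block_size=4):
    """Out-of-place 1D reverse, emulating one GPU block per tile on the CPU.

    Hypothetical illustration: the list `shared` stands in for a block's
    shared memory; the loops over `t` stand in for the threads of a block.
    """
    n = len(src)
    dst = [None] * n
    num_blocks = (n + block_size - 1) // block_size
    for block in range(num_blocks):
        start = block * block_size
        count = min(block_size, n - start)  # last tile may be partial

        # Phase 1: consecutive reads from src, reversed placement in the buffer.
        shared = [None] * count
        for t in range(count):
            shared[count - 1 - t] = src[start + t]

        # Phase 2: consecutive writes to the mirrored tile in dst.
        out_start = n - start - count
        for t in range(count):
            dst[out_start + t] = shared[t]
    return dst


if __name__ == "__main__":
    print(reverse_tiled(list(range(10))))
```

On a real GPU the two phases would be separated by a block-level barrier; the sequential loops here make that implicit.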

Commits on Sep 24, 2019

  1. Drop the shared memory, use separate inplace/outofplace kernels.

    Wrt. shared memory: newer hardware can coalesce both forward and backward access patterns,
    which we would get without shared memory in the 1D case, while for the ND case we don't
    get either of those if just writing backwards in the shared memory buffer.
    
    Wrt. separate kernels: the in-place version would trample over values to be read by other blocks.
    Grid synchronization is much too costly, so just swap pairs of values using half the number of threads.
    maleadt committed Sep 24, 2019
    1966169
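
The split described in the last commit message can be sketched as follows. This is a CPU emulation under stated assumptions, not the code merged in this PR: the names `reverse_inplace` and `reverse_outofplace` are hypothetical, each loop iteration stands in for one GPU thread, and only the 1D case is shown. In the in-place version, a thread owns both ends of a pair, so no other thread reads a value after it has been overwritten and no grid-wide synchronization is needed. The out-of-place version never overwrites its input, so it can use one thread per element.

```python
def reverse_inplace(data):
    """In-place reverse using n // 2 emulated threads, one swap each."""
    n = len(data)
    for tid in range(n // 2):  # half as many "threads" as elements
        mirror = n - 1 - tid
        # This thread is the only one touching positions tid and mirror.
        data[tid], data[mirror] = data[mirror], data[tid]
    return data


def reverse_outofplace(src):
    """Out-of-place reverse using n emulated threads, one element each."""
    n = len(src)
    dst = [None] * n
    for tid in range(n):
        dst[n - 1 - tid] = src[tid]  # src is never modified
    return dst


if __name__ == "__main__":
    print(reverse_inplace([1, 2, 3, 4, 5]))
    print(reverse_outofplace([1, 2, 3, 4]))
```

For an odd length the middle element has no partner and is left untouched, which is why `n // 2` threads suffice.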