Implement multi-dimensional reduction and refactor cuTENSOR support #2430

Merged: 6 commits merged on Mar 21, 2024


Contributor

@tbennun tbennun commented Feb 29, 2024

This PR adds a new layer that performs reductions on specific tensor dimensions. For training (i.e., backpropagation), only the "sum" operation is supported.

Fixes Issue #2429.
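
To make the description concrete, here is a minimal CPU-only sketch of what reducing over specific tensor dimensions means, and why the backward pass for "sum" is simple: the derivative of a sum with respect to each input is 1, so the upstream gradient is just broadcast back along the reduced dimensions. The names `output_offsets`, `reduce_sum`, and `reduce_sum_backward` are hypothetical, row-major layout is an assumption, and none of this reflects the layer's actual implementation or the cuTENSOR API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not LBANN's API): for every element of a row-major
// tensor with shape `dims`, compute the flat offset of the output element it
// contributes to when the axes flagged in `reduce_axis` are reduced away.
std::vector<std::size_t> output_offsets(const std::vector<std::size_t>& dims,
                                        const std::vector<bool>& reduce_axis)
{
  assert(dims.size() == reduce_axis.size());
  std::size_t in_size = 1;
  for (std::size_t d : dims) in_size *= d;

  std::vector<std::size_t> offsets(in_size);
  std::vector<std::size_t> idx(dims.size(), 0); // current multi-index
  for (std::size_t i = 0; i < in_size; ++i) {
    // Flatten only the kept coordinates, preserving their order.
    std::size_t o = 0;
    for (std::size_t d = 0; d < dims.size(); ++d)
      if (!reduce_axis[d]) o = o * dims[d] + idx[d];
    offsets[i] = o;
    // Advance the multi-index; the last axis varies fastest (row-major).
    for (std::size_t d = dims.size(); d-- > 0;) {
      if (++idx[d] < dims[d]) break;
      idx[d] = 0;
    }
  }
  return offsets;
}

// Forward: sum over the flagged axes.
std::vector<float> reduce_sum(const std::vector<float>& in,
                              const std::vector<std::size_t>& dims,
                              const std::vector<bool>& reduce_axis)
{
  const auto offsets = output_offsets(dims, reduce_axis);
  assert(in.size() == offsets.size());
  std::size_t out_size = 1;
  for (std::size_t d = 0; d < dims.size(); ++d)
    if (!reduce_axis[d]) out_size *= dims[d];

  std::vector<float> out(out_size, 0.0f);
  for (std::size_t i = 0; i < in.size(); ++i) out[offsets[i]] += in[i];
  return out;
}

// Backward for "sum": copy each output gradient to every input element that
// contributed to that output.
std::vector<float> reduce_sum_backward(const std::vector<float>& grad_out,
                                       const std::vector<std::size_t>& dims,
                                       const std::vector<bool>& reduce_axis)
{
  const auto offsets = output_offsets(dims, reduce_axis);
  std::vector<float> grad_in(offsets.size());
  for (std::size_t i = 0; i < grad_in.size(); ++i)
    grad_in[i] = grad_out[offsets[i]];
  return grad_in;
}
```

For a 2x3 tensor, flagging only the second axis yields one sum per row, and flagging only the first yields one sum per column. Other reduction operations would need operation-specific gradients (for example, "max" must route the gradient only to the selected element), which may be part of why training support is limited to "sum" here.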

@tbennun tbennun requested a review from benson31 February 29, 2024 03:55
@tbennun tbennun mentioned this pull request Feb 29, 2024
@tbennun tbennun requested a review from bvanessen March 2, 2024 01:11
Collaborator

@benson31 benson31 left a comment


Please add the include suggestion, but then :shipit: !


#include <cuda_runtime.h>
#include <cutensor.h>

/**
* The interface below is designed for CUTENSOR v1.
Collaborator


Yeah... should we update to v2? Separate PR obviously.

Contributor Author


We might want to; it’s more capable, but we’re not using those capabilities.

include/lbann/utils/cutensor_support.hpp
Collaborator


I don't know if this is the right place for these utilities. The RowMajor/ColMajor stuff is largely set up for the cuTENSOR/cuTT stuff. I might prefer src/layers/{helpers,common,utils} or something like any of those. On the other hand, there's a lot of stuff in this directory that's probably a bit misfiled, so it's not unreasonable to leave it here.

Contributor Author

tbennun commented Mar 11, 2024

@bvanessen tests are passing

@tbennun tbennun merged commit 811af60 into LLNL:develop Mar 21, 2024
1 check passed
@tbennun tbennun deleted the multidim-reduction branch March 21, 2024 18:44
3 participants