-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jmm/one buffer #494
Jmm/one buffer #494
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious to see the fix to the unit test, but once that's in we should merge
// This wastes about 1/2 a meshblock in memory | ||
coarse_s = ParArrayND<T>(Kokkos::subview(comm_data_, offset++, Kokkos::ALL(), | ||
Kokkos::ALL(), Kokkos::ALL(), Kokkos::ALL(), | ||
Kokkos::ALL(), Kokkos::ALL())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting around the wasted memory would probably mean allocating from fluxes and this coarse data from one long array, right? However, I don't think Kokkos supports changing layouts like that with a subview.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I tried that actually. Kokkos yelled at me because types.
} | ||
n_outer += (pmb->pmy_mesh->multilevel); | ||
comm_data_ = ParArray7D<T>(base_name + ".comm_data", n_outer, GetDim(6), GetDim(5), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is comm_data_
only the collection of 4 arrays - the 3 fluxes and the coarse array? Does the benefit of combining these allocations outweight the cost of wasted memory for coarse_s
? Could coarse_s
remain it's own array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is less relevant for this PR, but is allocating space for all the fluxes necessary? Could the flux divergence be computed in each dimension one by one so that only one flux array is needed in memory? If possible, this would allow larger problem sizes but might be too big a code change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is comm_data_ only the collection of 4 arrays - the 3 fluxes and the coarse array? Does the benefit of combining these allocations outweight the cost of wasted memory for coarse_s? Could coarse_s remain it's own array?
That could be done. I'm not sure what the right trade-off is. I would say we accept the memory overhead for now, and we can split out the coarse buffer later if we desire.
This is less relevant for this PR, but is allocating space for all the fluxes necessary? Could the flux divergence be computed in each dimension one by one so that only one flux array is needed in memory? If possible, this would allow larger problem sizes but might be too big a code change.
This is an interesting idea. I agree not for this PR, but something to think about.
… meshblock pointer isn't available
PR Summary
Built on the discussions in the hackathon and on @pgrete 's unification of communication buffers in #493 . This combines the flux and coarse cell buffers into a single kokkos view. A couple of things to note:
LayoutRight
in our kokkos views.I haven't done any profiling yet.
PR Checklist