Add possibility for a CopyToHost::postCopy() operation #45801

Merged (2 commits) on Sep 4, 2024

@makortel makortel commented Aug 26, 2024

PR description:

Following #45708 (comment), this PR adds the possibility of a CopyToHost<T>::postCopy() function that is called by the implicit data product device-to-host copy operation after the copy has finished (but only if the postCopy() function is defined). This facility allows data products that need to be updated after a memcpy() (e.g. because they hold pointers to themselves) to be used without blocking synchronization calls in CopyToHost::copyAsync(). I expect (hope) we will only ever have a few such data products.
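
As a rough illustration of the mechanism, the following self-contained sketch mimics the flow with plain host memory instead of a real device. MockQueue, SelfReferential, copyToHost() and demoSelfPointerFixUp() are hypothetical names invented for this example, the signatures are simplified and are not claimed to match the real CopyToHost in HeterogeneousCore/AlpakaInterface, and postCopy() is called unconditionally here for brevity.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a device queue; a real queue would run the copy asynchronously.
struct MockQueue {
  void wait() {}
};

// Hypothetical data product whose `begin` member is meant to point at its own `values`.
struct SelfReferential {
  int values[4];
  int* begin;
};

// Primary template is left undefined: every type needs its own specialization.
template <typename T>
struct CopyToHost;

template <>
struct CopyToHost<SelfReferential> {
  // Byte-wise copy only; `begin` in the result still points into the source object.
  static SelfReferential copyAsync(MockQueue&, SelfReferential const& src) {
    SelfReferential dst;
    std::memcpy(&dst, &src, sizeof(SelfReferential));
    return dst;
  }
  // Runs after the copy has completed and repairs the self-pointer.
  static void postCopy(SelfReferential& dst) { dst.begin = dst.values; }
};

// Simplified driver (assumption): copy, wait for completion, then fix up in place.
template <typename T>
void copyToHost(MockQueue& queue, T const& deviceProduct, T& hostProduct) {
  hostProduct = CopyToHost<T>::copyAsync(queue, deviceProduct);
  queue.wait();
  CopyToHost<T>::postCopy(hostProduct);
}

// Returns true if the host copy ends up pointing at its own storage.
inline bool demoSelfPointerFixUp() {
  SelfReferential source{{1, 2, 3, 4}, nullptr};
  source.begin = source.values;
  MockQueue queue;
  SelfReferential host{};
  copyToHost(queue, source, host);
  assert(host.begin != source.values);  // no longer aliases the source object
  return host.begin == host.values && host.begin[2] == 3;
}
```

The fix-up is applied to the host object in its final location; moving the object afterwards would invalidate the pointer again, which is one reason such products are hopefully rare.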

The first commit has a C++17 SFINAE-based solution for checking whether CopyToHost<T>::postCopy() exists (which can be backported to 14_0_X), and the second commit replaces it with a C++20 concepts-based solution.
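
The two detection styles could look roughly like the sketch below. This is not the code from the commits: Plain, NeedsFixUp, HasPostCopySfinae, HasPostCopy and copyAndFinalize() are hypothetical names, the queue argument is omitted, and the functions are constexpr only so that the behavior can be checked at compile time. The concepts variant is guarded by the feature-test macro so the snippet also compiles as C++17.

```cpp
#include <type_traits>
#include <utility>

template <typename T>
struct CopyToHost;  // primary template intentionally undefined

struct Plain {
  int x;
};
struct NeedsFixUp {
  int x;
  bool fixed;
};

// Specialization without postCopy().
template <>
struct CopyToHost<Plain> {
  static constexpr Plain copyAsync(Plain const& src) { return src; }
};

// Specialization with the optional postCopy().
template <>
struct CopyToHost<NeedsFixUp> {
  static constexpr NeedsFixUp copyAsync(NeedsFixUp const& src) { return src; }
  static constexpr void postCopy(NeedsFixUp& obj) { obj.fixed = true; }
};

// C++17 detection idiom: the partial specialization is selected only if
// the expression CopyToHost<T>::postCopy(T&) is well-formed.
template <typename T, typename = void>
struct HasPostCopySfinae : std::false_type {};

template <typename T>
struct HasPostCopySfinae<T, std::void_t<decltype(CopyToHost<T>::postCopy(std::declval<T&>()))>>
    : std::true_type {};

#if defined(__cpp_concepts)
// C++20 equivalent expressed as a concept.
template <typename T>
concept HasPostCopy = requires(T& obj) { CopyToHost<T>::postCopy(obj); };

static_assert(HasPostCopy<NeedsFixUp>);
static_assert(!HasPostCopy<Plain>);
#endif

// Calls postCopy() only when the specialization provides it.
template <typename T>
constexpr T copyAndFinalize(T const& src) {
  T dst = CopyToHost<T>::copyAsync(src);
  if constexpr (HasPostCopySfinae<T>::value) {
    CopyToHost<T>::postCopy(dst);
  }
  return dst;
}

static_assert(copyAndFinalize(NeedsFixUp{1, false}).fixed);
static_assert(copyAndFinalize(Plain{7}).x == 7);
```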

Resolves cms-sw/framework-team#989

PR validation:

Added unit test runs on CPU-only and NVIDIA GPU nodes.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backported to 14_1_X, and to 14_0_X (first commit only). Following #45708 (comment) not to be backported.

@cmsbuild

cmsbuild commented Aug 26, 2024

cms-bot internal usage

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45801/41540

@cmsbuild

A new Pull Request was created by @makortel for master.

It involves the following packages:

  • DataFormats/PortableTestObjects (heterogeneous)
  • HeterogeneousCore/AlpakaCore (heterogeneous)
  • HeterogeneousCore/AlpakaInterface (heterogeneous)
  • HeterogeneousCore/AlpakaTest (heterogeneous)

@cmsbuild, @fwyzard, @makortel can you please review it and eventually sign? Thanks.
@missirol, @mmusich, @rovere this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@makortel

enable gpu

@makortel

@cmsbuild, please test

@makortel

type -changes-dataformats

The added data format classes are transient (and part of testing suite).

@cmsbuild

+1

Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-824152/41145/summary.html
COMMIT: 6c357de
CMSSW: CMSSW_14_1_X_2024-08-26-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45801/41145/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 5 lines from the logs
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 44
  • DQMHistoTests: Total histograms compared: 3328202
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3328179
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 43 files compared)
  • Checked 191 log files, 161 edm output root files, 44 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

@cmsbuild

Milestone for this pull request has been moved to CMSSW_14_2_X. Please open a backport if it should also go in to CMSSW_14_1_X.

@cmsbuild cmsbuild modified the milestones: CMSSW_14_1_X, CMSSW_14_2_X Aug 27, 2024
@cmsbuild

cmsbuild commented Sep 3, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45801/41622

@cmsbuild

cmsbuild commented Sep 3, 2024

Pull request #45801 was updated. @cmsbuild, @fwyzard, @makortel can you please check and sign again.

@makortel

makortel commented Sep 3, 2024

@cmsbuild, please test

@cmsbuild

cmsbuild commented Sep 3, 2024

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-824152/41265/summary.html
COMMIT: 0db641f
CMSSW: CMSSW_14_2_X_2024-09-03-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45801/41265/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 3 lines to the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 44
  • DQMHistoTests: Total histograms compared: 3328315
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3328295
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 43 files compared)
  • Checked 193 log files, 163 edm output root files, 44 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

@fwyzard

fwyzard commented Sep 4, 2024

+heterogeneous

@makortel I'm fine with the changes and with this approach.

I'm wondering if there could be a more self-contained approach, where the data structure could define a postCopy() or finalize() method that is automatically called, instead of adding the static postCopy() function in CopyToHost.
But that kind of clashes with the fact that SoAs don't generally have methods :-/

@cmsbuild

cmsbuild commented Sep 4, 2024

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @rappoccio, @sextonkennedy, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen

+1

@cmsbuild cmsbuild merged commit b5929cd into cms-sw:master Sep 4, 2024
14 checks passed
@makortel makortel deleted the alpakaCopyPostCopy branch September 4, 2024 14:31
@makortel

makortel commented Sep 4, 2024

> I'm wondering if there could be a more self-contained approach, where the data structure could define a postCopy() or finalize() method that is automatically called, instead of adding the static postCopy() function in CopyToHost.
> But that kind of clashes with the fact that SoAs don't generally have methods :-/

The next closest thing to a method would be a standalone function. As a reminder, with CopyToHost I started with plain function overloads, but rejected that approach when it became evident that inheritance would be a tolerated pattern in crafting the data format classes (a function copyToHost(T const&) would bind fine for all types D that inherit from T, whereas I wanted those cases to result in a compilation error, because the function must be different for D). Therefore I did not consider standalone functions for this case.
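
The difference between the two approaches can be sketched as follows. Base, Derived and IsComplete are hypothetical names for this illustration, the queue argument is omitted, and the functions are constexpr only so that the claims can be verified with static_assert.

```cpp
#include <type_traits>
#include <utility>

struct Base {
  int a;
};
struct Derived : Base {
  int b;
};

// Rejected approach: a free function overload written for Base.
// It also accepts a Derived argument and silently slices it.
constexpr Base copyToHost(Base const& src) { return src; }

// Adopted approach: a class template with an undefined primary template,
// specialized per type. Only the exact type matches a specialization.
template <typename T>
struct CopyToHost;

template <>
struct CopyToHost<Base> {
  static constexpr Base copyAsync(Base const& src) { return src; }
};

// Helper trait: true if T is a complete type at the point of use.
template <typename T, typename = void>
struct IsComplete : std::false_type {};

template <typename T>
struct IsComplete<T, std::void_t<decltype(sizeof(T))>> : std::true_type {};

// The overload binds to Derived through the derived-to-base conversion...
static_assert(std::is_invocable_v<decltype(&copyToHost), Derived const&>);
static_assert(copyToHost(Derived{{1}, 2}).a == 1);  // member `b` is lost

// ...whereas CopyToHost<Derived> has no definition, so any use of
// CopyToHost<Derived>::copyAsync() would fail to compile.
static_assert(IsComplete<CopyToHost<Base>>::value);
static_assert(!IsComplete<CopyToHost<Derived>>::value);
```

With the class template, the author of Derived is forced to write a dedicated specialization instead of inheriting a copy function that only knows about Base.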

Within the class template specialization approach I considered making a separate (say PostCopy<T>) class template, because I felt the operation would be a property of T rather than of "copying T to host". But then I realized the operation needs to be different for copy-to-host and copy-to-device (at minimum copy-to-device needs to be a kernel call), and furthermore for copy-to-device the modification kernel (or memset()) can be launched from the CopyToDevice<T>::copyAsync() directly (or in case of a kernel probably via a host-side function called from there).

Given that postCopy() is, in the end, tied to the CopyToHost operation, and that I hope postCopy() won't be needed widely, I opted to add it to the CopyToHost class template specialization. At least the two parts of the copy operation end up close together in the code.

These decisions can, of course, be revisited if my assumptions above turn out to be wrong.

Development

Successfully merging this pull request may close these issues.

Add postCopy() option to CopyToHost and CopyToDevice class templates
4 participants