Update tools for ROCm 5.6.1 [14.0.x] #9144

Open: @fwyzard wants to merge 1 commit into base: IB/CMSSW_14_0_X/master

@fwyzard
Contributor

fwyzard commented Apr 17, 2024

Add amd-smi and ROCProfiler binaries and libraries.

@fwyzard
Contributor Author

fwyzard commented Apr 17, 2024

please test

@cmsbuild
Contributor

A new Pull Request was created by @fwyzard for branch IB/CMSSW_14_0_X/master.

@iarspider, @aandvalenzuela, @smuzaffar can you please review it and eventually sign? Thanks.
@antoniovilela, @rappoccio, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Apr 17, 2024

cms-bot internal usage

@fwyzard
Contributor Author

fwyzard commented Apr 17, 2024

This is a partial backport of #9143, adding the same new tools but without updating the version of ROCm.

@fwyzard changed the title from "Update tools for ROCm 5.6.1" to "Update tools for ROCm 5.6.1 [14.0.x]" on Apr 17, 2024

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38909/summary.html
COMMIT: 90ed9c6
CMSSW: CMSSW_14_0_X_2024-04-17-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9144/38909/install.sh to create a dev area with all the needed externals and cmssw changes.

Source19: https://%{repository}/%{repoversion}/main/rocprofiler-plugins-2.0.0.50601-93.el%{rhel}.%{_arch}.rpm
Source20: https://%{repository}/%{repoversion}/main/rocprofiler-samples-2.0.0.50601-93.el%{rhel}.%{_arch}.rpm
Source21: https://%{repository}/%{repoversion}/main/amd-smi-lib-1.0.0.50601-93.el%{rhel}.%{_arch}.rpm

Contributor

@fwyzard, we also need to extract these new sources in the %build section, right?

Contributor Author

🤦🏻‍♂️ yes
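
For reference, the extraction step discussed in this thread might look like the spec fragment below. This is only a sketch under assumptions, not the content of the actual commit: it assumes the new `SourceN` RPMs are unpacked with `rpm2cpio` and `cpio` into the current build directory, and it refers to them through the standard RPM `%{SOURCEn}` macros for the three sources shown in the diff above. The real spec file may use a different extraction command or loop over the sources.

```spec
%build
# Assumption: the other SourceN packages are already extracted in this section;
# the three lines below only cover the sources visible in the diff hunk.
# cpio flags: -i extract, -d create directories, -m keep mtimes, -u overwrite.
rpm2cpio %{SOURCE19} | cpio -idmu
rpm2cpio %{SOURCE20} | cpio -idmu
rpm2cpio %{SOURCE21} | cpio -idmu
```

Without such a step the `Source` entries would be downloaded but their files would never reach the install tree, which is the gap the review comment points out.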

Add amd-smi and ROCProfiler binaries and libraries.
@fwyzard force-pushed the IB/CMSSW_14_0_X/master_rocm_updates branch from 90ed9c6 to 3cfcc7c on April 18, 2024 17:27

@fwyzard
Contributor Author

fwyzard commented Apr 18, 2024

please test

@cmsbuild
Contributor

Pull request #9144 was updated.

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38941/summary.html
COMMIT: 3cfcc7c
CMSSW: CMSSW_14_0_X_2024-04-18-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9144/38941/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38941/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38941/git-merge-result

Comparison Summary

Summary:

  • You potentially removed 93 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 3138 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3318384
  • DQMHistoTests: Total failures: 206
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3318158
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 287.00399999999996 KiB( 47 files compared)
  • DQMHistoSizes: changed ( 10224.0,... ): -2.742 KiB Physics/Top
  • DQMHistoSizes: changed ( 23234.0,... ): 3.979 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_scint_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.977 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_100_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.977 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_200_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.977 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_300_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.976 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_scint_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.974 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_100_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.974 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_200_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.974 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_300_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.972 KiB HGCalHitCalibrationHLT/hgcal_EoP_CPene_scint_calib_fraction
  • DQMHistoSizes: changed ( 23234.0 ): ...
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@smuzaffar
Contributor

+externals

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_14_0_X/master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@fwyzard
Contributor Author

fwyzard commented Apr 22, 2024

@cms-sw/orp-l2 could you merge this for the next 14.0.x release?

Hopefully amd-smi can provide the same information as NVML for AMD GPUs.

@fwyzard
Contributor Author

fwyzard commented Apr 24, 2024

hold

@fwyzard
Contributor Author

fwyzard commented Apr 24, 2024

given the issues with rocprofiler in ROCm 6.1, let's wait on this

@cmsbuild
Contributor

Pull request has been put on hold by @fwyzard
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

3 participants