Nano: fix array size branch type, support 16 bit ints, use more 8 or 16 bit integers #40478

Merged: 10 commits, Jan 16, 2023

swertz
Contributor

@swertz swertz commented Jan 11, 2023

PR description:

  • The ROOT team pointed out that the branch type used for the size (counter) of native arrays should always be a signed integer (see here), whereas NanoAOD stored it as an unsigned int. In practice there may be few consequences, but this PR implements the fix to avoid any unwanted effects down the line.
  • Add support for 16-bit integers in NanoAOD and fix some inconsistencies for other types (uint, double) in the flat table producer.
  • Use 8- and 16-bit integers instead of full-size ints where relevant (charges, indices, IDs, counters...). While compression did help in hiding the zeroed unused bits, I still noticed a few-percent reduction in size when using native types instead.
    • Note that I avoided using int8 for signed integers and always relied on int16 instead. This is because ROOT interprets int8 as a char, which messes up things like TTree::Draw and TTree::Scan (filling the histogram with characters instead of numbers).
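
The type-selection rule in the last two bullets can be sketched as below. This is only an illustration: `narrowest_int_type` and the type-name strings are hypothetical and are not part of CMSSW or ROOT. It assumes the value range of a column is known in advance, and it covers only the choice of width, not the separate requirement that array counters be signed.

```python
def narrowest_int_type(lo: int, hi: int) -> str:
    """Return the narrowest integer type name able to hold [lo, hi].

    Hypothetical helper for illustration only. Signed ranges never map to
    int8, mirroring the PR's choice to avoid a type that ROOT shows as a char.
    """
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    if lo >= 0:
        # Non-negative values: unsigned types, narrowest first.
        candidates = [
            ("uint8", 0, 2**8 - 1),
            ("uint16", 0, 2**16 - 1),
            ("uint32", 0, 2**32 - 1),
        ]
    else:
        # Signed values: int8 is deliberately skipped, start at int16.
        candidates = [
            ("int16", -(2**15), 2**15 - 1),
            ("int32", -(2**31), 2**31 - 1),
        ]
    for name, type_min, type_max in candidates:
        if type_min <= lo and hi <= type_max:
            return name
    raise OverflowError("range does not fit in a 32-bit integer")


if __name__ == "__main__":
    print(narrowest_int_type(-1, 1))    # e.g. a charge
    print(narrowest_int_type(0, 200))   # e.g. a small ID
```

Under these assumptions a charge in [-1, 1] is stored as int16 rather than int8, which costs one extra byte per value before compression but keeps the column usable in TTree::Draw and TTree::Scan.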

PR validation:

Ran a nano matrix workflow.

@swertz
Contributor Author

swertz commented Jan 11, 2023

enable nano

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40478/33652

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40478/33654

@cmsbuild
Contributor

A new Pull Request was created by @swertz (Sébastien Wertz) for master.

It involves the following packages:

  • DataFormats/NanoAOD (xpog)
  • PhysicsTools/NanoAOD (xpog)

@cmsbuild, @swertz, @vlimant can you please review it and eventually sign? Thanks.
@gpetruc, @missirol, @rovere this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@swertz
Contributor Author

swertz commented Jan 11, 2023

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-64cd62/29912/summary.html
COMMIT: c2ff4e1
CMSSW: CMSSW_13_0_X_2023-01-11-1100/el8_amd64_gcc11
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40478/29912/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
140.01 step 3
140.03 step 3
The results of the comparisons for these workflows could be incomplete.
This most likely means that the IB has errors in the relvals. The error does NOT come from this pull request.

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 1130 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555538
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3555504
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 211 log files, 162 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 10839
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 10833
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 23 log files, 10 edm output root files, 11 DQM output files

Nano size comparison Summary:

| Sample | kb/ev | ref kb/ev | diff kb/ev | ev/s/thd | ref ev/s/thd | diff rate | mem/thd | ref mem/thd |
|---|---|---|---|---|---|---|---|---|
| 2500.31 | 2.224 | 2.240 | -0.015 ( -0.7% ) | 9.58 | 9.53 | +0.5% | 1.514 | 1.515 |
| 2500.311 | 2.315 | 2.330 | -0.015 ( -0.6% ) | 9.23 | 9.18 | +0.5% | 1.880 | 1.882 |
| 2500.312 | 2.269 | 2.284 | -0.015 ( -0.6% ) | 9.29 | 9.26 | +0.3% | 1.870 | 1.885 |
| 2500.33 | 1.095 | 1.101 | -0.006 ( -0.5% ) | 21.68 | 21.88 | -0.9% | 1.641 | 1.754 |
| 2500.331 | 1.390 | 1.396 | -0.006 ( -0.5% ) | 16.01 | 16.16 | -0.9% | 1.790 | 1.789 |
| 2500.332 | 1.321 | 1.328 | -0.006 ( -0.5% ) | 17.88 | 17.91 | -0.2% | 1.848 | 1.766 |
| 2500.401 | 2.146 | 2.161 | -0.014 ( -0.7% ) | 10.36 | 10.34 | +0.2% | 1.202 | 1.198 |
| 2500.501 | 1.711 | 1.726 | -0.015 ( -0.9% ) | 16.38 | 16.45 | -0.5% | 1.118 | 1.115 |
| 2500.511 | 1.129 | 1.134 | -0.005 ( -0.4% ) | 30.21 | 30.59 | -1.2% | 1.366 | 1.367 |
| 2500.601 | 2.059 | 2.073 | -0.014 ( -0.7% ) | 12.51 | 12.54 | -0.2% | 1.176 | 1.190 |

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40478/33686

@cmsbuild
Contributor

Pull request #40478 was updated. @cmsbuild, @swertz, @vlimant can you please check and sign again.

@swertz
Contributor Author

swertz commented Jan 12, 2023

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-64cd62/29952/summary.html
COMMIT: cc491f1
CMSSW: CMSSW_13_0_X_2023-01-12-1100/el8_amd64_gcc11
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40478/29952/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
11634.15 step 3
The results of the comparisons for these workflows could be incomplete.
This most likely means that the IB has errors in the relvals. The error does NOT come from this pull request.

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 1132 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555538
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3555516
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 211 log files, 162 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 10839
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 10839
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 23 log files, 10 edm output root files, 11 DQM output files

Nano size comparison Summary:

| Sample | kb/ev | ref kb/ev | diff kb/ev | ev/s/thd | ref ev/s/thd | diff rate | mem/thd | ref mem/thd |
|---|---|---|---|---|---|---|---|---|
| 2500.31 | 2.229 | 2.240 | -0.011 ( -0.5% ) | 9.40 | 8.77 | +7.3% | 1.522 | 1.550 |
| 2500.311 | 2.319 | 2.330 | -0.011 ( -0.5% ) | 9.27 | 8.71 | +6.4% | 1.894 | 1.924 |
| 2500.312 | 2.273 | 2.284 | -0.010 ( -0.5% ) | 9.48 | 8.78 | +8.0% | 1.885 | 1.911 |
| 2500.33 | 1.095 | 1.101 | -0.005 ( -0.5% ) | 21.51 | 19.10 | +12.7% | 1.641 | 1.659 |
| 2500.331 | 1.390 | 1.396 | -0.006 ( -0.5% ) | 16.09 | 14.36 | +12.1% | 1.791 | 1.815 |
| 2500.332 | 1.321 | 1.328 | -0.006 ( -0.5% ) | 17.89 | 15.84 | +12.9% | 1.845 | 1.817 |
| 2500.401 | 2.152 | 2.161 | -0.008 ( -0.4% ) | 10.45 | 10.33 | +1.1% | 1.216 | 1.244 |
| 2500.501 | 1.716 | 1.726 | -0.010 ( -0.6% ) | 16.73 | 16.73 | +0.0% | 1.134 | 1.165 |
| 2500.511 | 1.129 | 1.134 | -0.005 ( -0.4% ) | 30.79 | 30.93 | -0.5% | 1.380 | 1.430 |
| 2500.601 | 2.063 | 2.073 | -0.010 ( -0.5% ) | 12.30 | 12.43 | -1.0% | 1.193 | 1.226 |
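
As a rough cross-check of the size column in the table above, the relative change can be recomputed from the displayed kb/ev values. This is only a sketch: the numbers are copied from the table and are already rounded to three decimals, so individual rows may differ from the reported percentages in the last digit. The function name is illustrative and is not taken from the comparison script.

```python
# (kb/ev, ref kb/ev) pairs copied from the size-comparison table above.
SIZES = [
    (2.229, 2.240), (2.319, 2.330), (2.273, 2.284), (1.095, 1.101),
    (1.390, 1.396), (1.321, 1.328), (2.152, 2.161), (1.716, 1.726),
    (1.129, 1.134), (2.063, 2.073),
]


def relative_change_percent(new: float, ref: float) -> float:
    """Relative change of `new` with respect to `ref`, in percent."""
    return 100.0 * (new - ref) / ref


changes = [relative_change_percent(new, ref) for new, ref in SIZES]
mean_change = sum(changes) / len(changes)

if __name__ == "__main__":
    for (new, ref), change in zip(SIZES, changes):
        print(f"{new:.3f} vs {ref:.3f}: {change:+.2f}%")
    print(f"mean size change: {mean_change:+.2f}%")
```

Every sample shrinks, and the mean reduction comes out at roughly half a percent.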

@swertz
Contributor Author

swertz commented Jan 13, 2023

+1

(Finally) no changes in the NanoAOD content, as it should be. The size gain is smaller than I expected, though.

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

> (Finally) no changes in the NanoAOD content, as should be. The size gain is smaller than I expected though.

Just to understand, what are the changes to nanoAOD reported in the "standard comparisons" for wf 1330 and 25202, see https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_13_0_X_2023-01-12-1100+64cd62/54976/validateJR.html ?

@perrotta
Contributor

Not really related to this PR, but triggered by the Static Analyzer report of these tests: what is the purpose of this line in DataFormats/NanoAOD/src/classes.h?

Maybe @lgray knows, since he coded it in #36037

@swertz
Contributor Author

swertz commented Jan 13, 2023

> Just to understand, what are the changes to nanoAOD reported in the "standard comparisons" for wf 1330 and 25202, see https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_13_0_X_2023-01-12-1100+64cd62/54976/validateJR.html ?

I think it's just because I changed the type of the columns, while the content doesn't change. Perhaps @vlimant can help interpret the output of the comparison script?

> Not really related to this PR, but triggered by the Static Analyzer report of these tests: what is the purpose of this line in DataFormats/NanoAOD/src/classes.h?

I was actually wondering the same thing, curious to hear from @lgray.

@vlimant
Contributor

vlimant commented Jan 16, 2023

All "type"-related differences look harmless to me.

@perrotta
Contributor

+1

@aandvalenzuela
Contributor

Hello @swertz, @vlimant
This PR caused some compiler errors in the latest ROOT6 IB. Find a proposed solution in #40545. Could you please have a look?
Thanks!
