Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unit test for TriggerResults format #41395

Merged
merged 1 commit into from
Apr 29, 2023

Conversation

wddgit
Copy link
Contributor

@wddgit wddgit commented Apr 24, 2023

PR description:

Add a new unit test for the TriggerResults data format. This generates a data file containing a TriggerResults object with known content. Then it reads it. It verifies that when we read it we obtain values that match the known written values for all the data fields in the object. In particular all containers have content so all contained types are also read.

It also reads the old files in the DataFormats/Common data repository which are listed in the shell script. The plan is that each time the data format of TriggerResults is modified a file will added.

PR validation:

This only adds a unit test in DataFormats/Common which passes. It shouldn't affect anything else.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41395/35262

  • This PR adds an extra 36KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

  • DataFormats/Common (core)

@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks.
@makortel, @missirol, @rovere this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@wddgit
Copy link
Contributor Author

wddgit commented Apr 24, 2023

please test

@wddgit
Copy link
Contributor Author

wddgit commented Apr 24, 2023

A couple of comments on related history:

The TriggerResults format has not changed since 2006. In 2014 something in ROOT changed that affected the checksum and we had to add a new version number in classes_def.html.

We have an existing unit test that is similar to this new one. It is in IOPool/Input/test/TestPoolInput.sh. See this line and what follows it: "for file in ${IOPoolInputData}/IOPool/Input/data/raw*.root". It reads an old existing file with RAW and then writes a new file. A second process reads the new converted file. There is not an analyzer that reads content, just PoolSource and PoolOutputModule. Here is the content of the old file:

Singularity> edmFileUtil file:/cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-1100/external/el8_amd64_gcc11/data/IOPool/Input/data/rawData_old_format_CMSSW_4_2_8.root -P
file:/cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-1100/external/el8_amd64_gcc11/data/IOPool/Input/data/rawData_old_format_CMSSW_4_2_8.root
file:/cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-1100/external/el8_amd64_gcc11/data/IOPool/Input/data/rawData_old_format_CMSSW_4_2_8.root (1 runs, 1 lumis, 1 events, 763640 bytes)
Branch 0 of Events tree: EventAuxiliary Total size = 757 Compressed size = 232
Branch 1 of Events tree: EventBranchEntryInfo Total size = 898 Compressed size = 209
Branch 2 of Events tree: EventSelections Total size = 681 Compressed size = 152
Branch 3 of Events tree: BranchListIndexes Total size = 647 Compressed size = 110
Branch 4 of Events tree: FEDRawDataCollection_source__HLT. Total size = 372335 Compressed size = 137624
Branch 5 of Events tree: L1GlobalTriggerObjectMapRecord_hltL1GtObjectMap__HLT. Total size = 24654 Compressed size = 3622
Branch 6 of Events tree: edmTriggerResults_TriggerResults__HLT. Total size = 6183 Compressed size = 640
Branch 7 of Events tree: triggerTriggerEvent_hltTriggerSummaryAOD__HLT. Total size = 15629 Compressed size = 4894

I think the intent was that new versions of this file would be added if RAW formats changed. That never happened, there is only the CMSSW_4_2_8 file. Although only the RAW format L1GlobalTriggerObjectMapRecord changed. It is a completely different type now apparently. The other formats have not changed.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-503e27/32113/summary.html
COMMIT: 06b1d2d
CMSSW: CMSSW_13_1_X_2023-04-24-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/41395/32113/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 8 lines to the logs
  • Reco comparison results: 12 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460685
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460657
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

Copy link
Contributor

@makortel makortel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wddgit, overall looks good. Could you also create a DataFormats/Common/README.md with content along

DataFormats/Common

edm::TriggerResults

The edm::TriggerResults is part of the RAW data, and any changes must be backwards compatible. In order to ensure it can be read by all future CMSSW releases, there is a TestTriggerResultsFormat unit test, which makes use of TestReadTriggerResults analyzer and TestWriteTriggerResults producer. The unit test checks that the object can be read properly from

  • a file in the same release as it was written
  • files written by (some) earlier releases can be read

If the edm::TriggerResults gets changed in the future, please adjust the TestReadTriggerResults and TestWriteTriggerResults modules accordingly. It is important that every member container has some content in this test. Please also add a new file to https://github.com/cms-data/DataFormats-Common/ repository, and update the TestTriggerResultsFormat unit test to read the newly created file. The file name should contain the release or pre-release with which it was written.

@Dr15Jones Do you have comments on the README contents?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd record the exact release version in the file name as things can, in principle, change within a release cycle. I think we should also prefer (pre-)releases over IBs.

I'd also try to include a file with e.g. 13_0_0 so that we'd have a test file affected by #41348.

Comment on lines 13 to 17
DataFormatsCommonData=$CMSSW_BASE/src
for dir in $(echo $CMSSW_SEARCH_PATH | tr : '\n') ; do
if [ -d ${dir}/DataFormats/Common/data ] ; then
DataFormatsCommonData=${dir}
break
fi
done

for file in ${DataFormatsCommonData}/DataFormats/Common/data/oldTestTriggerResults*.root
do
cmsRun ${LOCAL_TEST_DIR}/test_readTriggerResults_cfg.py "$file" || die "Failed to read old file $file" $?
done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about listing the expected files explicitly here and using (recently added) edmFileInPath script to find the full path of DataFormats/Common/data/<file.root>? Then we'd see also failures if any of the files would be missing for any (e.g. infrastructural) reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. That is better.

@makortel
Copy link
Contributor

@smuzaffar Could we get a https://github.com/cms-data/DataFormats-Common/ repository to store the root file(s)?

@smuzaffar
Copy link
Contributor

https://github.com/cms-data/DataFormats-Common is available now

@wddgit
Copy link
Contributor Author

wddgit commented Apr 25, 2023

@smuzaffar Thanks for the new repo. What is the procedure to get a data file into that repo? Do I submit a PR against that repo or is there some other way to do that?

@smuzaffar
Copy link
Contributor

@wddgit , yes just open a PR for that repo then just test this PR along with cms-data one e.g. please test with cms-data/DataFormats-Common#1 should be enough. As @makortel wrote you can use edmFileInPath DataFormats/Common/data/<filename> to get the full path of the data file

@wddgit
Copy link
Contributor Author

wddgit commented Apr 25, 2023

I think the commit I just pushed addresses all the comments. I squashed it.

The shell script changed significantly. The data file was removed. The documentation file was added. The rest of it remained exactly the same as before.

@smuzaffar
Copy link
Contributor

test parameters:

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41395/35273

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

Pull request #41395 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please check and sign again.

@wddgit
Copy link
Contributor Author

wddgit commented Apr 25, 2023

please test with cms-data/DataFormats-Common#1

Note the unit test will fail unless merged at the same time as or after the data file PR.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-503e27/32143/summary.html
COMMIT: cc027c6
CMSSW: CMSSW_13_1_X_2023-04-25-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/41395/32143/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-503e27/32143/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-503e27/32143/git-merge-result

Comparison Summary

Summary:

  • You potentially added 6 lines to the logs
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460685
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460657
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@wddgit
Copy link
Contributor Author

wddgit commented Apr 26, 2023

Just for the record, I wanted to make the comment that the new test added by this PR will not be very good at detecting the problem discussed in issues #41348 and #41246. For it to detect that particular problem, a test input file would have to be written using a release with that problem (note that particular problem was associated with an internal ROOT change and not a data format change) and then later (potentially years later) if the persistent format of TriggerResults changed, then this test would detect the problem.

That said, that problem motivated us to think about RAW data format testing and implement the improvements in this PR. Given that we are supposed to guarantee backward compatibility for RAW data formats forever and the importance of RAW data, the new test in this PR is a logical unit test to have and run.

@makortel
Copy link
Contributor

+core

Needs cms-data/DataFormats-Common#1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

@cmsbuild cmsbuild merged commit 95b4d71 into cms-sw:master Apr 29, 2023
@wddgit wddgit deleted the testTriggerResults branch August 25, 2023 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants