Add unit test for Run 3 Scouting formats #41834

wddgit · 2023-05-31T21:17:32Z

PR description:

Add a new unit test for the Run 3 Scouting data formats. This generates a data file containing Run 3 Scouting objects with known content. Then it reads it. It verifies that when we read it we obtain values that match the known written values for all the data fields in the objects. In particular all containers have content so all contained types are also read.

It also reads the old files in the DataFormats-Scouting data repository which are listed in the shell script. The plan is that each time a Run 3 Scouting data format is modified an additional file will added. The data files required for this PR were generated with releases 12_0_4 (2022 data) and 13_0_3 (2023 data).

There are 10 events with content and vector sizes varying from event to event.

PR validation:

This only adds a unit test in DataFormats/Scouting which passes. It shouldn't affect anything else.

Note that the test will only pass if the cms-data/DataFormats-Scouting#1 pull request is merged and that external update is used. Waiting to merge this PR until a couple days after the data PR is merged might avoid some syncing difficulties in IBs and PR tests.

cmsbuild · 2023-05-31T21:23:23Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41834/35730

This PR adds an extra 28KB to repository

cmsbuild · 2023-05-31T21:23:52Z

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

DataFormats/Scouting (core)

@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks.
@missirol, @rovere this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

wddgit · 2023-05-31T21:28:15Z

please test with cms-data/DataFormats-Scouting#1

cmsbuild · 2023-06-01T00:39:50Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32912/summary.html
COMMIT: 084a0b8
CMSSW: CMSSW_13_2_X_2023-05-31-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/41834/32912/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

@smuzaffar [UBSAN] fix zero size array runtime error #41771
@smuzaffar [RECONSTRUCTION-XPOG] clang-tidy Performance inefficient vector operation suggestion applied #41764
@bsunanda Run3-gex162 Use edm::LogVerbatim instead of cout in Geometry/RPCGeometry #41788
@aloeliger Add missing L1T phase 2 customizations #41780
@bsunanda Run3-242X Remove the source code of one obsolete AlCaReco producer: HcalCalIsolatedBunchFilter #41791
@smorovic [DAQ] improved file size/buffer size checks in DAQ input source (13_2_X) #41814
@mmusich Online DQM configure timeThreshold appropriately when running in 'offline' mode #41816
@VinInn Speedup HCAL Digitization #41820
@civanch Fix compilation warning for SimG4CMS/FP420 #41826
@smuzaffar [RECONSTRUCTION-UPGRADE] clang-tidy Performance inefficient vector opertion suggestion applied #41762

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32912/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32912/git-merge-result

Comparison Summary

Summary:

You potentially added 11 lines to the logs
Reco comparison results: 8 differences found in the comparisons
DQMHistoTests: Total files compared: 48
DQMHistoTests: Total histograms compared: 3221457
DQMHistoTests: Total failures: 10
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3221425
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
Checked 207 log files, 159 edm output root files, 48 DQM output files
TriggerResults: no differences found

wddgit · 2023-06-01T15:30:24Z

Continuing the conversation from #41631 (comment). We were discussing why it was necessary to add the call to clear the vector in the iorules for Run3ScoutingElectron when its format was changed.

I did a couple experiments with the new test in this PR. If I temporarily removed the call to clear() from the iorules for Run3ScoutingElectron and run the new test it still completes successfully. In this case it is reading with a module in cmsRun. It is not included in this PR, but I also did a separate test where I read the older format from 12_4_0 with FWLite in a script using the C++-like interface to FWLite. Again, the test passed even if I removed clear(). I didn't notice any difference, neither cmsRun nor FWLite needed to clear in my experiments...

I'm not worried about calling clear. Clearing an already empty vector shouldn't cause any problems. And I expect the performance cost would be negligible. I'm just a little worried why it was needed and I don't like it that we don't understand it... Maybe my test doesn't meet all the requirements to trigger the problem or maybe it is because the test where the problem was observed used the python interface or something is wrong in ROOT or maybe there is some other reason.

That is as far as I got with this. If you want me to pursue this further, let me know. Without taking a deep dive into ROOT or the python interface, I'm not sure what else I can do. Maybe someone else is already looking at this and I should just move on...

Note the new test in this PR does use 10 events with varying content (which was the suggestion that started this discussion).

makortel · 2023-06-01T18:22:47Z

DataFormats/Scouting/README.md

+* a file written by the same release
+* files written by (some) earlier releases
+
+If the persistent format of any Scouting data format gets changed in the future, please adjust the `TestReadRun3Scouting` and `TestWriteRun3Scouting` modules accordingly. It is important that every member container has some content in this test. Please also add a new file to the [https://github.com/cms-data/DataFormats-Scouting/](https://github.com/cms-data/DataFormats-Scouting/) repository, and update the `TestRun3ScoutingDataFormats` unit test to read the newly created file. The file name should contain the release or pre-release with which it was written. There will probably be analogous tests added in the future for run 2 and future runs which will need similar maintenance, although it is unlikely the run 2 formats will change anymore.


I'd update the text with the outcome of the discussion in cms-data/DataFormats-Scouting#1 (comment) . I'd also add here a comment along "if the latest file of Run 3 scouting before the update has not been used in data taking, the file can be deleted".

makortel · 2023-06-01T18:25:17Z

DataFormats/Scouting/test/TestRun3ScoutingFormats.sh

+# 12_4_0 was chosen because that release was used for 2022 data.
+# 13_0_3 is being used for 2023 data.


How about something along

Suggested change

# 12_4_0 was chosen because that release was used for 2022 data.

# 13_0_3 is being used for 2023 data.

# 12_4_0 was chosen because the 12_4_X release cycle was used for 2022 data, and the scouting data formats were not updated after 12_4_0.

# 13_0_3 was chosen because the 13_0_X release cycle was used for 2023 data, and the scouting data formats and ROOT were not updated after 13_0_3.

(in order to record the history a bit more accurately)?

makortel · 2023-06-01T18:27:30Z

DataFormats/Scouting/test/TestReadRun3Scouting.cc

+    // to check that values read from persistent storage match the values
+    // we know were written.
+
+    std::vector<double> expectedCaloJetsValues_;


Seems like all of these could be const

I made all the data members const in both modules.

makortel · 2023-06-01T18:28:11Z

DataFormats/Scouting/test/TestReadRun3Scouting.cc

+    if (expectedCaloJetsValues_.size() != 16) {
+      throwWithMessage("analyzeCaloJets, test configuration error, expectedCaloJetsValues must have size 16");
+    }


Looks like this check (also elsewhere where the pattern is used) could be moved to constructor.

I moved them to the constructor. Thanks.

cmsbuild · 2023-06-01T22:16:22Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41834/35750

This PR adds an extra 28KB to repository
Found files with invalid states:
- DataFormats/Scouting/test/create_Run3Scouting_test_file.py:
  - Added: 8503fc9
  - Modified: 21991a1
  - Deleted: e5ba4b3

wddgit · 2023-06-02T15:43:44Z

I just force pushed a few more changes. It only affected file TestWriteRun3Scouting.cc. There was an error in the exception message. I fixed that and also simplified it by adding a function to throw the exception which removed a little duplication. And I squashed it into the last commit.

cmsbuild · 2023-06-02T15:46:24Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41834/35758

This PR adds an extra 28KB to repository
Found files with invalid states:
- DataFormats/Scouting/test/create_Run3Scouting_test_file.py:
  - Added: 8503fc9
  - Modified: 21991a1
  - Deleted: 2844104

cmsbuild · 2023-06-02T15:46:49Z

Pull request #41834 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please check and sign again.

wddgit · 2023-06-02T15:49:46Z

please test with cms-data/DataFormats-Scouting#1

cmsbuild · 2023-06-02T19:10:07Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32953/summary.html
COMMIT: 2844104
CMSSW: CMSSW_13_2_X_2023-06-02-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/41834/32953/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

@eyigitba New EMTF BDT pT scale for 2023 #41835

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32953/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32953/git-merge-result

Comparison Summary

Summary:

You potentially added 16 lines to the logs
Reco comparison results: 6 differences found in the comparisons
DQMHistoTests: Total files compared: 48
DQMHistoTests: Total histograms compared: 3221457
DQMHistoTests: Total failures: 6
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3221429
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
Checked 207 log files, 159 edm output root files, 48 DQM output files
TriggerResults: no differences found

makortel · 2023-06-03T22:09:16Z

+core

cmsbuild · 2023-06-03T22:09:41Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

perrotta · 2023-06-04T08:11:41Z

DataFormats/Scouting/README.md

+* a file written by the same release
+* files written by (some) earlier releases
+
+If the persistent format of any Scouting data format gets changed in the future, please adjust the `TestReadRun3Scouting` and `TestWriteRun3Scouting` modules accordingly. It is important that every member container has some content in this test. Please also add a new file to the [https://github.com/cms-data/DataFormats-Scouting/](https://github.com/cms-data/DataFormats-Scouting/) repository, and update the `TestRun3ScoutingDataFormats` unit test to read the newly created file. The file name should contain the version numbers of the data format classes (from classes_def.xml) in alphabetical order and the release or pre-release with which it was written. If the latest file of Run 3 scouting before the update has not been used in data taking, the file can be deleted.


It wouldn't be bad to also rearrange classes_def.xml in order to have the classes listed there in alphabetical order, in order to avoid some head hache to who's trying to associate the versions in the name of the data file to the classes.

Thanks @perrotta for the suggestion, I agree a matching order in the classes_def.xml would help. @wddgit Could you take care of it (either as part of the Run2 scouting PR, or in a separate PR)?

I'll include it in the Run 2 Scouting PR which I am currently implementing. A good idea.

perrotta · 2023-06-04T08:17:52Z

+1

wddgit added 2 commits May 30, 2023 23:02

Implement Run3 Scouting test, CMSSW_12_4_0 version.

8503fc9

Run 3 Scouting Test CMSSW_13_0_3 version

21991a1

cmsbuild added this to the CMSSW_13_2_X milestone May 31, 2023

cmsbuild added core-pending pending-signatures tests-pending orp-pending code-checks-pending labels May 31, 2023

wddgit mentioned this pull request May 31, 2023

Add initial data files cms-data/DataFormats-Scouting#1

Merged

cmsbuild added code-checks-approved and removed code-checks-pending labels May 31, 2023

cmsbuild added tests-started requires-external and removed tests-pending labels May 31, 2023

cmsbuild added tests-approved and removed tests-started labels Jun 1, 2023

makortel reviewed Jun 1, 2023

View reviewed changes

wddgit force-pushed the run3ScoutingUnitTest branch from 084a0b8 to e5ba4b3 Compare June 1, 2023 22:10

cmsbuild added tests-pending code-checks-pending and removed tests-approved requires-external code-checks-approved labels Jun 1, 2023

cmsbuild removed the code-checks-pending label Jun 1, 2023

cmsbuild added the code-checks-pending label Jun 2, 2023

cmsbuild added code-checks-approved and removed code-checks-pending labels Jun 2, 2023

cmsbuild added tests-started requires-external and removed tests-pending labels Jun 2, 2023

cmsbuild added tests-approved and removed tests-started labels Jun 2, 2023

makortel mentioned this pull request Jun 3, 2023

For Run3 Scouting classes do similar test as for RAW cms-sw/framework-team#544

Closed

cmsbuild added core-approved fully-signed and removed core-pending pending-signatures labels Jun 3, 2023

perrotta reviewed Jun 4, 2023

View reviewed changes

cmsbuild added orp-approved and removed orp-pending labels Jun 4, 2023

cmsbuild merged commit ea94ee8 into cms-sw:master Jun 4, 2023

cmsbuild mentioned this pull request Jun 4, 2023

Utilities/StaticAnalyzers: updates to address changes in number of reports with llvm 14/16 #41861

Merged

wddgit mentioned this pull request Jun 14, 2023

Remove obsolete Scouting Data Format unit tests #41964

Merged

makortel mentioned this pull request Jun 14, 2023

Add tests or workflows that read existing scouting data #41040

Closed

wddgit deleted the run3ScoutingUnitTest branch November 17, 2023 23:20

mmusich mentioned this pull request Sep 5, 2024

Failure in reading back 12_4_HLT_X Run3ScoutingVertex collection from CMSSW_14_0_X #45888

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add unit test for Run 3 Scouting formats #41834

Add unit test for Run 3 Scouting formats #41834

wddgit commented May 31, 2023

cmsbuild commented May 31, 2023

cmsbuild commented May 31, 2023

wddgit commented May 31, 2023

cmsbuild commented Jun 1, 2023

wddgit commented Jun 1, 2023

makortel Jun 1, 2023

wddgit Jun 1, 2023

makortel Jun 1, 2023

wddgit Jun 1, 2023

makortel Jun 1, 2023

wddgit Jun 1, 2023

makortel Jun 1, 2023

wddgit Jun 1, 2023

cmsbuild commented Jun 1, 2023

wddgit commented Jun 2, 2023

cmsbuild commented Jun 2, 2023

cmsbuild commented Jun 2, 2023

wddgit commented Jun 2, 2023

cmsbuild commented Jun 2, 2023

makortel commented Jun 3, 2023

cmsbuild commented Jun 3, 2023

perrotta Jun 4, 2023

makortel Jun 5, 2023

wddgit Jun 5, 2023

perrotta commented Jun 4, 2023

		# 12_4_0 was chosen because that release was used for 2022 data.
		# 13_0_3 is being used for 2023 data.

Add unit test for Run 3 Scouting formats #41834

Add unit test for Run 3 Scouting formats #41834

Conversation

wddgit commented May 31, 2023

PR description:

PR validation:

cmsbuild commented May 31, 2023

cmsbuild commented May 31, 2023

wddgit commented May 31, 2023

cmsbuild commented Jun 1, 2023

Comparison Summary

wddgit commented Jun 1, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cmsbuild commented Jun 1, 2023

wddgit commented Jun 2, 2023

cmsbuild commented Jun 2, 2023

cmsbuild commented Jun 2, 2023

wddgit commented Jun 2, 2023

cmsbuild commented Jun 2, 2023

Comparison Summary

makortel commented Jun 3, 2023

cmsbuild commented Jun 3, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

perrotta commented Jun 4, 2023