-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unit test for Run 3 Scouting formats #41834
Conversation
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41834/35730
|
A new Pull Request was created by @wddgit (W. David Dagenhart) for master. It involves the following packages:
@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test with cms-data/DataFormats-Scouting#1 |
Continuing the conversation from #41631 (comment). We were discussing why it was necessary to add the call to clear the vector in the iorules for Run3ScoutingElectron when its format was changed. I did a couple experiments with the new test in this PR. If I temporarily removed the call to clear() from the iorules for Run3ScoutingElectron and run the new test it still completes successfully. In this case it is reading with a module in cmsRun. It is not included in this PR, but I also did a separate test where I read the older format from 12_4_0 with FWLite in a script using the C++-like interface to FWLite. Again, the test passed even if I removed clear(). I didn't notice any difference, neither cmsRun nor FWLite needed to clear in my experiments... I'm not worried about calling clear. Clearing an already empty vector shouldn't cause any problems. And I expect the performance cost would be negligible. I'm just a little worried why it was needed and I don't like it that we don't understand it... Maybe my test doesn't meet all the requirements to trigger the problem or maybe it is because the test where the problem was observed used the python interface or something is wrong in ROOT or maybe there is some other reason. That is as far as I got with this. If you want me to pursue this further, let me know. Without taking a deep dive into ROOT or the python interface, I'm not sure what else I can do. Maybe someone else is already looking at this and I should just move on... Note the new test in this PR does use 10 events with varying content (which was the suggestion that started this discussion). |
DataFormats/Scouting/README.md
Outdated
* a file written by the same release | ||
* files written by (some) earlier releases | ||
|
||
If the persistent format of any Scouting data format gets changed in the future, please adjust the `TestReadRun3Scouting` and `TestWriteRun3Scouting` modules accordingly. It is important that every member container has some content in this test. Please also add a new file to the [https://github.com/cms-data/DataFormats-Scouting/](https://github.com/cms-data/DataFormats-Scouting/) repository, and update the `TestRun3ScoutingDataFormats` unit test to read the newly created file. The file name should contain the release or pre-release with which it was written. There will probably be analogous tests added in the future for run 2 and future runs which will need similar maintenance, although it is unlikely the run 2 formats will change anymore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd update the text with the outcome of the discussion in cms-data/DataFormats-Scouting#1 (comment) . I'd also add here a comment along "if the latest file of Run 3 scouting before the update has not been used in data taking, the file can be deleted".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
# 12_4_0 was chosen because that release was used for 2022 data. | ||
# 13_0_3 is being used for 2023 data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something along
# 12_4_0 was chosen because that release was used for 2022 data. | |
# 13_0_3 is being used for 2023 data. | |
# 12_4_0 was chosen because the 12_4_X release cycle was used for 2022 data, and the scouting data formats were not updated after 12_4_0. | |
# 13_0_3 was chosen because the 13_0_X release cycle was used for 2023 data, and the scouting data formats and ROOT were not updated after 13_0_3. |
(in order to record the history a bit more accurately)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// to check that values read from persistent storage match the values | ||
// we know were written. | ||
|
||
std::vector<double> expectedCaloJetsValues_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like all of these could be const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made all the data members const in both modules.
if (expectedCaloJetsValues_.size() != 16) { | ||
throwWithMessage("analyzeCaloJets, test configuration error, expectedCaloJetsValues must have size 16"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this check (also elsewhere where the pattern is used) could be moved to constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved them to the constructor. Thanks.
084a0b8
to
e5ba4b3
Compare
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41834/35750 |
I just force pushed a few more changes. It only affected file TestWriteRun3Scouting.cc. There was an error in the exception message. I fixed that and also simplified it by adding a function to throw the exception which removed a little duplication. And I squashed it into the last commit. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41834/35758 |
Pull request #41834 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please check and sign again. |
please test with cms-data/DataFormats-Scouting#1 |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a96aee/32953/summary.html The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic: You can see more details here: Comparison SummarySummary:
|
+core |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
* a file written by the same release | ||
* files written by (some) earlier releases | ||
|
||
If the persistent format of any Scouting data format gets changed in the future, please adjust the `TestReadRun3Scouting` and `TestWriteRun3Scouting` modules accordingly. It is important that every member container has some content in this test. Please also add a new file to the [https://github.com/cms-data/DataFormats-Scouting/](https://github.com/cms-data/DataFormats-Scouting/) repository, and update the `TestRun3ScoutingDataFormats` unit test to read the newly created file. The file name should contain the version numbers of the data format classes (from classes_def.xml) in alphabetical order and the release or pre-release with which it was written. If the latest file of Run 3 scouting before the update has not been used in data taking, the file can be deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn't be bad to also rearrange classes_def.xml in order to have the classes listed there in alphabetical order, in order to avoid some head hache to who's trying to associate the versions in the name of the data file to the classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll include it in the Run 2 Scouting PR which I am currently implementing. A good idea.
+1 |
PR description:
Add a new unit test for the Run 3 Scouting data formats. This generates a data file containing Run 3 Scouting objects with known content. Then it reads it. It verifies that when we read it we obtain values that match the known written values for all the data fields in the objects. In particular all containers have content so all contained types are also read.
It also reads the old files in the DataFormats-Scouting data repository which are listed in the shell script. The plan is that each time a Run 3 Scouting data format is modified an additional file will added. The data files required for this PR were generated with releases 12_0_4 (2022 data) and 13_0_3 (2023 data).
There are 10 events with content and vector sizes varying from event to event.
PR validation:
This only adds a unit test in DataFormats/Scouting which passes. It shouldn't affect anything else.
Note that the test will only pass if the cms-data/DataFormats-Scouting#1 pull request is merged and that external update is used. Waiting to merge this PR until a couple days after the data PR is merged might avoid some syncing difficulties in IBs and PR tests.