-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unit test for trigger::TriggerEvent format #41631
Conversation
@cmsbuild , ping |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41631/35518
|
A new Pull Request was created by @wddgit (W. David Dagenhart) for master. It involves the following packages:
@cmsbuild, @missirol, @Martin-Grunewald can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test with cms-data/DataFormats-HLTReco#1 |
Note this PR was motivated by discussions after the problems discussed in issues #41246 and #41348. It does not actually detected those problems very well, but as we discussed them we concluded we needed better tests of raw data formats if we we continue to guarantee that those formats would be backward compatible forever. I already implemented similar tests for the other 3 raw data formats (see #41565, #41494, and #4139) recently. I'll continue and consider adding some for Scouting formats next. I'm looking at what already exists for Scouting at the moment. |
FYI @makortel |
Thanks, @wddgit. Due to #41040, a test was added for Scouting in #41093, but the idea was to replace it with a test in the style of the ones you are implementing (see #41040 (comment)). |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-bee537/32565/summary.html The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: Comparison SummarySummary:
|
+hlt |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
As usual, I have a naive question (not specific to this PR): would it help to have these unit tests run on more than 1 single event ? My concern comes from #41025 (comment), where we spotted an issue with the initial implementation of an ioread rule only after we tested on more than 1 event. |
@missirol can you also sign cms-data/DataFormats-HLTReco#1 ? |
+1 |
Why did reading more than 1 event help in that case? One possibility I can imagine is that some container is empty on most events and using many events ensured we hit an event where the container wasn't empty. This test guarantees all containers have content on the first event. So it wouldn't help in that case. Or was was it something else? Or we don't know why? |
Unfortunately, I don't have much insight (that's why I'm asking). More expert-level comments were in #41025 (comment). I took
appears in the 7th event, which is the 3rd event in which that collection is not empty. If I had done that check on less events, I would have missed that the Maybe this is irrelevant for this PR, and I should have asked elsewhere. I'm just trying to understand better. [1] ${CMSSW_BASE}/src/DataFormats/Scouting/test/scoutingCollectionsDumper.py -k Run3Scouting -i /eos/cms/store/user/cmsbuild/store/mc/Run3Winter21DRMiniAOD/VBFHToTauTau_M125_TuneCP5_14TeV-powheg-pythia8/GEN-SIM-DIGI-RAW/FlatPU30to80FEVT_112X_mcRun3_2021_realistic_v16-v1/270000/005b56c1-0107-46b3-9740-1c6efc559295.root.unused -v -1 -n 100 [2] cmssw41025_logs.tar.gz |
Hello,
It fails before running the Many thanks, |
It is failing to find a data file that was recently added to the cms-data/DataFormats-HLTReco repository. The regular IBs are finding the file, so it should be there and functional. Is there something different about how the ASAN IBs find files? This doesn't seem to me to be an issue with the code in the PR. Anyone know how this works? The data file was added to the data repository at approximately the same time as the PR with the new unit test. It is possible it is a timing issue and this will just work with the next ASAN IB, but I don't how this works. Does someone need to do something to update the external dependency to the new version of the data repository? |
@aandvalenzuela @wddgit this unit test was failing also in the normail IB in 2023-05-12-2300 IBs. |
Maybe in the future as I do more of these, I should let the PR with the data files get merged first. Then wait a few days, then submit the other one instead of submitting them together... |
@Dr15Jones @missirol @makortel Jumping back to the comment from @missirol 5 days ago. My understanding is this: The clear is needed because the FWLite test is reusing the objects and the schema evolution code in ROOT doesn't do that clear for new data members (a little scary for FWLite...). The clear isn't necessary in a cmsRun job because it makes a new object every event. Processing more than one event wouldn't help detect this issue in the new tests I am writing because these tests use cmsRun. If I wanted to detect this I would need to use FWLite (or something with similar behavior) in the read step of the test in addition to processing more than 1 event. Do I understand correctly? |
What I understood in #41025 (comment) is that the way the schema was evolved in the PR, the Only if the schema would have been evolved like I speculated in #41025 (comment), the I think @missirol has a point, and the tests would be able cover more potential problems if they would have 2 events with different content (thanks for pointing it out!). |
So I looked again. This is the relevant line of code: https://cmssdt.cern.ch/dxr/CMSSW/source/IOPool/Input/src/RootDelayedReader.cc#86 For cmsRun, it appears to default construct a new object every time. All the vectors would already be empty and clear would do nothing. For cmsRun it isn't needed. I didn't test this. I could experiment if you are not convinced. Possibly I am missing something. @missirol Did you see the error in a cmsRun process reading the objects? For FWLite the events wouldn't need to have different content to have the problem. A vector would just keep adding more elements each time a new event was read and the vector would grow and grow without a clear call in the iorule. |
In #41025 (comment) I understood that the actual |
I'll try some experiments. I've got to write code similar to that to write the scouting tests anyway. Maybe you are right. My understanding of what ROOT is doing internally is not very good. I'll let you know what I find. |
I think the answer is 'no'.
Overall, I have a feeling I introduced confusion (if so, I apologise). I understand only now that FWLite and [1] |
@missirol Thanks for the clarification. It will take a few days for me to work through and implement these new tests. Let's continue this conversation in that PR when it is submitted and I've experimented a bit with it. |
PR description:
Add a new unit test for the trigger::TriggerEvent raw data format. This generates a data file containing a trigger::TriggerEvent object with known content. Then it reads it. It verifies that when we read it we obtain values that match the known written values for all the data fields in the object. In particular all containers have content so all contained types are also read.
It also reads the old files in the DataFormats/HLTReco data repository which are listed in the shell script. The plan is that each time the data format of trigger::TriggerEvent is modified a file will added.
PR validation:
This only adds a unit test in DataFormats/HLTReco which passes. It shouldn't affect anything else.