Integration of extended ParticleNet trainings for simultaneous jet flavor tagging, pT regression, and tau ID + reconstruction #40745

Merged: 25 commits, Apr 3, 2023

@scooperstein scooperstein commented Feb 10, 2023

This PR adds a set of ParticleNet-based networks and runs inference with them on the relevant jet collections. These networks significantly extend the ParticleNet jet flavor classifiers included in CMSSW so far. The new AK4 network simultaneously performs jet flavor classification, jet pT regression, and hadronic tau ID and reconstruction (charge, decay mode). The new AK8 network performs jet classification and mass regression simultaneously and, for the first time, also includes resonance decays to merged hadronic taus.

The AK4 networks are split by jet eta (central vs. forward) and by jet source (CHS vs. PUPPI). The CHS training is needed for studies of using ParticleNet for hadronic tau ID and reconstruction (TAU), while the PUPPI training is intended for the jet classification and pT regression (JME, BTV). We intend to unify these tasks to run on only one jet source (PUPPI) in the future, but this solution was agreed upon with the L2 POG conveners in the interim because current PUPPI tunes are inefficient for seeding hadronic taus. The networks are also split by jet eta because the scope of the forward network is highly reduced, allowing for a lighter and faster network architecture for the forward jet inferences. The AK8 network takes PUPPI jets as a source and is only inferred for central AK8 jets.
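The split described above can be sketched as a small lookup. This is only an illustration of the logic, not the code added by this PR: the names `NetworkChoice` and `choose_network` are hypothetical, and the central/forward |eta| boundary is not stated in this PR, so the value below is a placeholder assumption.

```python
from dataclasses import dataclass
from typing import Optional

# ASSUMPTION: the PR text does not give the central/forward boundary.
# 2.5 is a placeholder used only to make the sketch runnable.
HYPOTHETICAL_CENTRAL_ETA_MAX = 2.5


@dataclass(frozen=True)
class NetworkChoice:
    jet_type: str  # "AK4" or "AK8"
    source: str    # "CHS" or "PUPPI"
    region: str    # "central" or "forward"


def choose_network(jet_type: str, source: str, eta: float,
                   central_eta_max: float = HYPOTHETICAL_CENTRAL_ETA_MAX
                   ) -> Optional[NetworkChoice]:
    """Return which training would apply to a jet, or None if none is run."""
    region = "central" if abs(eta) < central_eta_max else "forward"
    if jet_type == "AK4":
        # AK4 trainings exist per source (CHS, PUPPI) and per eta region.
        if source not in ("CHS", "PUPPI"):
            raise ValueError(f"unknown AK4 jet source: {source}")
        return NetworkChoice("AK4", source, region)
    if jet_type == "AK8":
        # The AK8 network takes PUPPI jets and is evaluated for central jets only.
        if source != "PUPPI" or region != "central":
            return None
        return NetworkChoice("AK8", "PUPPI", "central")
    raise ValueError(f"unknown jet type: {jet_type}")


print(choose_network("AK4", "CHS", 0.3))
print(choose_network("AK4", "PUPPI", -3.1))
print(choose_network("AK8", "PUPPI", 3.0))
```

With the placeholder boundary, the first call selects the central CHS training (the one intended for the TAU studies), the second the forward PUPPI training, and the third returns None because no AK8 network is evaluated for forward jets.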

Note that the input features for these networks depend on miniAOD input collections, so the networks can only be evaluated on miniAOD.

This PR depends on the model files committed in cms-data/RecoBTag-Combined#50. It has been tested by running the added tasks on a single tt file and checking the outputs of the network evaluations.

Additional information on these networks is available in recent presentations:
at JME: https://indico.cern.ch/event/1220368/contributions/5141019/attachments/2548768/4389765/cooperstein_ParticleNet_nov162022.pdf
and at the Tau POG: https://indico.cern.ch/event/1223165/contributions/5158297/attachments/2561047/4414677/cooperstein_ParticleNetPlusTau_dec62022.pdf
In addition, an analysis note AN-22-094 (http://cms.cern.ch/iCMS/jsp/openfile.jsp?tp=draft&files=AN2022_094_v2.pdf) describes these networks in detail.

@rgerosa

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40745/34157

  • This PR adds an extra 20KB to repository

  • Found files with invalid states:

    • RecoBTag/ONNXRuntime/python/pfParticleNetFromMiniAOD_cff.py:

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@mandrenguyen
Contributor

@scooperstein As it says above, you need to apply the code format patch with scram b code-format

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40745/34184

  • This PR adds an extra 16KB to repository

  • Found files with invalid states:

    • RecoBTag/ONNXRuntime/python/pfParticleNetFromMiniAOD_cff.py:

@cmsbuild
Contributor

A new Pull Request was created by @scooperstein for master.

It involves the following packages:

  • RecoBTag/FeatureTools (reconstruction)
  • RecoBTag/ONNXRuntime (reconstruction)

@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks.
@AlexDeMoor, @emilbols, @JyothsnaKomaragiri, @AnnikaStein, @missirol, @hqucms, @andrzejnovak, @demuller this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@scooperstein
Author

> @scooperstein As it says above, you need to apply the code format patch with scram b code-format

Hi @mandrenguyen, yes sorry about that. I ran the code formatting and I have updated the PR.

@mandrenguyen
Contributor

test parameters:
pull_request = cms-data/RecoBTag-Combined#50

@mandrenguyen
Contributor

@scooperstein can you please update the PR title to be a bit more informative?

@scooperstein changed the title from "Particlenet cmssw130" to "Integration of extended ParticleNet trainings for simultaneous jet flavor tagging, pT regression, and tau ID + reconstruction" on Feb 15, 2023
@scooperstein
Author

scooperstein commented Feb 15, 2023

> @scooperstein can you please update the PR title to be a bit more informative?

Done, hopefully this is better :) Just to add, if it is useful I can also add links to various presentations outlining the object performance deliverables from these trainings. The implementation of these networks has also been given the go-ahead by the Tau POG and JME conveners.

@mandrenguyen
Contributor

@scooperstein Yes, please link any material in the PR description. It doesn't appear that this code is actually being executed in any workflow. Shouldn't we test this?

@mandrenguyen
Contributor

type jetmet, btv

@cmsbuild
Contributor

Pull request #40745 was updated. @swertz, @vlimant, @clacaputo, @cmsbuild, @simonepigazzini, @mandrenguyen can you please check and sign again.

@swertz
Contributor

swertz commented Mar 27, 2023

please test

Another test after rebasing, to sort out these differences in jet variables

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-470863/31621/summary.html
COMMIT: 55daf9d
CMSSW: CMSSW_13_1_X_2023-03-27-1100/el8_amd64_gcc11
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40745/31621/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-470863/31621/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-470863/31621/git-merge-result

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 6285 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3554092
  • DQMHistoTests: Total failures: 113
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3553957
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 42.804 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 11834.0,... ): 5.188 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 13234.0,... ): 3.244 KiB Physics/NanoAODDQM
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • You potentially added 59 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 159 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 10737
  • DQMHistoTests: Total failures: 169
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 10568
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 47.56600000000001 KiB( 10 files compared)
  • DQMHistoSizes: changed ( 2500.311,... ): 7.900 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.331,... ): 5.956 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.401,... ): 0.228 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.511 ): 0.354 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.601 ): 5.188 KiB Physics/NanoAODDQM
  • Checked 23 log files, 10 edm output root files, 11 DQM output files
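As a quick sanity check on the two DQMHistoTests summaries above, the counts can be tallied as follows. This is a minimal sketch: the class `DQMHistoTally` is a hypothetical helper, not part of cms-bot, and it assumes that failures, nulls, successes, and skipped histograms together make up the compared total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQMHistoTally:
    compared: int
    failures: int
    nulls: int
    successes: int
    skipped: int

    def is_consistent(self) -> bool:
        # ASSUMPTION: the four categories partition the compared histograms.
        return (self.failures + self.nulls + self.successes
                + self.skipped) == self.compared

    def failure_fraction(self) -> float:
        return self.failures / self.compared


# Numbers copied from the two comparison summaries in this comment.
reco = DQMHistoTally(compared=3554092, failures=113, nulls=0,
                     successes=3553957, skipped=22)
nano = DQMHistoTally(compared=10737, failures=169, nulls=0,
                     successes=10568, skipped=0)

for name, tally in (("reco", reco), ("nano", nano)):
    print(f"{name}: consistent={tally.is_consistent()}, "
          f"failure fraction={tally.failure_fraction():.2e}")
```

Both tallies add up under that assumption, and the failure fraction is much larger in the NANO comparison than in the main one.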

Nano size comparison Summary:

| Sample | kb/ev | ref kb/ev | diff kb/ev | ev/s/thd | ref ev/s/thd | diff rate | mem/thd | ref mem/thd |
|---|---|---|---|---|---|---|---|---|
| 2500.31 | 2.333 | 2.224 | 0.109 ( +4.9% ) | 5.76 | 9.63 | -40.2% | 1.568 | 1.485 |
| 2500.311 | 2.440 | 2.323 | 0.117 ( +5.0% ) | 5.16 | 9.25 | -44.3% | 1.941 | 1.859 |
| 2500.312 | 2.391 | 2.277 | 0.114 ( +5.0% ) | 5.33 | 9.36 | -43.1% | 1.933 | 1.843 |
| 2500.33 | 1.172 | 1.099 | 0.072 ( +6.6% ) | 10.71 | 21.03 | -49.0% | 1.697 | 1.663 |
| 2500.331 | 1.508 | 1.394 | 0.114 ( +8.2% ) | 5.88 | 16.30 | -63.9% | 1.858 | 1.817 |
| 2500.332 | 1.409 | 1.326 | 0.083 ( +6.3% ) | 8.30 | 17.95 | -53.8% | 1.895 | 1.865 |
| 2500.401 | 2.137 | 2.138 | -0.001 ( -0.1% ) | 10.32 | 10.53 | -2.0% | 1.240 | 1.179 |
| 2500.501 | 1.710 | 1.711 | -0.002 ( -0.1% ) | 16.72 | 16.79 | -0.4% | 1.152 | 1.111 |
| 2500.511 | 1.122 | 1.124 | -0.002 ( -0.2% ) | 29.82 | 30.79 | -3.1% | 1.396 | 1.350 |
| 2500.601 | 2.038 | 2.050 | -0.012 ( -0.6% ) | 12.57 | 12.52 | +0.4% | 1.220 | 1.160 |
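The percentage columns in the size comparison above appear to be changes relative to the reference value. That definition is inferred from the numbers, not stated by the bot, and `relative_change_percent` is a hypothetical helper. A minimal sketch that reproduces two of the rows:

```python
def relative_change_percent(new: float, ref: float) -> float:
    """Percent change of `new` relative to `ref` (assumed definition)."""
    return 100.0 * (new - ref) / ref


# Values copied from the rows for workflows 2500.31 and 2500.331.
rows = {
    "2500.31": {"kb_ev": 2.333, "ref_kb_ev": 2.224,
                "rate": 5.76, "ref_rate": 9.63},
    "2500.331": {"kb_ev": 1.508, "ref_kb_ev": 1.394,
                 "rate": 5.88, "ref_rate": 16.30},
}

for sample, r in rows.items():
    size = relative_change_percent(r["kb_ev"], r["ref_kb_ev"])
    rate = relative_change_percent(r["rate"], r["ref_rate"])
    print(f"{sample}: size {size:+.1f}%, rate {rate:+.1f}%")
# prints:
# 2500.31: size +4.9%, rate -40.2%
# 2500.331: size +8.2%, rate -63.9%
```

Read this way, the workflows that pick up the new outputs grow by roughly 5 to 8% in event size, while their throughput per thread drops by 40 to 64%.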

@swertz
Contributor

swertz commented Mar 28, 2023

+xpog

All the changes in jet and fatJet discriminators are understood.

We think the changes flagged for other jet variables are spurious.

@mandrenguyen
Contributor

+reconstruction
ParticleNet output added to mini

@cmsbuild
Contributor

cmsbuild commented Apr 3, 2023

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Contributor

+1

@smuzaffar
Contributor

This required cms-data/RecoBTag-Combined#50 to go in too.
