[14_0_X] Introduce Unified Particle Transformer AK4 jet tagger (Backport) #44660

Merged


AlexDeMoor
Contributor

PR description:

This PR introduces UnifiedParticleTransformerAK4, a novel inclusive jet tagger. The network performs inclusive tagging, combining b/c/tau/lepton tagging with jet energy regression (both the energy regression itself and the resolution estimate, obtained via quantile regression). The model is trained with a dedicated robust training scheme that combines improved adversarial training and domain adaptation to reduce the impact of data/MC disagreement on the final performance. The domain output nodes are also kept, so that efficiency mapping and its impact can be explored later.
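For readers unfamiliar with quantile regression, the idea behind the resolution estimate can be illustrated with the standard pinball loss. This is only a minimal sketch, not the training code of this PR: the function names, the 0.16/0.84 quantile levels in the usage note, and the half-width resolution proxy are assumptions made for illustration.

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for one prediction at quantile level q in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def mean_pinball_loss(targets, predictions, q):
    """Average pinball loss over paired targets and predictions."""
    losses = [pinball_loss(t, p, q) for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)


def resolution_from_quantiles(q_low, q_high):
    """Half-width of the predicted interval (illustrative resolution proxy)."""
    return 0.5 * (q_high - q_low)
```

Minimising this loss at q = 0.16 and q = 0.84 would give two outputs that bracket roughly the central 68% of the target distribution, and half of their difference can then serve as a per-jet resolution estimate. Whether the model uses exactly these quantile levels is not stated in this PR.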

An overview of the method can be seen at: https://indico.cern.ch/event/1368069/contributions/5793148/
A focus on the novel adversarial training is described here: https://indico.cern.ch/event/1372038/#3-adversarial-training-for-par
The preliminary results of the model were shown in the following meeting: https://indico.cern.ch/event/1397392/#17-preliminary-results-of-part
The final results will be shared this Monday: https://indico.cern.ch/event/1403350/#3-part-2024-final-results-and

This PR requires the associated ONNX model, which has been submitted to the corresponding RecoBTag-Combined repository: cms-data/RecoBTag-Combined#57

For your information: a final training is ongoing to try to improve the current performance with an enriched dataset, so the final model may still change. Any such change will only affect the RecoBTag-Combined PR, not this one.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Yes, this is the backport of #44641.

@cmsbuild
Contributor

cmsbuild commented Apr 8, 2024

A new Pull Request was created by @AlexDeMoor for CMSSW_14_0_X.

It involves the following packages:

  • DataFormats/BTauReco (reconstruction)
  • PhysicsTools/NanoAOD (xpog)
  • PhysicsTools/PatAlgos (xpog, reconstruction)
  • RecoBTag/Configuration (reconstruction)
  • RecoBTag/FeatureTools (reconstruction)
  • RecoBTag/ONNXRuntime (reconstruction)

@mandrenguyen, @jfernan2, @vlimant, @cmsbuild, @hqucms can you please review it and eventually sign? Thanks.
@hatakeyamak, @jdolen, @schoef, @AnnikaStein, @andrzejnovak, @rovere, @Ming-Yan, @gkasieczka, @rappoccio, @Senphy, @gpetruc, @mmarionncern, @azotz, @demuller, @jdamgov, @mbluj, @AlexDeMoor, @mariadalfonso, @nhanvtran, @emilbols, @hqucms, @gouskos, @JyothsnaKomaragiri, @ahinzmann, @seemasharmafnal, @missirol this is something you requested to watch as well.
@sextonkennedy, @rappoccio, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Apr 8, 2024

cms-bot internal usage

@hqucms
Contributor

hqucms commented Apr 9, 2024

enable nano

@hqucms
Contributor

hqucms commented Apr 9, 2024

test parameters:

@jfernan2
Contributor

jfernan2 commented Apr 9, 2024

backport of #44641

@hqucms
Contributor

hqucms commented Apr 9, 2024

please test

@Senphy

Senphy commented Apr 9, 2024

  1. The addon and relval failures are caused by a missing backport: "[Backport 14.0.X] backport customize BTVNano changes" (#44627).
  2. The unit test issue is caused by the site requirement. In the log here we can see e.g.
15:28:59  dasgoclient --limit 0 --query 'file dataset=/MuonEG/Run2023C-PromptReco-v4/MINIAOD site=T2_CH_CERN' | ibeos-lfn-sort -u > step1_dasquery.log  2>&1
15:28:59  
15:29:00 ERROR executing  cd 2500.001_NANOmc106Xul17v2; cmsDriver.py step2  -s NANO,DQM:@nanoAODDQM --process NANO --mc  --eventcontent NANOAODSIM,DQM --datatier NANOAODSIM,DQMIO -n 10000 --customise "Configuration/DataProcessing/Utils.addMonitoring" --era Run2_2017,run2_nanoAOD_106Xv2 --conditions auto:phase1_2017_realistic  --customise Validation/Performance/TimeMemoryJobReport.customiseWithTimeMemoryJobReport  --filein filelist:step1_dasquery.log --fileout file:step2.root  --suffix "-j JobReport2.xml "  > step2_NANOmc106Xul17v2.log  2>&1;  ret= 256

This dataset is no longer stored at T2_CH_CERN, so dasgoclient returns an empty list. Local tests pass after removing this site requirement and adding the backport mentioned in item 1.
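The failure mode described above (an empty file list silently passed on to the next step) could be caught earlier with a small guard. The following is a hedged sketch, not part of the CMSSW test infrastructure: all function names are hypothetical, and the code only builds the query string and inspects text that dasgoclient would have written, without calling dasgoclient itself.

```python
def build_das_query(dataset, site=None):
    """Build a DAS file query string; the site restriction is optional."""
    query = f"file dataset={dataset}"
    if site:
        query += f" site={site}"
    return query


def parse_file_list(das_output):
    """Return the non-empty lines of a DAS query result."""
    return [line.strip() for line in das_output.splitlines() if line.strip()]


def require_files(das_output, query):
    """Fail early, quoting the query, when the DAS result is empty."""
    files = parse_file_list(das_output)
    if not files:
        raise RuntimeError(f"DAS query returned no files: {query!r}")
    return files
```

Calling `build_das_query` without the `site` argument corresponds to the local workaround of dropping the site requirement, and `require_files` would turn an empty result into an explicit error instead of a later cmsDriver.py failure.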

@hqucms
Contributor

hqucms commented Apr 9, 2024

please abort

@hqucms
Contributor

hqucms commented Apr 9, 2024

please test with #44627

@cmsbuild
Contributor

cmsbuild commented Apr 9, 2024

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ab0c66/38715/summary.html
COMMIT: 9d40b1f
CMSSW: CMSSW_14_0_X_2024-04-09-1100/el8_amd64_gcc12
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44660/38715/install.sh to create a dev area with all the needed externals and cmssw changes.

This pull request cannot be automatically merged, could you please rebase it?
You can see the log for git cms-merge-topic here: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ab0c66/38715/git-merge-result

@hqucms
Contributor

hqucms commented Apr 16, 2024

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ab0c66/38863/summary.html
COMMIT: 32560af
CMSSW: CMSSW_14_0_X_2024-04-15-2300/el8_amd64_gcc12
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44660/38863/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 30 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 841 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3318556
  • DQMHistoTests: Total failures: 85
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3318451
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 15.244999999999997 KiB( 47 files compared)
  • DQMHistoSizes: changed ( 11634.0,... ): 1.229 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 13234.0,... ): 0.738 KiB Physics/NanoAODDQM
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • You potentially added 50 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 59 differences found in the comparisons
  • DQMHistoTests: Total files compared: 15
  • DQMHistoTests: Total histograms compared: 16281
  • DQMHistoTests: Total failures: 35
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 16246
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 13.768999999999998 KiB( 14 files compared)
  • DQMHistoSizes: changed ( 2500.001,... ): 1.229 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.011,... ): 0.738 KiB Physics/NanoAODDQM
  • Checked 49 log files, 29 edm output root files, 15 DQM output files

Nano size comparison Summary:

| Sample | kb/ev | ref kb/ev | diff kb/ev | ev/s/thd | ref ev/s/thd | diff rate | mem/thd | ref mem/thd |
|---|---|---|---|---|---|---|---|---|
| 2500.0 | 2.693 | 2.593 | 0.100 (+3.8%) | 4.03 | 5.19 | -22.2% | 2.177 | 2.151 |
| 2500.001 | 2.808 | 2.725 | 0.083 (+3.0%) | 3.58 | 4.65 | -22.9% | 2.618 | 2.542 |
| 2500.002 | 2.754 | 2.671 | 0.084 (+3.1%) | 3.76 | 4.83 | -22.2% | 2.594 | 2.512 |
| 2500.01 | 1.373 | 1.323 | 0.050 (+3.8%) | 7.11 | 9.54 | -25.5% | 2.279 | 2.267 |
| 2500.011 | 1.825 | 1.746 | 0.079 (+4.6%) | 3.76 | 5.20 | -27.7% | 2.449 | 2.407 |
| 2500.012 | 1.646 | 1.586 | 0.060 (+3.8%) | 5.53 | 7.49 | -26.2% | 2.370 | 2.351 |
| 2500.1 | 2.267 | 2.263 | 0.004 (+0.2%) | 5.18 | 5.19 | -0.2% | 2.068 | 1.980 |
| 2500.2 | 2.374 | 2.368 | 0.006 (+0.2%) | 5.94 | 5.95 | -0.2% | 1.975 | 1.887 |
| 2500.21 | 1.208 | 1.203 | 0.005 (+0.4%) | 4.23 | 4.27 | -0.9% | 2.284 | 2.176 |
| 2500.211 | 1.577 | 1.571 | 0.006 (+0.4%) | 3.70 | 3.76 | -1.5% | 2.252 | 2.234 |
| 2500.3 | 2.140 | 2.134 | 0.006 (+0.3%) | 12.05 | 12.01 | +0.4% | 1.903 | 1.896 |
| 2500.301 | 2.734 | 2.726 | 0.008 (+0.3%) | 9.85 | 9.72 | +1.3% | 1.843 | 1.841 |
| 2500.31 | 1.271 | 1.284 | -0.013 (-1.0%) | 19.34 | 19.67 | -1.6% | 2.280 | 2.283 |
| 2500.311 | 1.666 | 1.678 | -0.013 (-0.7%) | 12.10 | 12.37 | -2.2% | 2.323 | 2.309 |
| 2500.312 | 7.164 | 7.164 | 0.000 (+0.0%) | 1.47 | 1.42 | +3.7% | 1.701 | 1.692 |
| 2500.313 | 1.568 | 1.568 | 0.000 (+0.0%) | 6.48 | 6.92 | -6.3% | 1.045 | 1.050 |
| 2500.314 | 1.199 | 1.199 | 0.000 (+0.0%) | 12.52 | 12.60 | -0.7% | 2.233 | 2.244 |
| 2500.315 | 1.787 | 1.800 | -0.013 (-0.7%) | 12.71 | 12.75 | -0.3% | 2.338 | 2.294 |
| 2500.316 | 3.226 | 3.123 | 0.104 (+3.3%) | 2.06 | 2.21 | -6.7% | 2.181 | 2.142 |
| 2500.317 | 1.814 | 1.826 | -0.013 (-0.7%) | 12.35 | 12.70 | -2.7% | 2.262 | 2.137 |
| 2500.318 | 4.074 | 4.208 | -0.135 (-3.2%) | 5.37 | 5.16 | +4.0% | 2.248 | 2.370 |
| 2500.4 | 2.283 | 2.307 | -0.024 (-1.0%) | 11.78 | 11.58 | +1.7% | 1.894 | 1.801 |
| 2500.401 | 1.849 | 1.849 | 0.000 (+0.0%) | 9.81 | 10.07 | -2.6% | 1.678 | 1.761 |
| 2500.402 | 2.849 | 2.874 | -0.025 (-0.9%) | 9.67 | 9.92 | -2.5% | 1.831 | 1.787 |
| 2500.403 | 5.350 | 5.183 | 0.166 (+3.2%) | 1.46 | 1.59 | -8.3% | 1.804 | 1.763 |
| 2500.404 | 2.857 | 2.881 | -0.025 (-0.9%) | 9.67 | 9.93 | -2.6% | 1.770 | 1.914 |
| 2500.405 | 8.599 | 8.830 | -0.231 (-2.6%) | 3.49 | 3.29 | +6.2% | 1.786 | 1.942 |
| 2500.5 | 5.194 | 5.194 | 0.000 (+0.0%) | 15.86 | 15.99 | -0.8% | 1.569 | 1.425 |
| 2500.51 | 9.120 | 9.120 | 0.000 (+0.0%) | 9.78 | 9.80 | -0.2% | 1.657 | 1.649 |

@cmsbuild
Contributor

REMINDER @rappoccio, @antoniovilela, @sextonkennedy: This PR was tested with #44723, please check if they should be merged together

@hqucms
Contributor

hqucms commented Apr 19, 2024

@cms-sw/reconstruction-l2 Could you please review and sign? Thanks!

@mandrenguyen
Contributor

type btv, jetmet, tau

@mandrenguyen
Contributor

+reconstruction

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_14_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_14_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@hqucms
Contributor

hqucms commented Apr 19, 2024

@cms-sw/orp-l2 I opened the cms-dist PR cms-sw/cmsdist#9148, which should be merged together with this one.

@rappoccio
Contributor

+1

@cmsbuild cmsbuild merged commit 21fc109 into cms-sw:CMSSW_14_0_X Apr 22, 2024
13 checks passed
stahlleiton pushed a commit to stahlleiton/cmssw that referenced this pull request Aug 5, 2024
…4-04-07-2300

[14_0_X] Introduce Unified Particle Transformer AK4 jet tagger (Backport)
8 participants