Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic Reduction Network-based electron energy regression using the SONIC service #35839

Merged
merged 16 commits into from
Feb 4, 2022

Conversation

ssrothman
Copy link
Contributor

PR description:

This PR commits the first tranche of code to apply the newly developed Dynamic Reduction Network (DRN)-based energy regression for electrons implemented using pytorch. This commit is intended for the ECAL calibration use case where the energy of electron Mustache superclusters are corrected. This does not require to be added to the full reconstruction workflow and can be applied on the fly when the necessary nTuples needed for ECAL calibration are produced using the ECALELF framework.

The DRN model is implemented in CMSSW (here and here) using the SONIC service, modifying the existing implementation of the BDT based regression and the corresponding producer.

The SONIC config file and torchscript regression weights needed are intended to be hosted on git cms-data. A PR will be created there committing these files referencing the current PR here.

A summary of this DRN model as well as the rationale of the CMSSW implementation may be found here.

This is a draft PR pending some careful checks of resource usage. In particular, prior to submitting the full PR we will

  1. carefully check the CPU utilization in our proposed workflow
  2. build a minimal triton server image to reduce the memory overhead

PR validation:

Validation has been performed using a custom cfg intended to be run using cmsRun that has been also committed in the test folder (here). Pending the cms-data PR, the config and weights files must be cloned from https://github.com/ssrothman/DRNData in order to run the validation config. A histogram of the predicted energy over the generated energy for both the DRN and the 2018 UL BDT (as well as the raw energy) is embedded here:

bigtest

It has also been confirmed that this implementation yields identical results to the standalone python implementation used for training and validation of the DRN, such as in the slides linked above.

if this PR is a backport please specify the original PR and why you need to backport that PR:

As of now a backport is not envisaged, but might be re-visited later.

@Sam-Harper @lgray @kpedro88 @rchatter @lfinco @swagata87 @jainshilpi @SohamBhattacharya @simonepigazzini @violatingcp

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35839/26205

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@slava77
Copy link
Contributor

slava77 commented Oct 25, 2021

* This PR adds an extra 2284KB to repository

please cleanup the commit history to remove the intermediate files (especially the large ones)

also, I'd rather have updates in .gitignore in a separate PR (if they are really needed). So, please remove it from this PR.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35839/26222

Code check has found code style and quality issues which could be resolved by applying following patch(s)

#ifndef Progression_EGM_DRN_SCEnergyCorrectorDRN_h
#define Progression_EGM_DRN_SCEnergyCorrectorDRN_h

#include "HeterogeneousCore/SonicTriton/interface/TritonEDProducer.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary include

void setEventSetup(const edm::EventSetup& es);
void setEvent(const edm::Event& e);

//std::pair<double, double> getCorrections(const reco::SuperCluster& sc) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete commented-out code

void makeInput(const edm::Event& iEvent, TritonInputMap& iInput, const reco::SuperClusterCollection& inputSCs) const;
TritonOutput<float> getOutput(const TritonOutputMap& iOutput);

//void modifyObject(reco::SuperCluster& sc) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete commented-out code


#include <vdt/vdtMath.h>

#include <iostream>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary include


void SCEnergyCorrectorDRN::makeInput(const edm::Event& iEvent, TritonInputMap& iInput, const reco::SuperClusterCollection& inputSCs ) const {

std::vector<unsigned> nHits = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary default initialization

timeout = cms.untracked.uint32(10),
modelName = cms.string('MustacheEE'),
modelVersion = cms.string(""),
modelConfigPath = cms.FileInPath("DRNData/models/MustacheEE/config.pbtxt"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above re: path

static float sigmoid(float x){
return 1.0f/(1.0f + exp(-x));
}
static float ln2 = log(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is only used inside the static function logcorrection below, just define it inside that function

corrSCs->push_back(inputSC);
corrSCs->back().setEnergy(corrEn);
corrSCs->back().setCorrectedEnergy(corrEn);
//corrSCs->back().setCorrectedEnergyUncertainty(-1.0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete commented-out code

@@ -0,0 +1,560 @@
// -*- C++ -*-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Virtually every line of code in this file violates CMS coding rules: http://cms-sw.github.io/cms_coding_rules.html

If this really needs to be included in the official repository, please clean it up.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35839/26224

  • This PR adds an extra 28KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35839/26227

ERROR: Build errors found during clang-tidy run.

RecoEcal/EgammaClusterAlgos/interface/SCEnergyCorrectorDRN.h:61:46: error: unknown type name 'TritonInputMap' [clang-diagnostic-error]
    void makeInput(const edm::Event& iEvent, TritonInputMap& iInput, const reco::SuperClusterCollection& inputSCs) const;
                                             ^
RecoEcal/EgammaClusterAlgos/interface/SCEnergyCorrectorDRN.h:62:5: error: no template named 'TritonOutput' [clang-diagnostic-error]
    TritonOutput<float> getOutput(const TritonOutputMap& iOutput);
    ^
RecoEcal/EgammaClusterAlgos/interface/SCEnergyCorrectorDRN.h:62:41: error: unknown type name 'TritonOutputMap' [clang-diagnostic-error]
    TritonOutput<float> getOutput(const TritonOutputMap& iOutput);
                                        ^
RecoEcal/EgammaClusterAlgos/src/SCEnergyCorrectorDRN.cc:57:28: error: out-of-line definition of 'makeInput' does not match any declaration in 'SCEnergyCorrectorDRN' [clang-diagnostic-error]
void SCEnergyCorrectorDRN::makeInput(const edm::Event& iEvent, TritonInputMap& iInput, const reco::SuperClusterCollection& inputSCs ) const {
                           ^
RecoEcal/EgammaClusterAlgos/src/SCEnergyCorrectorDRN.cc:118:43: error: out-of-line definition of 'getOutput' does not match any declaration in 'SCEnergyCorrectorDRN' [clang-diagnostic-error]
TritonOutput<float> SCEnergyCorrectorDRN::getOutput(const TritonOutputMap& iOutput) {
                                          ^
Suppressed 2275 warnings (2273 in non-user code, 1 NOLINT, 1 with check filters).
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@jpata
Copy link
Contributor

jpata commented Jan 26, 2022

kind ping on this

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35839/27992

  • This PR adds an extra 16KB to repository

  • Found files with invalid states:

    • RecoEcal/EgammaClusterProducers/test/DRNTestNTuplizer.cc:

@cmsbuild
Copy link
Contributor

Pull request #35839 was updated. @jpata, @cmsbuild, @clacaputo, @slava77 can you please check and sign again.

@jpata
Copy link
Contributor

jpata commented Jan 31, 2022

@cmsbuild please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d7e2d0/22099/summary.html
COMMIT: 99ce2ba
CMSSW: CMSSW_12_3_X_2022-01-30-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/35839/22099/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d7e2d0/22099/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d7e2d0/22099/git-merge-result

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 43
  • DQMHistoTests: Total histograms compared: 3449612
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3449590
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 42 files compared)
  • Checked 181 log files, 42 edm output root files, 43 DQM output files
  • TriggerResults: no differences found

@jpata
Copy link
Contributor

jpata commented Feb 1, 2022

+reconstruction

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

perrotta commented Feb 4, 2022

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.