Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RecHit-based rho producer for HLT #36157

Merged
merged 10 commits into from
Nov 26, 2021
Merged

Conversation

swagata87
Copy link
Contributor

PR description:

In an attempt to phase-out calotowers, this PR adds a new rho producer which is recHit-based. It takes the raw recHits of ECAL and HCAL as input and computes rho using fastjet. The new producer code is kept in RecoJets/JetProducers/plugins/ because other rho producer codes are also there. But this one is primarily aimed for HLT. According to current plan, egamma and muon HLT can use it in Run3, in order to phase out calotowers. Until Run2, calotower-based rho was used in egamma and muon HLT.

PR validation:

Ran egamma HLT after adding a module for the new rho producer. Relevant part of HLT config is this:

process.hltRhoNew = cms.EDProducer( "FixedGridRhoProducerFastjetFromRecHit",
    hbheRecHitsTag = cms.InputTag( "hltHbhereco" ),
    ebRecHitsTag = cms.InputTag( 'hltEcalRecHit','EcalRecHitsEB' ),
    eeRecHitsTag = cms.InputTag( 'hltEcalRecHit','EcalRecHitsEE' ),
    eThresHB = cms.vdouble(0.1, 0.2, 0.3, 0.3),
    eThresHE = cms.vdouble(0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2),
    maxRapidity = cms.double( 2.5 ),
    gridSpacing = cms.double( 0.55 ),
    skipHCAL = cms.bool(False),
    skipECAL = cms.bool(False)
)

Then compared calotower-based rho and recHit-based rho, and found that they are fairly close.

Screen Shot 2021-08-31 at 14 08 40 (1)

Note that an exact match is neither expected, nor required. One reason of the small difference is different noise-cleaning cuts in calotower creation code.

Muon HLT team validated this producer independently, and found that their HLT performance remain very similar after moving to recHit-based rho. This is discussed in TSG meeting today: https://indico.cern.ch/event/1097086/contributions/4618454/attachments/2347696/4003696/20211117_TSG_MuonHLTIso_DropCaloTower_KPLee_v2.pdf

This PR is not a backport.
A backport to 12_1_X might be needed for HLT studies.

Thanks to Muon POG, in particular @KyeongPil-Lee, for validating the producer in the context of muon HLT.

With thanks to @Sam-Harper, for useful discussions.

@swagata87
Copy link
Contributor Author

assign hlt

  • the new producer code is primarily intended for HLT. All checks are made from HLT setup only.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36157/26707

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

New categories assigned: hlt

@missirol,@Martin-Grunewald you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @swagata87 (Swagata Mukherjee) for master.

It involves the following packages:

  • RecoJets/JetProducers (reconstruction)

@jpata, @missirol, @cmsbuild, @Martin-Grunewald, @slava77 can you please review it and eventually sign? Thanks.
@rappoccio, @jdolen, @yslai, @jdamgov, @ahinzmann, @nhanvtran, @gkasieczka, @clelange, @schoef, @mariadalfonso, @seemasharmafnal this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

Copy link
Contributor

@missirol missirol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small comments/suggestions. If any of these comments conflict with choices agreed upon in previous PRs, you can ignore them.

#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "fastjet/tools/GridMedianBackgroundEstimator.hh"
#include "FWCore/Framework/interface/Event.h"
#include "DataFormats/Common/interface/View.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#include "DataFormats/Common/interface/View.h"

#include "DataFormats/Common/interface/View.h"
#include "FWCore/Framework/interface/MakerMacros.h"
#include "DataFormats/HcalRecHit/interface/HcalRecHitCollections.h"
#include "TLorentzVector.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#include "TLorentzVector.h"

General comment: I was wondering if using TLorentzVector is the best choice here. I would have naively picked math::XYZTLorentzVector (used in reco::Candidate), or even just used 4 floats (px/py/pz/E), as that seems to be all that's needed below in this plugin.

#include "Geometry/CaloGeometry/interface/CaloGeometry.h"
#include "Geometry/CaloGeometry/interface/CaloSubdetectorGeometry.h"
#include "Geometry/Records/interface/CaloGeometryRecord.h"
#include "DataFormats/Math/interface/Vector3D.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#include "DataFormats/Math/interface/Vector3D.h"

descriptions.add("hltFixedGridRhoProducerFastjetFromRecHit", desc);
}

FixedGridRhoProducerFastjetFromRecHit::~FixedGridRhoProducerFastjetFromRecHit() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FixedGridRhoProducerFastjetFromRecHit::~FixedGridRhoProducerFastjetFromRecHit() {}
FixedGridRhoProducerFastjetFromRecHit::~FixedGridRhoProducerFastjetFromRecHit() = default;

desc.add<std::vector<double> >("eThresHE", {0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2});
desc.add<double>("maxRapidity", 2.5);
desc.add<double>("gridSpacing", 0.55);
descriptions.add("hltFixedGridRhoProducerFastjetFromRecHit", desc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
descriptions.add("hltFixedGridRhoProducerFastjetFromRecHit", desc);
descriptions.addWithDefaultLabel(desc);

if (!skipHCAL_) {
for (const auto &hit : iEvent.get(hbheRecHitsTag_)) {
if (passedHcalNoiseCut(hit)) {
TLorentzVector hitp4(0, 0, 0, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above on use of TLorentzVector.

for (const auto &hit : iEvent.get(hbheRecHitsTag_)) {
if (passedHcalNoiseCut(hit)) {
TLorentzVector hitp4(0, 0, 0, 0);
getHitP4(hit.id(), hit.energy(), hitp4, iSetup.getData(caloGeometryToken_));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
getHitP4(hit.id(), hit.energy(), hitp4, iSetup.getData(caloGeometryToken_));
getHitP4(hit.id(), hit.energy(), hitp4, caloGeometry);

if (!skipECAL_) {
for (const auto &hit : iEvent.get(ebRecHitsTag_)) {
if (passedEcalNoiseCut(hit, &thresholds)) {
TLorentzVector hitp4(0, 0, 0, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above on use of TLorentzVector.


for (const auto &hit : iEvent.get(eeRecHitsTag_)) {
if (passedEcalNoiseCut(hit, &thresholds)) {
TLorentzVector hitp4(0, 0, 0, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above on use of TLorentzVector.

Comment on lines 142 to 145
std::shared_ptr<const CaloCellGeometry> cellGeom =
subDetGeom != nullptr ? subDetGeom->getGeometry(detId) : std::shared_ptr<const CaloCellGeometry>();
if (cellGeom != nullptr) {
const auto &gpPos = cellGeom->repPos();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::shared_ptr<const CaloCellGeometry> cellGeom =
subDetGeom != nullptr ? subDetGeom->getGeometry(detId) : std::shared_ptr<const CaloCellGeometry>();
if (cellGeom != nullptr) {
const auto &gpPos = cellGeom->repPos();
if (subDetGeom != nullptr) {
const auto &gpPos = subDetGeom->getGeometry(detId)->repPos();

This is a guess, just to highlight that maybe things could be simplified (..but maybe I'm missing something). Mainly, I didn't get why shared_ptr was needed.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36157/26722

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

Pull request #36157 was updated. @jpata, @missirol, @cmsbuild, @Martin-Grunewald, @slava77 can you please check and sign again.

double thispx = thispt * cos(gpPos.phi());
double thispy = thispt * sin(gpPos.phi());
double thispz = thispt * sinh(gpPos.eta());
hitp4.SetPxPyPzE(thispx, thispy, thispz, hitE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it not be corrected for beam spot (or identified signal vertex) position?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this rho producer, we are just aiming for a proxy of pile-up. As long as we can construct a variable whose value increases with pile-up, we are fine. So I think it's okay to not correct the recHit positions for beamspot.

I have addressed the other comments.

if (passedHcalNoiseCut(hit)) {
math::XYZTLorentzVector hitp4(0, 0, 0, 0);
getHitP4(hit.id(), hit.energy(), hitp4, caloGeometry);
inputs.push_back(fastjet::PseudoJet(hitp4.Px(), hitp4.Py(), hitp4.Pz(), hitp4.E()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested change
inputs.emplace_back(hitp4.Px(), hitp4.Py(), hitp4.Pz(), hitp4.E());
similar below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no need of a LorentzVector or similar.
one single double hitp4[4]; suffice;

math::XYZTLorentzVector &hitp4,
const CaloGeometry &caloGeometry) {
const CaloSubdetectorGeometry *subDetGeom = caloGeometry.getSubdetectorGeometry(detId);
if (subDetGeom != nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really something that can happen?
should not the job just stop? (and tested much earlier)?

double thispz = thispt * sinh(gpPos.eta());
hitp4.SetPxPyPzE(thispx, thispy, thispz, hitE);
} else {
if (detId.rawId() != 0)
Copy link
Contributor

@VinInn VinInn Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again: is this really something that can happen?
in particular once the hit paid the noise cut!

Copy link
Contributor

@jpata jpata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few additional comments inline.


void FixedGridRhoProducerFastjetFromRecHit::getHitP4(const DetId &detId,
float hitE,
math::XYZTLorentzVector &hitp4,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested by Vincenzo, this function could just return a fixed-size array, rather than modify an input TLorentzVector.

const CaloSubdetectorGeometry *subDetGeom = caloGeometry.getSubdetectorGeometry(detId);
if (subDetGeom != nullptr) {
const auto &gpPos = subDetGeom->getGeometry(detId)->repPos();
double thispt = hitE / cosh(gpPos.eta());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these could all be const

private:
void produce(edm::Event &, const edm::EventSetup &) override;
void getHitP4(const DetId &detId, float hitE, math::XYZTLorentzVector &hitp4, const CaloGeometry &caloGeometry);
bool passedHcalNoiseCut(const HBHERecHit &hit);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool passedHcalNoiseCut(const HBHERecHit &hit);
bool passedHcalNoiseCut(const HBHERecHit &hit) const;

void produce(edm::Event &, const edm::EventSetup &) override;
void getHitP4(const DetId &detId, float hitE, math::XYZTLorentzVector &hitp4, const CaloGeometry &caloGeometry);
bool passedHcalNoiseCut(const HBHERecHit &hit);
bool passedEcalNoiseCut(const EcalRecHit &hit, const EcalPFRecHitThresholds &thresholds);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool passedEcalNoiseCut(const EcalRecHit &hit, const EcalPFRecHitThresholds &thresholds);
bool passedEcalNoiseCut(const EcalRecHit &hit, const EcalPFRecHitThresholds &thresholds) const;


private:
void produce(edm::Event &, const edm::EventSetup &) override;
void getHitP4(const DetId &detId, float hitE, math::XYZTLorentzVector &hitp4, const CaloGeometry &caloGeometry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void getHitP4(const DetId &detId, float hitE, math::XYZTLorentzVector &hitp4, const CaloGeometry &caloGeometry);
std::array<double, 4> getHitP4(const DetId &detId, float hitE, const CaloGeometry &caloGeometry) const;

(or float, in case everything is already float, I didn't check)

@cmsbuild
Copy link
Contributor

Pull request #36157 was updated. @jpata, @missirol, @cmsbuild, @Martin-Grunewald, @slava77 can you please check and sign again.

@missirol
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5d7a34/20721/summary.html
COMMIT: 3d014b7
CMSSW: CMSSW_12_2_X_2021-11-23-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/36157/20721/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 42
  • DQMHistoTests: Total histograms compared: 3247025
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3247003
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 41 files compared)
  • Checked 177 log files, 37 edm output root files, 42 DQM output files
  • TriggerResults: no differences found

@missirol
Copy link
Contributor

+hlt

  • Reviewed on demand; aside from the intended use case for this plugin, there is nothing in its implementation which is explicitly specific to HLT (no dependence on HLT packages).
  • There is a certain level of assumption in some methods concerning the validity of pointers, and length of arrays/vectors [1] [2]; I can't really judge this, so it is left to the review of RECO experts.

@jpata
Copy link
Contributor

jpata commented Nov 24, 2021

I can't really judge this, so it is left to the review of RECO experts.

Since this code is not running in a test, we can only perform a visual inspection, which may not be very helpful. Was this PR actually executed in an HLT setup to test that it runs?

@missirol
Copy link
Contributor

Since this code is not running in a test, we can only perform a visual inspection, which may not be very helpful.

I agree; I was hoping that people more familiar than me with some of the methods in this PR could tell a priori that certain calls (valid of pointers, etc) are always valid for the use case at hand; I understand it might not be obvious to guarantee that. I'm just trying to exclude issues like the one in #35488.

Was this PR actually executed in a HLT setup to test that it runs?

Swagata can answer this, of course; my understanding is yes (and this was used for some results presented in TSG). I don't know which tests were done with the latest version of the plugin (post-PR review).

@jpata
Copy link
Contributor

jpata commented Nov 24, 2021

Right. I don't know either. I think it would be good to run a new test on the code as it is currently in the PR.

@swagata87
Copy link
Contributor Author

Was this PR actually executed in an HLT setup to test that it runs?

yes, before making this PR the producer was run independently in Egamma HLT and muon HLT.

I think it would be good to run a new test on the code as it is currently in the PR.

indeed, as several parts were changed it makes sense to re-run and make sure things are still okay.

Egamma HLT was run using the latest version of the code. Calotower-based rho and recHit-based rho was compared. Here are the plots. Entries of histogram indicates number of events on which HLT was run.

  1. QCD_Pt-30To50_EMEnriched
    QCD-1
  2. TTbar
    TTbar-1
  3. WJetsToLNu
    WJets-1

@swagata87
Copy link
Contributor Author

Muon POG has now made rate & timing studies using the recHit based rho producer.
For the timing study they used updated version of the code from this PR, commit 503d8b4. This is not the latest commit, but the only change made after this was adding some packages in BuildFile. So I guess this might resolve the concern related to validity of pointers.
here is slides from @KyeongPil-Lee: https://swmukher.web.cern.ch/swmukher/20211124_MuonHLTIsoStudy_rechitBasedRho_KPLee_v1.pdf

@missirol
Copy link
Contributor

So I guess this might resolve the concern related to validity of pointers.

Thanks, Swagata. I don't have any further comments.

@jpata
Copy link
Contributor

jpata commented Nov 26, 2021

+reconstruction

  • adds a new FixedGridRhoProducerFastjetFromRecHit that is not enabled by default, intended for HLT
  • no reco changes (the new code is not running in tests)
  • according to private tests reported above, the code works as intended

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants