BDT models for low pT electron seeds #10

Closed

@mverzett mverzett commented Dec 7, 2018

This PR introduces the BDT models used in the GSF seeding in the upcoming PR for low-pT electron reconstruction.
Two models are provided:

  1. A model that is aware of the different expected pT and eta spectra of electrons from B decays and of their different displacement.
  2. A model where the background sample is reweighted to remove any discriminating power of pT and eta. Displacement is not provided.

The performance of these two models is compared to the current seeding approach (blue star) and to a hypothetical seeding approach that uses the same selection as the current one but with no minimum pT threshold. The triangle markers show the performance of the OR of the two trained models, with cuts chosen to give a fixed mistag rate (1, 3, and 10 times the mistag rate of the current seeding, respectively).
[Image: log_bdt_comparison]
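
As a rough illustration of how a working point for the OR of two models could be tuned to a target mistag rate, here is a minimal Python sketch. It is not the code used for the plot: the function names, the choice of giving both models the same per-model mistag rate, and the bisection search are all assumptions made for the example. Inputs are plain lists of BDT scores for background tracks, one list per model, in the same track order.

```python
def cut_at_mistag(bkg_scores, mistag):
    """Score cut such that at most a fraction `mistag` of background has score > cut."""
    ordered = sorted(bkg_scores)
    n_pass = int(mistag * len(ordered))  # background tracks allowed above the cut
    if n_pass >= len(ordered):
        return float("-inf")
    return ordered[len(ordered) - n_pass - 1]


def or_mistag(bkg_a, bkg_b, per_model_mistag):
    """Mistag rate of (model A passes) OR (model B passes), plus the two cuts."""
    cut_a = cut_at_mistag(bkg_a, per_model_mistag)
    cut_b = cut_at_mistag(bkg_b, per_model_mistag)
    n_pass = sum(1 for a, b in zip(bkg_a, bkg_b) if a > cut_a or b > cut_b)
    return n_pass / len(bkg_a), cut_a, cut_b


def tune_or_cuts(bkg_a, bkg_b, target_mistag, n_iter=40):
    """Bisect on a common per-model mistag rate (an assumption of this sketch)
    until the OR of the two models is as close as possible to, but not above,
    the target mistag rate."""
    lo, hi = 0.0, target_mistag  # the OR rate can never be below the per-model rate
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        rate, _, _ = or_mistag(bkg_a, bkg_b, mid)
        if rate > target_mistag:
            hi = mid
        else:
            lo = mid
    return or_mistag(bkg_a, bkg_b, lo)
```

Calling `tune_or_cuts` with the target set to 1, 3, and 10 times the mistag rate of the current seeding would give one pair of cuts per triangle marker; the signal efficiency at each point then follows from applying the same cuts to the signal scores.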

The performance of the same models has also been tested after applying a 0.5 GeV pT cut on the track a posteriori, showing the same performance in the phase space of interest. It is therefore safe to apply such a cut before computing the BDT outputs to save time.
[Image: log_bdt_comparison]
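
The idea of cutting on pT before evaluating the BDT can be sketched as follows. This is only an illustration in Python, not the CMSSW implementation: the track representation (a dict with a `"pt"` key in GeV), the `model` callable, and the function name are all hypothetical.

```python
def bdt_scores_with_pt_precut(tracks, model, min_pt=0.5):
    """Evaluate `model` only on tracks with pt >= min_pt (GeV).

    Tracks below threshold get None, so the (comparatively expensive)
    BDT evaluation is skipped for them entirely.
    """
    scores = []
    for track in tracks:
        if track["pt"] < min_pt:
            scores.append(None)  # rejected by the pre-cut, BDT never called
        else:
            scores.append(model(track))
    return scores
```

Because the a posteriori cut was found not to change the performance in the region of interest, skipping the evaluation for these tracks should only affect the run time, not the seeds that are selected there.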

@bainbrid is also interested in following this thread.

cmsbuild commented Dec 7, 2018

A new Pull Request was created by @mverzett (Mauro Verzetti) for branch master.

@cmsbuild, @smuzaffar, @gudrutis, @mrodozov can you please review it and eventually sign? Thanks.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.

external issue cms-sw/cmsdist#4563

slava77 commented Dec 7, 2018

@mverzett
do we really need this fancy naming? "improvedfullseeding_formatted" and perhaps even the date are not really functional and appear to reflect only the history of the derivation.
I can imagine this way the next version will be "evenmoreimproved" 😄

mverzett commented Dec 7, 2018

Hi Slava, the names can be changed, please let me know. They reflect some studies we made with different feature sets to understand how much we could gain by adding more and more information. The date identifies the internal ntuples used to run the training and is there for internal bookkeeping.

slava77 commented Dec 7, 2018

I think that an MC sample/campaign tag is more useful in the name than a bare date that has a meaning only to you.

Do you expect to use both files in reco or are they integrated now "for upcoming studies", not for production?

slava77 commented Dec 7, 2018

@mverzett
what is the status of the pull request for CMSSW to use these training files?


bainbrid commented Dec 7, 2018 via email

slava77 commented Dec 7, 2018

> Imminent. Today or Monday ...

OK, it sounds like we had better wait for that to show up before merging this PR.

slava77 commented Dec 10, 2018

@mverzett
please let me know if you can rename the files to be a bit more functional

mverzett commented

@slava77 I just realised I made the PR from the wrong repo, I will close and re-open with the names changed
