
FNALLPC/tagging-short-exercise-das

 
 


Tagging Short Exercise

Welcome to the Tagging Short Exercise for CMSDAS 2025!

Intro

A set of slides with introductory material, definitions, useful links is available on Indico.

Option 1: Setup with Purdue AF

  • Navigate to the Purdue AF website and click “Login to Purdue Analysis Facility”.

  • On the CILogon page, choose CERN account to log in (using Fermilab or Purdue credentials is also possible).

  • You will be redirected to the “Server Options” page. The default resource selection (4 CPUs, 16 GB RAM) is enough for the HATS exercises, but you can select more resources if needed. Do not add GPUs to your session – there are not enough GPUs for all participants.

  • Click “Start” to create your Analysis Facility session. It may take a couple of minutes to load.

  • Done! Your session is ready.

  • On the left panel, click on the "Git" icon. Then click on "Clone a Repository".

  • Paste this git path in the text box https://gitlab.cern.ch/cms-analysis/cmsdas/pog/b-tagging.git.

  • Enter your CERN username and password in the prompt.

  • Go back to the file browser by clicking on the top "File Browser" icon in the left panel. You should now see a new b-tagging directory.

  • Open a terminal from the main workspace (under "Other").

  • Type

cd b-tagging
git checkout DAS2025
conda env create -f env.yml     #This will take a while
python -m ipykernel install --user --name=FTAG-Tutorial
  • Go back to the file browser and navigate to b-tagging/notebooks. Your exercise notebooks are available here. Open the first notebook to start the exercise.

Option 2: Setup with lxplus (slower, not recommended)

Use this option if you cannot set things up on Purdue AF. Perform these initial steps for the setup at lxplus (e.g. after doing `ssh your-lxplus-username@lxplus.cern.ch` from your own machine):
  1. Get Miniconda (if you have not yet done so in another exercise). We recommend that you do this in your eos area, i.e. /eos/user/<u>/<username>/miniconda3
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Special note: installation may take a while, so we recommend starting early with these instructions if possible.

  2. Check out this repository into a new directory (example directory given below for convenience)
mkdir -p /eos/user/<u>/<username>/CMSDAS2025/Tagging
cd /eos/user/<u>/<username>/CMSDAS2025/Tagging
git clone https://gitlab.cern.ch/cms-analysis/cmsdas/pog/b-tagging.git -b DAS2025  # Enter your CERN username and password when prompted
cd b-tagging
  3. Install the relevant Python packages into a conda environment (the environment file comes with the Git repo)
conda env create -f env.yml
  4. Connect to a screen session (to start the Jupyter Lab server and open it in a browser). A screen session ensures that your Jupyter server keeps running on lxplus even if your ssh session gets disconnected.
screen -S server

Within that new screen-session, make sure to have the active conda environment:

conda activate FTAG-Tutorial

and start a Jupyter Lab server on a specific port (choose your own four-digit number XXXX; 7890 is just an example, do not use 7890!)

jupyter lab --no-browser --port=7890
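If you would rather not guess a port, the choice can be automated. The following is a minimal sketch, not part of the exercise repository: the helper name `pick_free_port` is hypothetical, and it assumes you run it on the same lxplus machine where the Jupyter server will start. It stays above 1023 because lower ports are reserved for privileged services.

```python
import random
import socket


def pick_free_port(low=1024, high=9999, tries=50):
    """Return a random port in [low, high] that can currently be bound on localhost."""
    for _ in range(tries):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port is already taken, try another one
            return port  # the socket is closed on return, releasing the port
    raise RuntimeError("no free port found in the requested range")
```

Calling `pick_free_port()` returns a candidate value for XXXX. Another user could still grab the port between the check and the moment Jupyter starts, so treat the result as a suggestion rather than a guarantee.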

Note down

  • the machine you were working with (most likely, something like lxplusXYZ).
  • the port you have chosen above.
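  • the first http-link presented to you after starting the jupyter server instance (something like http://localhost:XXXX...). Select it and copy it with Cmd+C on macOS or Ctrl+Shift+C in most Linux terminals (a plain Ctrl+C in the terminal would interrupt the server instead of copying).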

Useful screen commands: Within this ssh connection, you may detach from the screen session via Ctrl + A (hold down Ctrl) followed by D (while Ctrl is pressed). You can always return to this screen session via screen -r server from the same lxplus machine.

On your own laptop/machine, open a new terminal and forward the port on which you started the server (use the exact lxplus machine you noted in the previous step). Example:

ssh -L 7890:localhost:7890 username@lxplus934.cern.ch

General case:

ssh -L XXXX:localhost:XXXX your-username@lxplusXYZ.cern.ch

Now open your browser and paste the http-link you copied. Navigate to notebooks on your left panel.
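If the page does not load, it can help to check whether the forwarded port is reachable at all before debugging anything else. The sketch below is an illustration only: `port_is_open` is a hypothetical helper, and it assumes you run it on your own laptop while the `ssh -L` session is still open.

```python
import socket


def port_is_open(port, host="localhost", timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: nothing is listening on this port.
        return False
```

For example, `port_is_open(XXXX)` with your chosen port should return `True` while the tunnel is up. A `False` result usually means the ssh tunnel was closed, or that it points at a different lxplus machine than the one running the server.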

All packages needed for the exercises are available from there. You don't need to use the terminal from now on; just keep the session open while you're working.

Option 3: Setup on EAF

Use this option if you cannot set things up on Purdue AF.

To run on FNAL EAF, you will need to be on the Fermilab fgz network (if you are onsite) or use a VPN (if you are offsite, see https://redtop.fnal.gov/guide-to-vpn-connections-to-fermilab/). Then log in at https://analytics-hub.fnal.gov using your FNAL Services credentials. Once you successfully connect, select the CMS - CPU Interactives - AL9 Dask (Coffea 0.7.x) [stable] server option (top left), as shown in the image below.

Click Start at the bottom of the page.

To open a Terminal click on the corresponding option in the Launcher Tab. If the Launcher tab is not open, you can open a new one from the File menu in the top left. This will open a new tab with a bash terminal.

Upload Grid Certificates - first time only!

We will copy your grid certificates from the LPC cluster. To do this, go to the terminal you just opened.

Execute the following commands (following the appropriate prompts) to copy your certificate from the LPC to Jupyter (note: replace username with your FNAL username!)

The following command will prompt you for your FNAL password

kinit username@FNAL.GOV
rsync -rLv username@cmslpc-el9.fnal.gov:.globus/ ~/.globus/
chmod 755 ~/.globus
chmod 600 ~/.globus/*
kdestroy

Initialize Your Proxy at every Login!

If you have a password on your grid certificate, you'll need to remember to execute the following in a terminal each time you log in to Jupyter. Similar to the LPC cluster, you will get a new host at each logon, and the new host won't have your old credentials.

Each time you log in, open a terminal and execute:

voms-proxy-init -voms cms -valid 192:00

Checkout the code

Open up a terminal and run the following command from your home area:

git clone https://gitlab.cern.ch/cms-analysis/cmsdas/pog/b-tagging.git -b DAS2025

On the left you should see the b-tagging directory you just cloned. Click on it and then on the notebooks directory. It contains three exercises. Start with number 1. Select the Python3 (Safe mode) kernel for all of them.

Tutorials

All individual tutorials / exercises are available from the notebooks directory. There are three .ipynb files, which are plug-and-play: just (double-)click to open one and follow the instructions inside.

Jupyter tip: To run a command, click on it inside the Jupyter notebook and click the play button on the top panel. You can edit the command and rerun it by clicking the run button again. Click the play button again to run the next command, and so on.

Accessing Tagger Outputs

Access AK4 and AK8 jet tagger information from standard NanoAOD files. Explore what these distributions look like for various flavours of jets and heavy objects.
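To give a feeling for what "distributions per flavour" means before opening the notebook, here is a toy sketch with NumPy. It does not read a NanoAOD file: the arrays `hadron_flavour` and `score` are synthetic stand-ins for a per-jet flavour label (using the 5 = b, 4 = c, 0 = light convention) and a tagger discriminator, and the beta-distributed shapes are an assumption made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_jets = 10_000

# Synthetic flavour labels: 5 = b, 4 = c, 0 = light (udsg).
hadron_flavour = rng.choice([5, 4, 0], size=n_jets, p=[0.2, 0.2, 0.6])

# Toy discriminator: b jets pile up near 1, light jets near 0, c jets in between.
score = np.where(
    hadron_flavour == 5,
    rng.beta(5, 2, n_jets),
    np.where(hadron_flavour == 4, rng.beta(2, 2, n_jets), rng.beta(2, 5, n_jets)),
)

bins = np.linspace(0.0, 1.0, 21)
flavour_names = {5: "b", 4: "c", 0: "light"}

# One histogram per flavour, normalised so that the bin contents sum to 1.
histograms = {}
for code, name in flavour_names.items():
    counts, _ = np.histogram(score[hadron_flavour == code], bins=bins)
    histograms[name] = counts / counts.sum()

for name, hist in histograms.items():
    print(name, "most populated bin starts at", bins[np.argmax(hist)])
```

In the actual exercise the two arrays would come from the jet collection of a NanoAOD file rather than from a random generator; the masking and histogramming pattern stays the same.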

Performance

Learn how network performance is evaluated and which performance metrics play a key role for flavour tagging. Perform more studies to evaluate performance as a function of certain parameters and compare across samples.
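As a rough illustration of the kind of metric involved, the sketch below builds a ROC curve (b-jet efficiency versus light-jet mistag rate) from toy scores and looks up a working point at a 1% mistag rate. Everything here is assumed for illustration: the scores are synthetic, `roc_curve` is a hypothetical helper, and the notebook may use different tools and definitions.

```python
import numpy as np


def roc_curve(signal_scores, background_scores, thresholds):
    """Return (efficiency, mistag rate) for the cut 'score > threshold'."""
    signal_scores = np.asarray(signal_scores)
    background_scores = np.asarray(background_scores)
    eff = np.array([(signal_scores > t).mean() for t in thresholds])
    mistag = np.array([(background_scores > t).mean() for t in thresholds])
    return eff, mistag


rng = np.random.default_rng(seed=1)
b_scores = rng.beta(5, 2, 5000)      # toy b jets, peaking near 1
light_scores = rng.beta(2, 5, 5000)  # toy light jets, peaking near 0

thresholds = np.linspace(0.0, 1.0, 101)
eff, mistag = roc_curve(b_scores, light_scores, thresholds)

# Thresholds ascend, so both rates descend; reverse them to integrate left to right.
x, y = mistag[::-1], eff[::-1]
auc = np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0)  # trapezoidal rule

# First (loosest) threshold on the grid with a mistag rate of at most 1%.
wp_index = np.argmax(mistag <= 0.01)
print(f"AUC = {auc:.3f}")
print(f"threshold = {thresholds[wp_index]:.2f}, b efficiency = {eff[wp_index]:.3f}")
```

Fixing the mistag rate and quoting the efficiency at that point is a common way to compare taggers, since two curves with similar area can still differ noticeably in the low-mistag region.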

Scale factors

Scale Factors are essential before we can use taggers on real collision data. Explore one of the methods that are used to compare simulation and data and extract correction factors.
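At its simplest, a scale factor is the ratio of the tagging efficiency measured in data to the one found in simulation. The sketch below only illustrates that definition with made-up counts and a simple binomial uncertainty; the helper names are hypothetical, and the method used in the notebook (and in real measurements) is more involved.

```python
import math


def efficiency(n_pass, n_total):
    """Efficiency and its binomial uncertainty from pass/total counts."""
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


def scale_factor(eff_data, err_data, eff_mc, err_mc):
    """SF = eff_data / eff_mc, with uncorrelated relative errors added in quadrature."""
    sf = eff_data / eff_mc
    rel_err = math.sqrt((err_data / eff_data) ** 2 + (err_mc / eff_mc) ** 2)
    return sf, sf * rel_err


# Made-up counts, for illustration only.
eff_data, err_data = efficiency(n_pass=780, n_total=1000)
eff_mc, err_mc = efficiency(n_pass=8000, n_total=10000)

sf, sf_err = scale_factor(eff_data, err_data, eff_mc, err_mc)
print(f"SF = {sf:.3f} +/- {sf_err:.3f}")
```

A value below 1 means the tagger is less efficient in data than in simulation, so simulated events need to be weighted down accordingly.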

Bonus

Explore how performance depends on kinematic quantities related to the jet. The concept to keep in mind is that differential distributions matter, not only inclusive metrics. Here this is explored for simple features like pseudorapidity and transverse momentum. Most likely you will also need to adapt to differentially measured scale factors (in bins of discriminators, though) when using such algorithms in an analysis.
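The idea of a differential efficiency can be sketched in a few lines of NumPy. The example below is a toy: the transverse momentum values, the pT dependence of the tagging probability, and the bin edges are all assumptions chosen for illustration, not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n_jets = 20_000

# Toy jets with pT between 20 and 500 GeV.
pt = rng.uniform(20.0, 500.0, n_jets)

# Assumed tagging probability: rises from 0.5 to 0.8 up to 200 GeV, flat afterwards.
true_eff = 0.5 + 0.3 * np.clip((pt - 20.0) / 180.0, 0.0, 1.0)
is_tagged = rng.random(n_jets) < true_eff

# Efficiency per bin = tagged jets / all jets in that pT bin.
pt_edges = np.array([20.0, 50.0, 100.0, 200.0, 500.0])
n_all, _ = np.histogram(pt, bins=pt_edges)
n_tag, _ = np.histogram(pt[is_tagged], bins=pt_edges)
eff_per_bin = n_tag / n_all

print(f"inclusive efficiency: {is_tagged.mean():.3f}")
for lo, hi, eff in zip(pt_edges[:-1], pt_edges[1:], eff_per_bin):
    print(f"{lo:5.0f} to {hi:5.0f} GeV: efficiency = {eff:.3f}")
```

The inclusive number hides the variation between bins, which is why a single efficiency or scale factor is usually not enough when the jets in an analysis populate a different kinematic region than the sample used for the measurement.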

Contact

This session:

Irene Dutta, 2025
📧 irene.dutta@cern.ch, 💻 @irenedutta23

Maintained by:

Spandan Mondal, 2024
📧 spandan.mondal@cern.ch, 💻 @mondalspandan

Sebastian Wuchterl, CMS PO&DAS 2023
📧 sebastian.wuchterl@cern.ch, 💻 @SWuchterl

Svenja Diekmann, CMS PO&DAS 2023
📧 svenja.diekmann@cern.ch, 💻 @SvenjaDiekmann

Original credits to: Annika Stein, 2023
📧 annika-stein@cern.ch, 💻 @AnnikaStein
