Batch Job Submission
Goldilocks supports batch job submission through a collection of Python scripts/classes and text files.
Specifically, the framework is tailored towards batch jobs submitted at the LPC.
Per this recommendation,
the CMSSW repository is uploaded to EOS as a tarball.
Then, the batch job retrieves the tarball, unpacks it, and runs the framework.
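As a rough sketch of the retrieve, unpack, and run sequence described above, the function below builds the shell commands a worker node might execute. This is not the framework's actual template (that lives in `python/batch/batchScripts.py`). The tarball name `CMSSW.tgz`, the `run_command` string, and the use of `xrdcp` with the `root://cmseos.fnal.gov/` redirector are all assumptions made for illustration.

```python
def worker_commands(eos_tarball_path,
                    tarball="CMSSW.tgz",                                # hypothetical tarball name
                    run_command="run_training config/training.txt",     # hypothetical invocation
                    redirector="root://cmseos.fnal.gov/"):              # assumed EOS redirector
    """Build the shell commands for one batch job, in execution order."""
    remote = redirector + eos_tarball_path.rstrip("/") + "/" + tarball
    return [
        "xrdcp {0} .".format(remote),    # retrieve the tarball from EOS
        "tar -xf {0}".format(tarball),   # unpack the CMSSW area
        run_command,                     # run the framework
    ]


for command in worker_commands("/store/user/someuser/goldilocks/"):
    print(command)
```

Keeping the commands as a plain list makes it easy to inspect them before they are written into a shell script.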
File | About |
---|---|
`python/submitBatchJobs.py` | Steering script for submitting batch jobs |
`python/batch/batchSubmission.py` | Class that contains information for generating condor & shell scripts, submitting batch jobs |
`python/batch/batchScripts.py` | Templates for condor and shell scripts |
`config/batch.txt` | Configuration file for batch jobs (this is just an example, user can write their own) |
`config/tarCMSSW.sh` | Shell script used to tar the CMSSW directory for submitting batch jobs |

- Commit all code & push any necessary changes.
- Open the tar script (`config/tarCMSSW.sh`) and ensure the arguments are correct, e.g., `eos_path`.
- Copy the tar script to `$CMSSW_BASE/..` (one directory above the CMSSW release).
- Execute `source tarCMSSW.sh` to tar the CMSSW directory & upload it to EOS.
  - During this step, watch the verbose output of the tarball being created to catch any large and unnecessary files being included! If there is extra material, either delete it or exclude that path from the tar process.
- Return to the goldilocks directory (`$CMSSW_BASE/src/Analysis/goldilocks`) and submit batch jobs via `python python/submitBatchJobs.py config/batch.txt`.

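Before running the tar script, it can help to list the largest files in the area that will be packed, so unnecessary material is caught early. The helper below is a minimal sketch and not part of goldilocks. The function name `largest_files` and the default of skipping `.git` directories are assumptions.

```python
import os


def largest_files(root, top_n=10, skip_dirs=(".git",)):
    """Return the top_n largest files under root as (size_in_bytes, path) pairs."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symlinks, which may be broken or point elsewhere
            sizes.append((os.path.getsize(path), path))
    # Largest first
    return sorted(sizes, reverse=True)[:top_n]
```

For example, calling `largest_files(".")` from the directory that contains the CMSSW release returns the ten largest files, which can then be deleted or excluded from the tar process.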
The configuration file must define a certain set of arguments for batch jobs to be submitted successfully.
Here is an example configuration file (`config/batchConfig.txt`):
date today
output_dir eos
eos_path /store/user/demarley/susy/semi-resolved-tagging/goldilocks/qcd_700-1000
subdir qcd_700-1000
eos_tarball_path /store/user/demarley/susy/semi-resolved-tagging/goldilocks/
files config/QCD_HT700to1000_TuneCUETP8M1_13TeV-madgraphMLM-pythia8.txt
username demarley
executable run_training
test False
submit True
verbose True
config config/training.txt
The different options are defined below.
Argument | Description |
---|---|
`date` | The date to be used in the EOS directory for saving the output. Can be any arbitrary string. The default (`today`) uses the current day the jobs are submitted, e.g., `30Apr2018`. |
`output_dir` | Output directory. Only EOS is supported right now. (`eos`) |
`eos_path` | Path to store the samples. (`/store/user/lpctop/ttbarAC/flatNtuples/SingleMuonHv3/$TODAY`). The string `$TODAY` in this setup will be replaced with the current date. |
`subdir` | Sub-directory to place the batch files needed to monitor the jobs, e.g., log, error, out, etc. (`SingleMuonG`) |
`eos_tarball_path` | Where the tarball of goldilocks exists to use in the batch job (`/store/user/lpctop/ttbarAC/flatNtuples/CWoLa/`) |
`files` | The MiniAOD files to access and process into flat ntuples. (`config/officialSamples/listOfSingleMuonHv3Files.txt`) |
`username` | LPC username (`demarley`) |
`executable` | The executable you want to process. There are only two options at the moment. (`run`) |
`test` | `True`/`False`: If `True`, only one job will be submitted to batch, else all jobs will be submitted. |
`submit` | `True`/`False`: If `True`, jobs will be submitted to batch, otherwise all batch scripts are prepared, but none are submitted. |
`verbose` | Lots of print statements -- recommended to be `True` so you can catch crashes. |
`config` | The goldilocks configuration file you want to use with all proper options defined (see `config/cmaConfig.txt`) |

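The `date` default and the `$TODAY` substitution in `eos_path` can be illustrated with a short sketch. The helper names `resolve_date` and `expand_eos_path` are hypothetical, and the `%d%b%Y` format is an assumption chosen to reproduce the `30Apr2018` example (the month abbreviation depends on the active locale).

```python
from datetime import date


def resolve_date(value, today=None):
    """Return the date string for the EOS directory; 'today' means the current day."""
    if today is None:
        today = date.today()
    # Any string other than 'today' is used verbatim
    return today.strftime("%d%b%Y") if value == "today" else value


def expand_eos_path(eos_path, date_string):
    """Replace the literal $TODAY placeholder with the resolved date string."""
    return eos_path.replace("$TODAY", date_string)


stamp = resolve_date("today", today=date(2018, 4, 30))
print(expand_eos_path("/store/user/someuser/ntuples/$TODAY", stamp))
```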
The configuration is processed using `util.py`, which reads the file and converts the `<arg> <value>` information in each line into a `key:value` dictionary.
(Some values are then converted from `string` to `boolean`.)
The arguments then define different attributes of the `BatchSubmission` class in `batch/batchSubmission.py`.
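The parsing described above might look roughly like the following. This is a sketch of the idea, not the code in `util.py`. The function name `read_config` is hypothetical, and skipping blank lines and `#` comments is an assumption.

```python
def read_config(lines):
    """Convert '<arg> <value>' lines into a dictionary, mapping True/False to booleans."""
    options = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        parts = line.split(None, 1)  # split on the first run of whitespace only
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if value in ("True", "False"):
            value = (value == "True")  # string -> boolean
        options[key] = value
    return options


example = ["date today", "output_dir eos", "test False", "submit True"]
print(read_config(example))
```

To parse a file, pass the open file object directly, e.g. `read_config(open("config/batch.txt"))`.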
In the file passed to the `config` option, e.g., `config/cmaConfig.txt`, the output path and input file arguments are not necessary; they will be overwritten by the options presented in the batch configuration file.
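The override behaviour can be pictured as a dictionary merge in which batch-level values win. The key names `output_path` and `input_file` and the function `apply_batch_overrides` are hypothetical placeholders; the actual option names are defined by the goldilocks configuration.

```python
def apply_batch_overrides(framework_cfg, batch_cfg, input_file):
    """Return a copy of the framework options with batch-level values taking precedence."""
    merged = dict(framework_cfg)                    # leave the original untouched
    merged["output_path"] = batch_cfg["eos_path"]   # hypothetical key: output goes to EOS
    merged["input_file"] = input_file               # hypothetical key: one input per job
    return merged


framework = {"output_path": "./", "input_file": "local.root", "verbose": True}
batch = {"eos_path": "/store/user/someuser/output"}
print(apply_batch_overrides(framework, batch, "sample_0.root"))
```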