MicroBooNE_Reco_Processing_Example_MCC8.md

Lynn Garren edited this page Jun 13, 2023 · 1 revision

MicroBooNE Reco Processing Example MCC8

These instructions are for running reconstruction on samples going into the MicroBooNE Cincinnati Workshop 2017. They do not cover all of the options that are possible within project.py and most importantly do not cover information about how to generate custom samples (through either custom fcl or modules). There are some general comments about grid submission snuck into the instructions, so enjoy and learn!

There is a template file on which to base your project xml file, located here:

    /uboone/app/users/kirby/single_particle_mcc8_template/test_reco_dataset_template.xml

Copy this file into your app volume area:

    cp /uboone/app/users/kirby/single_particle_mcc8_template/test_reco_dataset_template.xml \
       /uboone/app/users/$USER/<name_your_working_area>/<name_your_file>.xml

Note that the /uboone/app/users/ area should be used to store configuration files and uboonecode builds, but should never be used to store data files. At the top of the xml file you will find this section, which must be completed by the analyzer:

    <!DOCTYPE project [
    <!ENTITY user_id "put your user name here">
    <!-- e.g. kirby, mrmooney, etc. this is the same as your kerberos principal UID and FNAL email -->
    <!ENTITY number_of_jobs "number of jobs to run">
    <!-- This has to be equal to or less than the number of files in your dataset. check with samweb list-definition-files \-\-summary <your_defname> -->
    <!-- It should also be equal to the number of events you want divided by 50. so if you want 10K, set to 200. But read the line above again first. -->
    <!ENTITY defname "SAM dataset name">
    <!-- name of the sample that you are running reco over -->
    <!ENTITY name "test_&defname;">
    <!-- this is the test_<defname> from above -->
    <!-- Note that the name will be used for the name of output files so please use something reasonable -->

    <!-- Examples are here:

    <!ENTITY user_id "kirby">
    <!ENTITY number_of_jobs "2"> <- I'm setting this to 2 for testing. it would normally be something like 200
    <!ENTITY defname "prod_muminus_0-2.0GeV_isotropic_uboone_mcc8_detsim">
    <!ENTITY name "test_&defname;">

    -->
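
The template comments give two constraints on number_of_jobs: it should be the number of events you want divided by 50, and it must not exceed the number of files in the dataset. As a minimal sketch of that arithmetic, the hypothetical helper below (the name jobs_for_events is an illustration, not part of project.py or larbatch) rounds up and then caps at the file count, which you would read off the samweb summary yourself:

```shell
# Hypothetical helper (not part of project.py): suggest a value for number_of_jobs.
# Assumes 50 events per job, as stated in the template comments.
jobs_for_events() (
    events_wanted=$1
    files_in_dataset=$2
    events_per_job=50
    # Round up so a partial batch of events still gets its own job.
    jobs=$(( ($events_wanted + $events_per_job - 1) / $events_per_job ))
    # Never request more jobs than the dataset has files.
    if [ "$jobs" -gt "$files_in_dataset" ]; then
        jobs=$files_in_dataset
    fi
    echo "$jobs"
)

jobs_for_events 10000 500   # prints 200
jobs_for_events 10000 120   # prints 120 (capped by the file count)
```

The second call shows the cap taking effect: with only 120 files you cannot run 200 jobs, so you get fewer events than requested.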

You really shouldn't change the fcl files in this xml file. The only things you should change are the user id, the number of jobs, and the SAM dataset definition. You can, however, change whether or not your dataset had spacecharge or DDR on.

NOTE!!!! DO NOT IGNORE THIS and keep track of these settings for your sample!!!

Once you have edited your file, you need to make sure that the staging and output directories are ready. You create those directories with these commands:

    mkdir -p /pnfs/uboone/scratch/users/$USER #This creates a directory in the dCache scratch area
    mkdir -p /uboone/data/users/$USER #this creates a directory in the BlueArc data volume

Note that the dCache scratch area does not have a quota, but files have a limited lifetime before they are flushed from the volume (usually ~30 days), and there is NO guarantee that files will be stored permanently. The BlueArc data volume, by contrast, has a 1.5 TiB quota for each user, but the storage is permanent. For full details look here: https://redmine.fnal.gov/redmine/projects/fife/wiki/Understanding_storage_volumes
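
Since scratch files can disappear, it can help to see which ones have been sitting there for a while. The sketch below is a hypothetical helper (the name old_scratch_files and the 20-day default are assumptions, not a dCache policy) that lists files by modification time; modification time is only a rough proxy for how close a file is to being flushed:

```shell
# Hypothetical helper: list regular files under a directory that were last
# modified more than a given number of days ago (default 20, an arbitrary margin).
old_scratch_files() (
    dir=$1
    days=${2:-20}
    find "$dir" -type f -mtime +"$days"
)

# Example call (commented out so the snippet is safe to source anywhere):
# old_scratch_files "/pnfs/uboone/scratch/users/$USER" 20
```

Anything this prints is a candidate for moving to permanent storage sooner rather than later.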

Now you're ready to submit jobs. First set up the environment with a current version of uboonecode so that you get a current version of the larbatch UPS product.

    source /cvmfs/uboone.opensciencegrid.org/products/setup_uboone.sh
    setup uboonecode v06_26_01 -q e10:prof
    cd  /uboone/app/users/$USER/<name_your_working_area>/
    project.py --xml `pwd`/test_muminus_reco.xml --stage reco --submit
    #this submits the "reco" stage of the project which is reco1+reco2
    jobsub_q --user=$USER

The last command checks that the jobs have been submitted. You should see a list of jobs with status I (idle), and the number of jobs listed should equal the xml variable number_of_jobs. Once all the jobs are complete, run this command:

    project.py --xml `pwd`/test_muminus_reco.xml --stage reco --check
    #this checks the "reco" stage files to make sure they were produced successfully

If there are no errors, then you can continue to the mergeana stage.

    project.py --xml `pwd`/test_muminus_reco.xml --stage mergeana --submit
    #this submits the "mergeana" stage of the project which produces AnaTree files
    project.py --xml `pwd`/test_muminus_reco.xml --stage mergeana --checkana
    #NOTE CHECKANA!!!!! this checks the "mergeana" stage files to make sure they were produced successfully

You should now have files in the following locations within /pnfs/uboone/scratch (again, these files have a finite lifetime since they are in a scratch area!). Note that the paths below use the xml variables (e.g. &tag;), so you will have to translate them:

    /pnfs/uboone/scratch/users/&user_id;/&tag;/&relreco1;/reco/&name;
    /pnfs/uboone/scratch/users/&user_id;/&tag;/&relreco2;/mergeana/&name;
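
To illustrate the translation, here is a small sketch that fills in the path layout from shell arguments. The helper name stage_output_dir is hypothetical, and the values (kirby, mcc8, v06_26_01, test_muminus_reco) are simply the ones used in the worked example on this page; your own xml file determines the real ones:

```shell
# Hypothetical helper: expand the xml entity layout into a concrete path.
# Arguments: user_id, tag, release, stage, name
stage_output_dir() {
    printf '/pnfs/uboone/scratch/users/%s/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

stage_output_dir kirby mcc8 v06_26_01 reco test_muminus_reco
# prints /pnfs/uboone/scratch/users/kirby/mcc8/v06_26_01/reco/test_muminus_reco
stage_output_dir kirby mcc8 v06_26_01 mergeana test_muminus_reco
# prints /pnfs/uboone/scratch/users/kirby/mcc8/v06_26_01/mergeana/test_muminus_reco
```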

We need to move them to permanent storage. We will do this using the SAM4Users utilities (http://microboone-docdb.fnal.gov:8080/cgi-bin/ShowDocument?docid=6896). These commands will have to be translated slightly, but hopefully you understand what is being done. First we will make text files that contain the full paths to the files we've generated:

    ###ls /pnfs/uboone/scratch/users/&user_id;/&tag;/&relreco1;/reco/&name;/*/&name;*.root > /uboone/app/users/$USER/&name;_filelist.txt
    #this has to be translated and using the example above would become
    ls /pnfs/uboone/scratch/users/kirby/mcc8/v06_26_01/reco/test_muminus_reco/*/test_muminus_reco*.root > /uboone/app/users/$USER/test_muminus_reco_filelist.txt
    ls /pnfs/uboone/scratch/users/kirby/mcc8/v06_26_01/mergeana/test_muminus_reco/*/ana*.root > /uboone/app/users/$USER/test_muminus_mergeana_filelist.txt
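
If a glob matches nothing, ls reports an error instead of a path, so it is worth checking a file list before declaring it to SAM. The following is a minimal sketch of such a check; check_filelist is a hypothetical helper, not part of fife_utils:

```shell
# Hypothetical helper: sanity-check a file list before declaring it to SAM.
# Succeeds only if the list is non-empty and every line names an existing file.
check_filelist() (
    list=$1
    [ -s "$list" ] || { echo "empty or missing list: $list" >&2; exit 1; }
    status=0
    while IFS= read -r path; do
        if [ ! -f "$path" ]; then
            echo "not a file: $path" >&2
            status=1
        fi
    done < "$list"
    exit "$status"
)

# Example call (commented out so the snippet is safe to source anywhere):
# check_filelist "/uboone/app/users/$USER/test_muminus_reco_filelist.txt"
```

A non-zero exit status means the list needs to be regenerated or fixed by hand before going further.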

Now we're going to set up the FIFE utils UPS product, declare those files to SAM4Users, and then copy them to the tape-backed archive. But you must first come up with a dataset name. I recommend ${USER}_mcc8_&name;_v1, incrementing the v1 on the end if you regenerate. So this looks like:

    source /cvmfs/uboone.opensciencegrid.org/products/setup_uboone.sh
    setup uboonecode v06_26_01 -q e10:prof
    setup fife_utils
    ##sam_add_dataset -n ${USER}_mcc8_&name;_v1 -f /uboone/app/users/$USER/&name;_filelist.txt
    sam_add_dataset -n ${USER}_mcc8_test_muminus_reco_v1 -f /uboone/app/users/$USER/test_muminus_reco_filelist.txt
    sam_move2archive_dataset -n ${USER}_mcc8_test_muminus_reco_v1
    sam_add_dataset -n ${USER}_mcc8_test_muminus_mergeana_v1 -f /uboone/app/users/$USER/test_muminus_mergeana_filelist.txt
    sam_move2archive_dataset -n ${USER}_mcc8_test_muminus_mergeana_v1
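
One shell detail is easy to trip over when building these dataset names: underscores are valid in variable names, so a user name followed directly by an underscore needs braces around USER, otherwise the shell looks up a different (usually empty) variable. A small sketch, using a hypothetical helper dataset_name that also handles the version suffix:

```shell
# Hypothetical helper: build a dataset name of the form <user>_mcc8_<name>_v<N>.
# Arguments: name, optional version number (defaults to 1).
# printf keeps the user name separate from the literal "_mcc8_" text.
dataset_name() {
    printf '%s_mcc8_%s_v%s\n' "${USER}" "$1" "${2:-1}"
}

# Example calls (output depends on your $USER):
# dataset_name test_muminus_reco      -> <user>_mcc8_test_muminus_reco_v1
# dataset_name test_muminus_reco 2    -> <user>_mcc8_test_muminus_reco_v2
```

Passing a higher version number is how you would follow the advice to increment the suffix when regenerating a sample.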

At that point, the files are removed from dCache scratch space and moved to tape-backed, permanent storage. You will now need to access them through the SAM dataset definitions.