diff --git a/README.rst b/README.rst index 643d69cb..d59b45c6 100644 --- a/README.rst +++ b/README.rst @@ -20,15 +20,20 @@ protopipe |CI| |codacy| |coverage| |documentation| |doilatest| A pipeline prototype for the `Cherenkov Telescope Array (CTA) `_. - based on the `ctapipe `_ and - `pyirf `__ libraries plus original code -- successfully tested code migrated and imported from each new release -- allows for full-scale analyses on the `DIRAC `__ computing grid + `pyirf `__ libraries plus original code, +- successfully tested code migrated and imported from each new release, +- allows for full-scale analyses on the `DIRAC `__ computing grid thanks to its `interface `__. Resources --------- -- Source code: `GitHub repository `__ -- Documentation (master branch): `GitHub Pages `__ +- Source code (protopipe): `GitHub repository `__ +- Source code (DIRAC grid interface): `GitHub repository `__ +- Documentation: + + - `GitHub Pages `__ (only development version) + - `readthedocs `__ (also latest releases) + - Current performance: `RedMine `__ - Slack channels: diff --git a/docs/index.rst b/docs/index.rst index 9222a1c0..88ac6603 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -22,6 +22,24 @@ Current performance is stored internally at `this RedMine page `__ +- Source code (DIRAC grid interface): `GitHub repository `__ +- Documentation: + + - `GitHub Pages `__ (only development version) + - `readthedocs `__ (also latest releases) + +- Current performance: `RedMine `__ + +- Slack channels: + + - `#protopipe `__ + - `#protopipe_github `__ + - `#protopipe-grid `__ + Citing this software -------------------- diff --git a/docs/install/grid.rst b/docs/install/grid.rst index 669a2edd..74598c3b 100644 --- a/docs/install/grid.rst +++ b/docs/install/grid.rst @@ -1,18 +1,39 @@ .. _install-grid: +================ Grid environment ================ +.. 
contents:: +    :local: + Requirements ------------- +************ + +DIRAC GRID certificate +====================== + +In order to access the GRID utilities you will need a certificate associated with an +account. -* credentials and certificate for using the GRID (follow `these instructions `__.) -* `Vagrant `_ +You can find all necessary information at +`this `_ +Redmine wiki page. + +Source code for the interface +============================= + +.. warning:: +  Usage of the pipeline on an infrastructure different from the DIRAC grid has +  not been fully tested. +  This interface code is **highly** bound to DIRAC, +  but the scripts which manage the download, merging and upload of files +  could be easily adapted to different infrastructures. Getting a released version -------------------------- -You can find the latest released version `here `__ +The latest released version is stored `at this GitHub repository `__ .. list-table:: compatibility between *protopipe* and its interface    :widths: 25 25 @@ -37,15 +58,173 @@ This version is always compatible *only* with the development version of *protop ``git clone https://github.com/HealthyPear/protopipe-grid-interface.git`` +Container and options for containerization +========================================== + +.. note:: +  One of the following containerization choices is required. + +- **Single user working from a personal Linux machine** + +  CTADIRAC can be installed natively on Linux (see `here `_). +  In this case make sure that the protopipe-grid-interface source code +  resides at the same path as protopipe. + +- **Single user working from a personal macOS or Windows machine** + +  The *Docker* container should be enough. + +- **User working on a shared environment (HPC machine or server)** + +  In case you are not allowed to use *Docker* for security reasons, another supported option is *Singularity*. 
+ + - on *Linux*, if you can't install natively make sure that either *Singularity* or *Docker* is available and accessible to your user, + - on *Windows* or *macOS*, if you can't use *Docker* you will need to use *Singularity* via *Vagrant*. +Docker +------ + +The container used by the interface requires the +`installation of Docker `_. + +To enter the container (downloading the image on first use), + +| ``docker run --rm -v $HOME/.globus:/home/dirac/.globus`` +| ``-v $PWD/shared_folder:/home/dirac/shared_folder`` +| ``-v [...]/protopipe-grid-interface:/home/dirac/protopipe-grid-interface`` +| ``-v [...]/protopipe:/home/dirac/protopipe`` +| ``-it ctadirac/client`` + +where ``[...]`` is the path of your source code on the host. +The ``--rm`` flag will erase the container at exit +to save disk space (the data stored in the ``shared_folder`` won't disappear). +Please refer to the Docker documentation for other use cases. + +.. note:: +  In case you are using a released version of *protopipe*, there is no container +  at the moment and the GRID environment based on CTADIRAC still requires Python 2. +  In this case you can link the source code folder from your Python environment +  installation on the host just like you would do with the development +  version (``import protopipe; protopipe.__path__``). + +.. warning:: +  If you are using *macOS* you could encounter some disk space issues. +  Please check `this page `_ and +  `this other page `_ +  on how to manage disk space. + +Vagrant +------- + +.. note:: +  Only required for users who want to use a *Singularity* +  container on a *macOS* or *Microsoft Windows* machine. + +All users, regardless of their operating systems, can use this interface via +`Vagrant `_. + +The *VagrantFile* provided with the interface code allows you to download a virtual +machine in the form of a *Vagrant box* which will host the actual container. + +The user needs to, + +1. copy the ``VagrantFile`` from the interface +2. 
edit lines from 48 to 59 according to the local setup +3. enter the virtual machine with ``vagrant up && vagrant ssh`` + +The *VagrantFile* also automatically creates the ``shared_folder`` +used by the interface to set up the analysis. + +Singularity +----------- + +.. warning:: +  Support for *Singularity* has been dropped by the maintainers of *CTADIRAC*. +  The following solutions have not been tested in all possible cases. + +- **macOS / Microsoft Windows** + +  `Singularity `_ is already installed and ready to use from the *Vagrant box* +  obtained by using the *VagrantFile*. + +- **Linux** + +  Users who do not want to use *Vagrant* will need to have *Singularity* installed +  on their systems and will need to edit their own environment accordingly. + +  For pure-*Singularity* users (i.e. on Linux machines without *Vagrant*) +  bind mounts for *protopipe*, its grid interface and the ``shared_folder`` +  will work in the same way: ``--bind path_on_host:path_on_container``. + +The DIRAC grid certificate should already be available since *Singularity* +mounts the user's home by default. +For more details, please check e.g. +`system-defined bind paths `_. + +Depending on the privileges granted on the host there are 2 ways to get a working container. + +Using the CTADIRAC Docker image +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Method #1** + +Provided you have at least *Singularity 3.3*, you can pull the CTADIRAC Docker image directly from *DockerHub*, +but you will need to use the ``fakeroot`` mode. +This mode grants you root privileges only *inside* the container. + +``singularity build --fakeroot ctadirac_client_latest.sif docker://ctadirac/client`` + +``singularity shell --fakeroot ctadirac_client_latest.sif`` + +``. /home/dirac/dirac_env.sh`` + +**Method #2** + +You shouldn't need root privileges for this to work (not thoroughly tested, though), + +``singularity build --sandbox --fix-perms ctadirac_client_latest.sif docker://ctadirac/client`` + +``singularity shell ctadirac_client_latest.sif`` + +``. /home/dirac/dirac_env.sh`` + +Building the Singularity image +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Support for *Singularity* has been dropped by the maintainers of *CTADIRAC*, +but the recipe for the container has been saved here. + +In this case you won't need to do ``. /home/dirac/dirac_env.sh``: the +commands will already be available in your ``$PATH``. + +.. warning:: +  The recipe ``CTADIRAC_singularity`` is maintained by the author; if any bug arises, +  reverting to the methods described above (if possible) will provide you with a working environment. + +If you have root privileges you can just build your own image with, + +``singularity build ctadirac_client_latest.sif CTADIRAC_singularity`` + +otherwise you have to either, + +- revert to the ``--fakeroot`` mode +  (use it also to enter the container just like the methods above) + +- build the image remotely at ``https://cloud.sylabs.io`` using the ``--remote`` flag +  (for this you will need to interface with that service to generate an access token) + Setup the working environment ------------------------------ - -1. create and enter a folder where to work, -2. copy the ``VagrantFile`` from the interface -3. edit lines from 48 to 59 according to your local setup -4. ``vagrant up && vagrant ssh`` -5. ``singularity pull --name CTADIRAC_with_protopipe.sif shub://HealthyPear/CTADIRAC`` -6. 
``singularity shell CTADIRAC_with_protopipe.sif`` +***************************** + +The CTADIRAC container doesn't provide everything *protopipe* needs, +but this can be solved easily by issuing the following command inside the container's home directory, + +``source protopipe-grid-interface/setup.sh`` + +This will not only install some missing Python packages, +but also provide the convenient environment variables ``$GRID_INTERFACE`` and ``$PROTOPIPE`` +for the source code and check that the DIRAC interface has been properly +installed and initialized. From here, diff --git a/docs/usage/use_grid.rst b/docs/usage/use_grid.rst index 606d4273..4b4a6372 100644 --- a/docs/usage/use_grid.rst +++ b/docs/usage/use_grid.rst @@ -23,22 +23,26 @@ Usage You will work with two different virtual environments: -  - protopipe (Python >=3.5, conda environment) -  - GRID interface (Python 2.7, inside the container). +  - protopipe (Python >=3.7) +  - GRID interface (Python 2.7) + +  Their location and activation will depend on your installation of choice +  (see :ref:`install-grid`). -  Open 1 tab for each of these environments on you terminal so you can work seamlessly between the 2. -  - To monitor the jobs you can use the +  To monitor the jobs you can use the `DIRAC Web Interface `_ 1. **Setup analysis** (GRID environment) -  1. Enter the container -  2. ``python $GRID/create_analysis_tree.py --analysis_name myAnalysis`` +  After entering the container, use the script + +  ``python $GRID_INTERFACE/create_analysis_tree.py`` -  All configuration files for this analysis are stored under ``configs``. -  Throughout these instructions ``$ANALYSIS`` will be a label for the analysis -  path within or outside of the container. +  to create a complete analysis directory depending on your setup. 
+ The script will store and partially edit for you all the necessary + configuration files under the ``configs`` folder, as well as the operational + scripts to download and upload data and model files under ``data`` and + ``estimators`` respectively. .. figure:: ./AnalysisTree.png :width: 250 @@ -47,45 +51,46 @@ Usage 2. **Obtain training data for energy estimation** (GRID environment) 1. edit ``grid.yaml`` to use gammas without energy estimation -    2. ``python $GRID/submit_jobs.py --config_file=grid.yaml --output_type=TRAINING`` +    2. ``python $GRID_INTERFACE/submit_jobs.py --analysis_path=[...]/test_analysis --output_type=TRAINING`` 3. edit and execute ``$ANALYSIS/data/download_and_merge.sh`` once the files are ready 3. **Build the model for energy estimation** (both environments) 1. switch to the ``protopipe environment`` -    2. edit ``regressor.yaml`` -    3. launch the ``build_model.py`` script of protopipe with this configuration file -    4. you can operate some diagnostics with ``model_diagnostic.py`` using the same configuration file -    5. diagnostic plots are stored in subfolders together with the model files -    6. return to the ``GRID environment`` to edit and execute ``upload_models.sh`` from the estimators folder +    2. edit the configuration file of your model of choice +    3. use ``protopipe-MODEL`` with this configuration file +    4. (development users) use the proper benchmarking notebooks under ``docs/contribute/benchmarks`` to check the performance of the generated models +    5. return to the ``GRID environment`` to edit and execute ``upload_models.sh`` from the estimators folder 4. **Obtain training data for particle classification** (GRID environment) 1. edit ``grid.yaml`` to use gammas **with** energy estimation -    2. ``python $GRID/submit_jobs.py --config_file=grid.yaml --output_type=TRAINING`` +    2. ``python $GRID_INTERFACE/submit_jobs.py --analysis_path=[...]/test_analysis --output_type=TRAINING`` 3. 
edit and execute ``$ANALYSIS/data/download_and_merge.sh`` once the files are ready 4. repeat the first 3 points for protons +    5. (development users) use the proper benchmarking notebooks under ``docs/contribute/benchmarks`` to check the estimated energies 5. **Build a model for particle classification** (both environments) 1. switch to the ``protopipe environment`` -    2. edit ``classifier.yaml`` -    3. launch the ``build_model.py`` script of protopipe with this configuration file -    4. you can operate some diagnostics with ``model_diagnostic.py`` using the same configuration file -    5. diagnostic plots are stored in subfolders together with the model files -    6. return to the ``GRID environment`` to edit and execute ``upload_models.sh`` from the estimators folder +    2. edit ``RandomForestClassifier.yaml`` +    3. use ``protopipe-MODEL`` with this configuration file +    4. (development users) use the proper benchmarking notebooks under ``docs/contribute/benchmarks`` to check the performance of the generated models +    5. return to the ``GRID environment`` to edit and execute ``upload_models.sh`` from the ``estimators`` folder 6. **Get DL2 data** (GRID environment) Execute points 1 and 2 for gammas, protons, and electrons separately. -  1. ``python $GRID/submit_jobs.py --config_file=grid.yaml --output_type=DL2`` +  1. ``python $GRID_INTERFACE/submit_jobs.py --analysis_path=[...]/test_analysis --output_type=DL2`` 2. edit and execute ``download_and_merge.sh`` +  3. (development users) use the proper benchmarking notebooks under ``docs/contribute/benchmarks`` to check the quality of the generated DL2 data 7. **Estimate the performance** (protopipe environment) 1. edit ``performance.yaml`` 2. launch the performance script with this configuration file and an observation time +  3. 
(development users) use the proper benchmarking notebooks under ``docs/contribute/benchmarks`` to check the quality of the generated DL3 data Troubleshooting @@ -125,9 +130,16 @@ Something went wrong during the download phase, either because of your network connection (check for possible instabilities) or because of a problem on the server side (in which case the solution is out of your control). -The best approach is: +First let the process finish and eliminate the incomplete merged file, then +the recommended approach is to use DIRAC's command, + +``dirac-dms-directory-sync source destination`` + +where ``source`` is the LFN on DIRAC's FileCatalog and ``destination`` is the +target folder under your analysis directory tree. + +If this doesn't work, a more manual approach is: -- let the process finish and eliminate the incomplete merged file, - go to the GRID, copy the list of files and dump it into e.g. ``grid.list``, - do the same with the local files into e.g. ``local.list``, - do ``diff <(sort local.list) <(sort grid.list)``, diff --git a/protopipe/aux/example_config_files/analysis.yaml b/protopipe/aux/example_config_files/analysis.yaml index 7eb34fb3..9175714c 100644 --- a/protopipe/aux/example_config_files/analysis.yaml +++ b/protopipe/aux/example_config_files/analysis.yaml @@ -13,11 +13,12 @@ General: cam_id_list : ['LSTCam', 'NectarCam'] # Selected cameras (disabled option) Calibration: - # factor to transform the integrated charges (in ADC counts) into number of - # photoelectrons - # the pixel-wise one calculated by simtelarray is 0.92 - calib_scale: 0.92 + # for a CTAMARS-like analysis disable integration correction apply_integration_correction: false + # factor to transform the integrated charges (in ADC counts) into number of + # photoelectrons (on top of the DC-to-PHE factor!) 
+ # the pixel-wise one calculated by simtelarray is 0.92 for CTAMARS + calibscale: 0.92 # Cleaning for reconstruction ImageCleaning: @@ -93,7 +94,7 @@ ImageCleaning: # Cut for image selection ImageSelection: source: "extended" # biggest or extended - charge: [55., 1e10] + charge: [50., 1e10] pixel: [3, 1e10] ellipticity: [0.1, 0.6] nominal_distance: [0., 0.8] # in camera radius diff --git a/protopipe/pipeline/event_preparer.py b/protopipe/pipeline/event_preparer.py index 0a0cd352..426e37d1 100644 --- a/protopipe/pipeline/event_preparer.py +++ b/protopipe/pipeline/event_preparer.py @@ -992,10 +992,7 @@ def prepare_event(self, source, return_stub=True, save_images=False, debug=False tel_tilted = tel_ground.transform_to(tilted_frame) # but this not - core_tilted = SkyCoord(x=core_ground.x, - y=core_ground.y, - frame=tilted_frame - ) + core_tilted = core_ground.transform_to(tilted_frame) impact_dict_reco[tel_id] = np.sqrt( (core_tilted.x - tel_tilted.x) ** 2