Migration to Stage1 format #100

TjarkMiener · 2021-03-01T14:22:44Z

Hi all,

This PR adds the migration to the stage1 format. The code can be tested with the stage1 notebook. PR #97 has been rebased into this branch.

Changes w.r.t. the current (master) reader:

Parent class for shared code. Each format has its own subclass with different example identifiers. However, the "output" of the reader (the __getitem__ return value) is generic.
Option to read/store the example identifiers in a pandas hdf5 file.
Different pointing modes supported.
Read images and/or image parameters.
Read cleaned images.
Construct the pyIRF simulated Table.
Only the 'mono' and 'stereo' reading modes are kept. The previous multi-stereo mode can be obtained with the stereo mode by selecting several telescope types.
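To illustrate the first point of the list above, here is a minimal sketch of a parent class with format-specific subclasses. All class names, method names and identifier layouts are hypothetical and are not taken from dl1_data_handler; the sketch only shows how different example identifiers can sit behind a generic __getitem__ return.

```python
from abc import ABC, abstractmethod


class BaseReader(ABC):
    """Hypothetical parent class holding the code shared by all formats."""

    def __init__(self, events):
        # `events` stands in for the content of the opened input files.
        self.events = events
        self.example_identifiers = self._build_example_identifiers()

    @abstractmethod
    def _build_example_identifiers(self):
        """Each format defines its own identifier layout."""

    @abstractmethod
    def _load_example(self, identifier):
        """Each format knows how to resolve its own identifiers."""

    def __len__(self):
        return len(self.example_identifiers)

    def __getitem__(self, idx):
        # The returned structure is the same for every subclass.
        image, label = self._load_example(self.example_identifiers[idx])
        return [image, label]


class Stage1LikeReader(BaseReader):
    """Hypothetical subclass: identifiers are (file_idx, table_row, tel_id)."""

    def _build_example_identifiers(self):
        return [(0, row, ev["tel_id"]) for row, ev in enumerate(self.events)]

    def _load_example(self, identifier):
        _, row, _ = identifier
        return self.events[row]["image"], self.events[row]["label"]


class LegacyLikeReader(BaseReader):
    """Hypothetical subclass: identifiers are (file_idx, event_index)."""

    def _build_example_identifiers(self):
        return [(0, i) for i in range(len(self.events))]

    def _load_example(self, identifier):
        _, i = identifier
        return self.events[i]["image"], self.events[i]["label"]


events = [
    {"tel_id": 1, "image": [0.1, 0.2], "label": 0},
    {"tel_id": 2, "image": [0.3, 0.4], "label": 1},
]
stage1_reader = Stage1LikeReader(events)
legacy_reader = LegacyLikeReader(events)
```

Both readers return the same structure for the same index, although their example_identifiers differ.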

Add the "image_selection_from_file" parameter. It is a dict in which one cut can be stored (at the moment). If "image_selection_from_file" is present, "image_selection" must not be present, because the former is for the new h5 file format and the latter for the old one.
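The rule stated in this commit message could be checked with something like the sketch below. The helper name and the layout of the cut dict are assumptions made for illustration only; the actual config handling in the reader may look different.

```python
def resolve_image_selection(config):
    """Return (key, cuts) for whichever selection block is configured.

    Hypothetical helper: it only mirrors the rule that
    "image_selection_from_file" and "image_selection" are mutually exclusive.
    """
    new_format = config.get("image_selection_from_file")
    old_format = config.get("image_selection")

    if new_format is not None and old_format is not None:
        raise ValueError(
            "'image_selection_from_file' (new h5 format) and "
            "'image_selection' (old format) must not both be present"
        )
    if new_format is not None:
        # Only one cut can be stored at the moment.
        if len(new_format) > 1:
            raise ValueError("'image_selection_from_file' holds one cut only")
        return "image_selection_from_file", new_format
    if old_format is not None:
        return "image_selection", old_format
    return None, {}


# The cut layout below is an assumed example, not the real schema.
key, cuts = resolve_image_selection(
    {"image_selection_from_file": {"hillas_intensity": 50.0}}
)
```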

Read the selected parameters from the config file (ctlearn config) and append them to the example_identifiers. Each entry is now built with example_identifiers.append((filename, nrow, image_index, tel_id, temp_list)), where temp_list holds the parameters required for this single image. For example, with selected_parameters = ['hillas_intensity', 'leakage_intensity_2'], an entry looks like:

('/mnt/lromanato/anaconda3/envs/LST37/CTA/Prod5h5/TJARK/gamma_20deg_180deg_run9___cta-prod5-lapalma_LST1_desert-2158m_LST1_mono_cone6.h5', 708, 306, 1, [677.5469970703125, 0.0])
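A minimal sketch of how such entries could be assembled is shown below. The function name and the rows argument (a list of dicts standing in for rows of a parameter table) are hypothetical; only the tuple layout (filename, nrow, image_index, tel_id, temp_list) comes from the commit message.

```python
def build_example_identifiers(filename, rows, selected_parameters):
    """Collect one identifier per image, carrying its selected parameters."""
    example_identifiers = []
    for nrow, row in enumerate(rows):
        # temp_list holds the parameters required for this single image,
        # in the same order as selected_parameters.
        temp_list = [row[name] for name in selected_parameters]
        example_identifiers.append(
            (filename, nrow, row["image_index"], row["tel_id"], temp_list)
        )
    return example_identifiers


# Toy rows standing in for a parameter table read from an h5 file.
rows = [
    {"image_index": 306, "tel_id": 1,
     "hillas_intensity": 677.5, "leakage_intensity_2": 0.0},
    {"image_index": 307, "tel_id": 1,
     "hillas_intensity": 120.25, "leakage_intensity_2": 0.5},
]
identifiers = build_example_identifiers(
    "gamma.h5", rows, ["hillas_intensity", "leakage_intensity_2"]
)
```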

Example: [{'name': 'image', 'tel_type': 'LST_LST_LSTCam', 'base_name': 'image', 'shape': (114, 114, 2), 'dtype': dtype('float32')}, {'name': 'particletype', 'tel_type': None, 'base_name': 'shower_primary_id', 'shape': (), 'dtype': dtype('int8')}, {'name': 'parameter', 'tel_type': None, 'base_name': 'hillas_intensity', 'shape': (), 'dtype': dtype('float32')}, {'name': 'parameter', 'tel_type': None, 'base_name': 'leakage_intensity_2', 'shape': (), 'dtype': dtype('float32')}]. Renamed to training_parameters because these will be the parameters required for the NN training.

Add the col_name to the parameter name. This way, all the selected parameters are present in the ctlearn features. Example: features: {'image': <tf.Tensor 'IteratorGetNext:0' shape=<unknown> dtype=float32>, 'parameter_hillas_intensity': <tf.Tensor 'IteratorGetNext:2' shape=<unknown> dtype=float32>, 'parameter_leakage_intensity_2': <tf.Tensor 'IteratorGetNext:3' shape=<unknown> dtype=float32>}
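As a rough sketch of the naming scheme, the snippet below derives one unique name per entry of an example description. The function name is hypothetical, and the description entries are trimmed to the two keys the sketch needs; the 'parameter_' prefix follows the feature names quoted in this commit message.

```python
def unique_entry_names(example_description):
    """Give every entry a unique name by appending the column name
    (base_name) to entries that are all called 'parameter'."""
    names = []
    for entry in example_description:
        if entry["name"] == "parameter":
            names.append("parameter_" + entry["base_name"])
        else:
            names.append(entry["name"])
    return names


# Trimmed-down description with only the keys used by this sketch.
example_description = [
    {"name": "image", "base_name": "image"},
    {"name": "particletype", "base_name": "shower_primary_id"},
    {"name": "parameter", "base_name": "hillas_intensity"},
    {"name": "parameter", "base_name": "leakage_intensity_2"},
]
names = unique_entry_names(example_description)
```

Without the suffix, both parameter entries would share the name 'parameter' and one would overwrite the other in a dict keyed by name.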

added pyIRF simulation table
added pyirf to setup.py
removed ctapipe from setup.py
removed multi-stereo mode and support this feature in the stereo mode by calling more than one telescope type
parameter selection cuts based on the parameter tables and not on the image itself
added cleaned images
read images and/or parameters
added multiplicity cut on the subarray
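For readers unfamiliar with the term, a multiplicity cut on the subarray can be sketched as follows. The function name and arguments are hypothetical; the sketch assumes the cut counts only triggered telescopes that belong to the selected subarray, which may differ from the actual implementation.

```python
def passes_multiplicity_cut(triggered_tel_ids, selected_tel_ids, min_multiplicity):
    """Keep an event only if enough telescopes of the selected subarray
    took part in it."""
    n_selected = len(set(triggered_tel_ids) & set(selected_tel_ids))
    return n_selected >= min_multiplicity


# Toy events: lists of triggered telescope ids.
events = [[1, 2, 5], [5], [1, 3, 4]]
subarray = [1, 2, 3, 4]
kept = [ev for ev in events if passes_multiplicity_cut(ev, subarray, 2)]
```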

…ression in MAGIC

As discussed in the call, it is better to reconstruct the SrcPosX/Y in the camera for the arrival direction regression. This is a dirty fix for the time being. We need to come up with a more appropriate solution in ctapipe_io_magic for this regression task (under discussion).

It will automatically detect the split that was used in the ctapipe-stage1/ctapipe-merge tool and deal with the different 'Group' names. The tables themselves have the same structure.
Get rid of some local variables that occupied memory.
Store the unshuffled example_identifiers to disk and shuffle them afterwards.
Minor bug fix: read *_TRANSFORM_SCALE from the first table and not from the table tel_001. Previously it broke when LST1 was not in the file.
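Two of the points above can be sketched in a few lines. Both helpers are hypothetical, and the table_attrs dict is only a stand-in for the attributes of the telescope tables in an h5 file; the attribute name image_TRANSFORM_SCALE is an assumed instance of the *_TRANSFORM_SCALE pattern.

```python
import random


def shuffled_copy(example_identifiers, seed):
    """Shuffle a copy, so the stored (unshuffled) list stays untouched."""
    shuffled = list(example_identifiers)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def attr_from_first_table(table_attrs, attr_name):
    """Read an attribute from the first available table instead of a
    hard-coded one such as tel_001, which may be absent from the file."""
    first_table = sorted(table_attrs)[0]
    return table_attrs[first_table][attr_name]


stored = [("file.h5", i) for i in range(10)]  # what would be written to disk
shuffled = shuffled_copy(stored, seed=1234)

# A file without tel_001 (e.g. no LST1) no longer breaks the lookup.
table_attrs = {
    "tel_002": {"image_TRANSFORM_SCALE": 10.0},
    "tel_003": {"image_TRANSFORM_SCALE": 10.0},
}
scale = attr_from_first_table(table_attrs, "image_TRANSFORM_SCALE")
```

Storing the unshuffled list means the same file on disk can be reused with different seeds.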

TjarkMiener · 2021-12-09T16:58:21Z

Hi @nietootein. This PR has been pending for a while and has already been used to produce the results for 2109.05809 and 2112.01828. For reproducibility purposes, can we merge this PR and release a new version with known issues (see #104)?

This PR requires CTLearn v0.5.1 #136

nietootein · 2021-12-09T17:02:30Z

Sorry for the latency, @TjarkMiener. PR merged.

LucaRomanato and others added 30 commits November 26, 2020 08:05

Streamline filters and reader code

91d5688

fix morphology value

2feff94

from Unsigned Int to Signed Int

Add all the DataContainer filters

5cd36b5

Streamline code + image_selection for new format

7a04564

Use "image_selection" also for the new format in order to use the intensity cut without implementing the same filter for image_selection_from_file

Merge pull request #1 from cta-observatory/hillas

9542607

update

support only new format

29e3781

Streamline filters

fa4ebbe

delete the unused parameter algorithm in the filters and streamline duplicate code

Update filters.py

6ee3058

stay consistent

streamline _select_image_from_file code

e13ef08

update __getitem__ for parameters

f1ee830

reader.py now also reads the selected parameters (i.e. training_parameters) of the image. This fixes the ctlearn bug of the previous commit

fix bug if image_selection_from_file or training_parameters are not p…

5246a4a

…resent also tested and fixed for all possible combinations (i.e. both present, neither present, one present and vice versa)

Update reader.py

e993e91

change position attribute and event loop in __getitem__

update path event_intensity_filter

5c745fa

Add legend information

4929d83

exclude algorithms for filter_function

7cd3078

replace pop() with a non-destructive method
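The commit message does not say which container pop() was called on, so the following is only a generic sketch of the difference, with hypothetical function names and a hypothetical example dict.

```python
def split_label_destructive(example):
    """pop() removes the key from the caller's dict as a side effect."""
    label = example.pop("label")
    return example, label


def split_label(example):
    """Non-destructive variant: the caller's dict is left unchanged."""
    label = example["label"]
    features = {key: value for key, value in example.items() if key != "label"}
    return features, label


example = {"image": [0.1, 0.2], "label": 1}
features, label = split_label(example)

other = {"image": [0.3, 0.4], "label": 0}
split_label_destructive(other)  # `other` has lost its "label" key now
```

With the non-destructive variant, the same example can be read repeatedly, for instance over several epochs, without losing entries.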

Preparing reader.py for future works

c4e3a36

Preparing reader for parameters in stereo mode. Need to understand which value to pass to image_index at line 523

Improving Reader performance

1697589

Update log in writer

b4975e9

Update reader.py

9196f96

fix reader.py

4cadbfc

fix a little distraction

fix np.log10 Hillas_log_intensity

72a7b77

notebook for reader demo

b27c253

New notebook to show how the new reader works. At the moment, image parameters can be successfully loaded only in mono mode and can be selected in the ctlearn yml file.

update event_intensity_filter to new format

0fab6af

subclassing current dl1dh reader

bade4a1

adopted ctapipe functionality to construct the reader

544bcc0

Speed ups due to astropy tables
Different pointing modes. Pointing modes over time are not working at the moment due to ctapipe issues 1484 & 1562
Update stage1 dl1 reading notebook

TjarkMiener added 3 commits April 9, 2021 13:01

convert back to floating point

9cc6238

The stage1 tool compresses the image and peak_time columns to integer values. This commit converts them back to floating-point values if compression was used.
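A minimal sketch of such a back-conversion is given below. It assumes a fixed-point scheme in which the stored integer equals value * scale + offset, with scale and offset read from column attributes such as *_TRANSFORM_SCALE; the function name and the exact convention are assumptions and should be checked against the writer that produced the file.

```python
def decompress_column(stored, scale=None, offset=0):
    """Convert stored values back to floats.

    Assumed convention: stored = value * scale + offset.
    If no scale is given, the column was not compressed.
    """
    if scale is None:
        return [float(value) for value in stored]
    return [(value - offset) / scale for value in stored]


# Toy column: two pixel charges stored as integers with scale 100.
image = decompress_column([125, 250], scale=100.0)

# An uncompressed column is passed through as floats.
peak_time = decompress_column([12.5, 13.0])
```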

adjusted the transforms functions

bb33e5f

moved ctapipe imports in DL1ReaderStage1 class

ec98582

maxnoe reviewed Apr 9, 2021

dl1_data_handler/reader.py (outdated, resolved)

TjarkMiener and others added 21 commits April 9, 2021 14:18

removed try except from ctapipe import

d31c7f2

Add bool self.mc

90a1784

Read run number directly from root file

a687f70

Add a container for real MAGIC data

94b8850

Change the name of mcheader to just header

32ad9c4

Put superstar MC data behind an if statement

0a874d8

Update _parse_header() to work with real data

736034e

Add a decoder for uproot ASCII arrays

7c6c78f

Comment out MC/real_data print statements.

6303bfa

added energy range to meta data

20e0d14

Merge pull request #101 from astrojarred/stage-1-PR

f36d564

Add ability to write non-MC MAGIC data

Merge branch 'stage1' into magic_data

a1d410f

Merge pull request #102 from cta-observatory/magic_data

97ae825

magic_data

updated transform function to stage1

4f2b249

Take into account the north pointing
Take the number of showers for pyirf from the shower distribution table. Some stage1 files (very few though) are missing this metadata.

Merge branch 'stage1' of https://github.com/cta-observatory/dl1-data-…

985c8a5

…handler into stage1

update notebooks

f72fecd

bug fixed

0ecd83e

Wrong index was used

close example identifiers file

e8e958f

This feature can now be used with several processes.

fix the sorting of the images

8690b57

TjarkMiener mentioned this pull request Dec 3, 2021

Dl1dh 0.10.0 ctlearn-project/ctlearn#136

Merged

nietootein merged commit 6fa5949 into master Dec 9, 2021

TjarkMiener deleted the stage1 branch January 13, 2022 15:47
