This directory is designed to download images from the JASMIN object store and perform inference to:
- detect objects
- classify objects as moth or non-moth
- identify the order
This branch is designed to save crops of beetles.
To use this pipeline on JASMIN you must have access to the following services:
- Login Services: jasmin-login. This provides access to the JASMIN shared services, i.e. login, transfer, scientific analysis servers, Jupyter notebook and LOTUS.
- Object Store: ami-test-o. This is the data object store tenancy for the Automated Monitoring of Insects Trap.
The JASMIN documentation provides useful information on how to get set up with these services, including how to:
- Generate an SSH key
- Get a JASMIN portal account
- Request “jasmin-login” access (access to the shared JASMIN servers and the LOTUS batch cluster)
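For example, an SSH key can be generated locally with ssh-keygen; the key type and file name below are only one common choice, so follow the JASMIN documentation for their recommended settings:

# Generate an SSH key pair to upload to your JASMIN profile (key type and path are illustrative)
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_jasmin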
You will need to add the model files to the ./models subdirectory. Following this you can pass in:
- binary_model_path: The path to the binary (moth/non-moth) model weights
- order_model_path: The path to the order model weights
- order_threshold_path: The path to the order model thresholds file
- localisation_model_path: The path to the localisation (object detection) model weights
AMBER team members can find these files on OneDrive. Others can contact Katriona Goldmann for the model files.
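For example, the paths could be passed on the command line as sketched below. This assumes the flag names of 04_process_chunks.py match the parameter names listed above; the file names are placeholders, not the actual model files, and other arguments are omitted:

python 04_process_chunks.py \
    --binary_model_path ./models/<binary_model_weights>.pt \
    --order_model_path ./models/<order_model_weights>.pt \
    --order_threshold_path ./models/<order_thresholds_file> \
    --localisation_model_path ./models/<localisation_model_weights>.pt \
    ...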
There are several object detection models which can be used in this analysis. These have varying recommended confidence thresholds for defining object bounding boxes. The box threshold can be altered using the --box_threshold argument in 04_process_chunks.py. The table below outlines the recommended thresholds for some models:
| Model file name | Recommended box threshold |
|---|---|
| v1_localizmodel_2021-08-17-12-06.pt (Default) | 0.99 (Default) |
| fasterrcnn_resnet50_fpn_tz53qv9v.pt | 0.8 |
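For example, to run with the Faster R-CNN detector at its recommended threshold (other arguments omitted; see the full 04_process_chunks.py example further down):

python 04_process_chunks.py \
    --localisation_model_path ./models/fasterrcnn_resnet50_fpn_tz53qv9v.pt \
    --box_threshold 0.8 \
    ...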
Once you have access to JASMIN, you will need to install miniforge to run conda. Then create a conda environment and install the required packages:
# Tilde does not expand inside double quotes, so use $HOME for the environment path
CONDA_ENV_PATH="$HOME/conda_envs/moth_detector_env/"
source ~/miniforge3/bin/activate
conda create -p "${CONDA_ENV_PATH}" python=3.9
conda activate "${CONDA_ENV_PATH}"
conda install pytorch torchvision torchaudio cpuonly -c pytorch
conda install --yes --file requirements.txt
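As a quick sanity check that the environment was created correctly, you can confirm the CPU-only PyTorch build imports (this assumes the installs above completed without errors):

# Should print the installed PyTorch version without raising ImportError
python -c "import torch; print(torch.__version__)"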
To use the inference scripts you will need to set up a credentials.json file containing:
{
  "AWS_ACCESS_KEY_ID": "<SECRET>",
  "AWS_SECRET_ACCESS_KEY": "<SECRET>",
  "AWS_REGION": "<SECRET>",
  "AWS_URL_ENDPOINT": "<SECRET>",
  "UKCEH_username": "<SECRET>",
  "UKCEH_password": "<SECRET>",
  "directory": "./inferences/data"
}
Contact Katriona Goldmann for the AWS Access and UKCEH API configs.
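Before running the inference scripts, it can be worth confirming that credentials.json is valid JSON and lists the expected keys. This check is only a suggestion and is not part of the pipeline:

# Prints the key names if the file parses; fails with an error if the JSON is malformed
python -c "import json; print(sorted(json.load(open('credentials.json')).keys()))"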
Load the conda environment on JASMIN:
source ~/miniforge3/bin/activate
conda activate "$HOME/conda_envs/moth_detector_env/"
or on Baskerville:
module load bask-apps/live
module load CUDA/11.7.0
module load Python/3.9.5-GCCcore-10.3.0
module load Miniforge3/24.1.2-0
eval "$(${EBROOTMINIFORGE3}/bin/conda shell.bash hook)"
source "${EBROOTMINIFORGE3}/etc/profile.d/mamba.sh"
mamba activate "/bask/projects/v/vjgo8416-amber/moth_detector_env"
The multi-core pipeline is run in several steps:
- List all available deployments
- Generate key files
- Chop the keys into chunks
- Analyse the chunks
To find information about the available deployments you can use the 01_print_deployments.py script. For all deployments:
python 01_print_deployments.py --include_inactive
or for the UK only:
python 01_print_deployments.py \
--subset_countries 'United Kingdom'
To generate a key file listing the images for a given bucket and deployment:
python 02_generate_keys.py --bucket 'gbr' --deployment_id 'dep000072' --output_file './keys/solar/dep000072_keys.txt'
To chop the keys into chunks for processing:
python 03_pre_chop_files.py --input_file './keys/solar/dep000072_keys.txt' --file_extensions 'jpg' 'jpeg' --chunk_size 100 --output_file './keys/solar/dep000072_workload_chunks.json'
For a single chunk:
python 04_process_chunks.py \
--chunk_id 1 \
--json_file './keys/solar/dep000072_workload_chunks.json' \
--output_dir './data/solar/dep000072' \
--bucket_name 'gbr' \
--credentials_file './credentials.json' \
--csv_file 'dep000072.csv' \
--localisation_model_path ./models/fasterrcnn_resnet50_fpn_tz53qv9v.pt \
--species_model_path ./models/turing-uk_v03_resnet50_2024-05-13-10-03_state.pt \
--species_labels ./models/03_uk_data_category_map.json \
--perform_inference \
--remove_image \
--save_crops
If running using slurm, we typically write each chunk to an individual csv to ensure the outputs do not overwrite one another. To combine these into one file, run:
python 05_combine_outputs.py \
--csv_file_pattern "./data/solar/gbr/dep000072_*.csv" \
--main_csv_file "./data/solar/gbr/dep000072.csv" \
--remove_chunk_files
To run with slurm you need to be logged in on the scientific nodes.
It is recommended that you set up a shell script to run the pipeline for your country and deployment of interest. For example, solar_field_analysis.sh performs inference for the UK's Solar 1 panels deployment (a sketch of such a script is shown below). You can run it using:
sbatch solar_field_analysis.sh
Note that to run slurm jobs you will need to install miniforge on the scientific nodes.
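A minimal sketch of what a batch script such as solar_field_analysis.sh might contain is shown below. The partition name, resource requests, array size and paths are assumptions and should be adjusted for your JASMIN project and deployment:

#!/bin/bash
#SBATCH --job-name=dep000072_inference
#SBATCH --partition=short-serial   # assumed LOTUS partition; use whichever you have access to
#SBATCH --time=04:00:00
#SBATCH --mem=8G
#SBATCH --array=1-50               # one task per chunk in the workload JSON (50 is a placeholder)

# Activate the conda environment built earlier
source ~/miniforge3/bin/activate
conda activate "$HOME/conda_envs/moth_detector_env/"

# Process one chunk per array task, writing a per-chunk csv so outputs do not overwrite each other
python 04_process_chunks.py \
    --chunk_id "${SLURM_ARRAY_TASK_ID}" \
    --json_file './keys/solar/dep000072_workload_chunks.json' \
    --output_dir './data/solar/dep000072' \
    --bucket_name 'gbr' \
    --credentials_file './credentials.json' \
    --csv_file "dep000072_${SLURM_ARRAY_TASK_ID}.csv" \
    --localisation_model_path ./models/fasterrcnn_resnet50_fpn_tz53qv9v.pt \
    --species_model_path ./models/turing-uk_v03_resnet50_2024-05-13-10-03_state.pt \
    --species_labels ./models/03_uk_data_category_map.json \
    --perform_inference \
    --remove_image \
    --save_crops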
To check the slurm queue:
squeue -u USERNAME