This module shows how to use YOLOv8. Specifically, it shows how to create a COCO dataset, how to train a YOLOv8 model on that dataset, and how to run inference to create cropped outputs.
If you just want to do inference, you can use these weights. To perform inference, see step 3, which refers to the infer module.
There are two pretrained YOLO models:
For stereoscopic images, use: stereoscopic detection model
For full-fundus images, use: full fundus detection model
At a high level, these are the steps:
- Gather your original CFP images and segmentation labels (i.e. cup, disc, background segmentation label .png's)
- Create COCO dataset
- Train YOLO model using COCO dataset
- Run inference with YOLO model to create cropped dataset
- Evaluate YOLO model performance
The docs go in this order.
To make sure everything is clear, the first thing to confirm is that you have your original dataset.
This should be:
- a separate folder of images
- a separate folder of segmentation labels, each with the same file name as its corresponding image
i.e.:
- /path/to/dataset/images/drishti001.png, which is a CFP
- /path/to/dataset/labels_images/drishti001.png, which is an n-channel PNG (assumes background is the red channel, disc is the green channel, cup is the blue channel)
The most important thing is that the disc is channel 1 (green), since we crop around the disc.
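As a quick sanity check, here is a minimal sketch (file path illustrative; assumes the 3-channel RGB label layout above) that confirms the disc mask sits in the green channel:

import numpy as np
from PIL import Image

label = np.array(Image.open("/path/to/dataset/labels_images/drishti001.png"))
print(label.shape)                       # expect (H, W, 3): red=background, green=disc, blue=cup
disc = label[..., 1] > 0                 # disc mask lives in the green channel (index 1)
print("disc pixels:", int(disc.sum()))   # should be non-zero for a valid label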
Once you have your dataset in that layout, you can proceed.
If you have full fundus images but want the model to see images in stereoscopic view, the following script creates a stereoscopic dataset from the full fundus images:
python detection/scripts/preprocess/create_stereoscopic_dataset_from_mask.py \
--image_input_folder /path/to/input \
--label_input_folder /path/to/labels \
--output_image_folder /path/to/output \
--output_label_folder /path/to/labels/output \
--crop_min 0.15 \
--crop_max 0.15 \
--max_offset 0.0
Confirm the outputs are correct with the notebook visualize_stereoscopic_dataset.ipynb
In terms of actual data, a COCO dataset consists of two things:
- A .yaml file describing metadata (paths, class names, etc)
- The actual data, i.e. images and labels, labels being .txt files
Three example .yaml's are included in this repo purely for reference; they are the ones I used for my runs. They won't work for you as-is: you'd need to modify the paths (and the class name if you'd like), but they show the expected format.
The final dataset will look like this:
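(An illustrative layout; file names are examples.)

/path/to/output/images/train/drishti001.png
/path/to/output/images/val/...
/path/to/output/labels/train/drishti001.txt
/path/to/output/labels/val/...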
You can see that the images and labels are in the folders shown above. The key part is the images/train, images/val, labels/train, labels/val structure.
You can read more about how the labels are specified in the YOLOv8 docs, but essentially the first number is the class index (0 here), followed by the x,y of the bounding box center and the width,height of the bounding box, all normalized to the image size. The scripts show this in more detail if you are interested.
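For example, a label .txt for a single disc might contain one line like this (values illustrative):

0 0.512 0.448 0.210 0.195

i.e. class 0, box center at (0.512, 0.448), box width 0.210 and height 0.195, all as fractions of the image size.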
Creating a COCO dataset is actually very easy.
- Prepare your folders of images and segmentation labels, both as .png's. For example, our labels were cup, disc, and background, so they were 3-channel images; the images are of course CFPs.
You should now have:
- /.../dataset/images
- /.../dataset/labels_images
(as already specified above, just confirming)
- Run the Python script create_coco_dataset.py, specifying your labels_images folder and desired output folder
For example:
python detection/coco/coco_dataset/create_coco_dataset.py \
--input_folder /path/to/dataset/labels_images \
--output_folder /path/to/dataset/labels_txt \
--padding 0
This will create the .txt's which you will use along with your original images to train the YOLO model. The paths to these will be specified in a .yaml below
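For intuition, the conversion is roughly the following (a sketch assuming the 3-channel label format above and zero padding, as in the example command; the script itself is the source of truth):

import numpy as np
from PIL import Image

label = np.array(Image.open("/path/to/dataset/labels_images/drishti001.png"))
h, w = label.shape[:2]
ys, xs = np.nonzero(label[..., 1])                  # disc mask lives in the green channel
x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
# YOLO .txt label format: class x_center y_center width height, normalized to [0, 1]
xc, yc = (x0 + x1) / 2 / w, (y0 + y1) / 2 / h
bw, bh = (x1 - x0) / w, (y1 - y0) / h
with open("/path/to/dataset/labels_txt/drishti001.txt", "w") as f:
    f.write(f"0 {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}\n")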
Confirm the output is correct with the notebook visualize_coco_dataset.ipynb
- Split the dataset into train and val
Given you now have
- /path/to/dataset/images
- /path/to/dataset/labels_images
- /path/to/dataset/labels_txt
You now want to split it into train and val sets, and name the folders in COCO format so the YAML and processing scripts can read them properly:
python detection/coco/coco_dataset/create_coco_splits.py \
--images_path /path/to/dataset/images \
--labels_path /path/to/dataset/labels_txt \
--output_path /path/to/output \
--split_ratio 0.85
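The split is roughly equivalent to the following sketch (same illustrative paths as above; use the script itself for the real thing):

import random
import shutil
from pathlib import Path

images = sorted(Path("/path/to/dataset/images").glob("*.png"))
random.shuffle(images)
n_train = int(0.85 * len(images))                   # --split_ratio 0.85
splits = {"train": images[:n_train], "val": images[n_train:]}

out = Path("/path/to/output")
for split, files in splits.items():
    (out / "images" / split).mkdir(parents=True, exist_ok=True)
    (out / "labels" / split).mkdir(parents=True, exist_ok=True)
    for img in files:
        txt = Path("/path/to/dataset/labels_txt") / (img.stem + ".txt")
        shutil.copy(img, out / "images" / split / img.name)
        shutil.copy(txt, out / "labels" / split / txt.name)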
- Create the .yaml
Copy the format specified in the example coco_example.yaml (a minimal sketch is also shown after this list)
Specify:
- path: root path to the data. In the above example it would be /path/to/output
- train: relative path to train images. In the above example it would be images/train
- val: relative path to val images. In the above example it would be images/val
- The labels are found automatically: the library locates them by swapping images for labels in the image paths, i.e. /path/to/output/labels/..., which the create_coco_splits.py script already created for you
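A minimal .yaml under these assumptions might look like this (the class name is illustrative; use coco_example.yaml as the reference):

path: /path/to/output
train: images/train
val: images/val
names:
  0: disc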
You now have:
- a folder of .txt's, which are the COCO dataset labels (bounding boxes around the disc plus padding) corresponding to the segmentation labels in /labels_images and the images in /images
- another folder, which is your COCO train/val dataset, with copied images and .txt files in the appropriate folder locations
- a .yaml which points to the COCO train/val dataset and appropriate paths
You must start from a pretrained model (either ours or an official one), so you will need to download pretrained model weights. You can tweak the code to train from scratch, but starting from pretrained weights is generally recommended.
To download generic model weights, go to official docs: https://github.com/ultralytics/ultralytics?tab=readme-ov-file#models
Otherwise, if you are detecting the optic nerve head in full fundus or stereoscopic images, you can download our model weights: Our Pretrained YOLO detection models
Choose whatever model size you want
python detection/scripts/train/train_yolo.py \
--data path/to/coco_yaml_.yaml \
--model path/to/pretrained/model.pt \
--epochs 200 \
--imgsz 640 \
--lr0 0.000005 \
--patience 10
Even though no output path is specified, YOLO creates a folder called ./runs/detect/train...
and saves everything there: results, checkpoints, and the final model weights.
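For reference, the training script is roughly equivalent to the following Ultralytics API calls (a sketch, not the exact script):

from ultralytics import YOLO

model = YOLO("path/to/pretrained/model.pt")        # start from pretrained weights
model.train(
    data="path/to/coco_yaml_.yaml",                # the .yaml created above
    epochs=200,
    imgsz=640,
    lr0=5e-6,
    patience=10,
)                                                  # outputs land in ./runs/detect/train*/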
To see some examples of the YOLO output from images in the dataset, refer to the notebook visualize_yolo_infer.ipynb
You now have:
- Trained YOLO model weights at the path
./runs/detect/train1/weights/best.pt
To run inference, you need the images you want to run on, and the trained YOLO model:
- /path/to/inference/images - the folder of images you want to run inference on
- /runs/detect/train1/weights/best.pt - the path to the trained YOLO model, which crops according to how it was trained on the COCO dataset
Once you have that (and you should, if you have followed the instructions so far), just run the create_yolocropped_dataset_multiprocess.py script:
python detection/scripts/infer/create_yolocropped_dataset_multiprocess.py \
--root_directory /path/to/input/images \
--output_directory /path/to/output \
--model_path /path/to/model \
--threshold 0.875 \
--output_img_size 512 \
--batch_size 16 \
--num_processes 64
That will take the images in /path/to/inference/images and run inference on all of them in parallel using the best.pt weights of the YOLO model.
Images are cropped and saved only if the detection confidence was above the threshold (meaning it was a high-quality detection and not a spurious one), then resized to your desired output size.
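For a single image, the cropping logic is roughly the following (a sketch using the Ultralytics API; paths are illustrative and the multiprocess script is the source of truth):

from ultralytics import YOLO
from PIL import Image

model = YOLO("./runs/detect/train1/weights/best.pt")
img = Image.open("/path/to/inference/images/example.png")
result = model(img)[0]                              # run detection on one image
if len(result.boxes) > 0:
    best = int(result.boxes.conf.argmax())          # highest-confidence detection
    if float(result.boxes.conf[best]) >= 0.875:     # --threshold
        x0, y0, x1, y1 = map(int, result.boxes.xyxy[best].tolist())
        crop = img.crop((x0, y0, x1, y1)).resize((512, 512))   # --output_img_size
        crop.save("/path/to/output/example.png")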
You now have
- A folder of cropped images at the output_directory you specified, produced by the trained YOLO model from the images in root_directory
To convert single-channel grayscale label .png's into the multi-channel format used above, run:
python detection/scripts/convert_labels/convert_grayscale_labels_multichannel.py \
--input_folder /path/to/labels/ \
--output_folder /path/to/outputs
To collect reporting metrics, run YOLO's val method:
python detection/scripts/evaluate/val_yolo.py \
--data /path/to/yaml.yaml \
--model /path/to/model.pt \
--imgsz 640 \
--batch_size 16 \
--device 0
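This is roughly equivalent to the following Ultralytics API calls (a sketch; the printed attributes come from the returned metrics object):

from ultralytics import YOLO

model = YOLO("/path/to/model.pt")
metrics = model.val(data="/path/to/yaml.yaml", imgsz=640, batch=16, device=0)
print("mAP50:", metrics.box.map50)       # mean AP at IoU 0.50
print("mAP50-95:", metrics.box.map)      # mean AP averaged over IoU 0.50-0.95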