This document describes how to finetune the BDD-A model with our data.

The BDD-A model is supposed to run in an nvidia-docker environment. If you can run nvidia-docker on your machine, use `../docker/docker-compose.yml`. Make sure you also have docker-compose installed.
```sh
# Set up the docker env
make setup_bdda
cd docker
# Run the docker
# docker-compose run bdda
# Currently nvidia docker is not supported with docker compose, so we have to run manually
docker build . -t bdda
docker run -it --rm --gpus='"device=0"' -v $(pwd)/../bdda:/tmp -p 8888:8888 bdda bash
```
If nvidia-docker is not available, set up a virtual environment instead:
```sh
make setup_bdda_venv
# Source the venv
source venv/bin/activate
# Install the requirements
pip3 install -r bdda/requirements.txt
```
Since our data are already images, we skip the step of parsing videos into images in the BDD-A finetuning pipeline.
- The BDD-A model requires only digits in the image file names; only one underscore is allowed, and the image format must be jpg.
- The pattern for the image file name is `<sequence_index>_<frame_index>.jpg`, where `<sequence_index>` identifies the sequence and `<frame_index>` identifies the frame within the driving scene.
- To make the data recognizable for the BDD-A model in some intermediate steps of the finetuning, the `<sequence_index>` must not be padded, while the `<frame_index>` must be padded with leading zeros to at least 5 digits.
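The naming convention above can be sketched in Python; the helper name `to_bdda_name` is illustrative and not part of the repository:

```python
def to_bdda_name(sequence_index: int, frame_index: int) -> str:
    """Build a BDD-A-compatible file name: digits only, a single underscore,
    unpadded sequence index, and frame index zero-padded to at least 5 digits."""
    return f"{sequence_index}_{frame_index:05d}.jpg"
```

For example, `to_bdda_name(3, 42)` yields `"3_00042.jpg"`, while a frame index that already has more than 5 digits is left unpadded.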
To ensure the correct naming conventions, run the `prepare.py` script. It copies all images with the right naming conventions to `output_folder/camera_images`. If label JSON files are present, they are converted to gazemap images in `output_folder/gazemap_images`.

The script also creates a `naming.json` file, which is required for reformatting the data back to our naming conventions after running the BDD-A model. If a naming file is provided as an optional argument, the script appends to that file.
```
usage: prepare.py [-h] [-o OUTPUT_DIR] [-s SUFFIX]
                  [--image_topics IMAGE_TOPICS [IMAGE_TOPICS ...]] [-n NAMING]
                  input_dirs [input_dirs ...]

Prepare the input data according to the bdda naming conventions

positional arguments:
  input_dirs            Path to the directories with driving scenarios.

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to the directory where the prepared files will be
                        put. (default: bdda/data)
  -s SUFFIX, --suffix SUFFIX
                        Suffix of the image files. (default: .png)
  --image_topics IMAGE_TOPICS [IMAGE_TOPICS ...]
                        All image topics that should be prepared
                        (default: ['front_left', 'front', 'front_right',
                        'rear_left', 'rear', 'rear_right'])
  -n NAMING, --naming NAMING
                        Naming file required for converting back from the bdda
                        naming convention. This file is generated by this
                        tool. If this file is provided, the naming data will
                        be appended to it. (default: None)
```
Make sure to set up the environment as described above.
This describes how to generate gazemaps for input images.

- Move the prepared input images into `data/inference`.
- Run `./predict.sh <path_to_model>` (if no path is provided, the bdda model will be used). Note: currently this does not work for finetuned models (see this issue); we suggest the workaround below.
- The output will be generated in `<path_to_model>/prediction_iter*`.

Workaround for finetuned models:

- Generate fake gazemaps with `python3 generate_fake_gazemaps.py`.
- Run the model test code.
- The output will be generated in `driver_attention_prediction/logs/<model_name>/prediction_iter*`.
This describes how to finetune the BDD-A model with new data.

- Move the prepared input images and gazemaps into `data/training` and a small subset into `data/validation`.
- Run `./train.sh`.
This describes how to test the finetuned model with test data.

- Move the prepared input images and gazemaps into `data/testing`.
- Run `./test.sh <path_to_model>` (if no path is provided, the bdda model will be used).
To proceed with our pipeline, we rename the data back to our naming conventions. This requires the `naming.json` file generated by the prepare module. Every scenario with all its views will be put into a subfolder of the provided output folder.
```
usage: reformat_gaze_maps.py [-h] [-o OUTPUT_DIR] [-s SUFFIX] input_dir naming

Reformat the predicted gaze maps into the pipeline naming convention

positional arguments:
  input_dir             Path to the predicted gazemaps
  naming                Path to the naming.json

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to the output directory where the gazemaps will
                        be put according to their scenarios (default: bdda)
  -s SUFFIX, --suffix SUFFIX
                        Suffix of the image files. (default: .png)
```