Berkeley Deep Drive Attention

This document describes how to finetune the BDD-A model on our data.

Set Up the Environment

The BDD-A model is intended to run in an nvidia-docker environment. If nvidia-docker is available on your machine, use ../docker/docker-compose.yml. Make sure docker-compose is installed as well.

# Set up the Docker environment
make setup_bdda
cd docker

# Run the container
# docker-compose run bdda
# nvidia-docker is currently not supported by docker-compose, so build and run manually
docker build . -t bdda
docker run -it --rm --gpus='"device=0"' -v "$(pwd)/../bdda:/tmp" -p 8888:8888 bdda bash

If nvidia-docker is not available, set up a virtual environment instead.

make setup_bdda_venv

# Source the venv
source venv/bin/activate

# Install the requirements
pip3 install -r bdda/requirements.txt

Preparing the Data

Since our data already consists of images, we skip the step of the BDD-A finetuning pipeline that parses videos into images.

Naming Convention

  1. Apart from the .jpg extension, image file names may contain only digits and a single underscore. The image format must be jpg.
  2. The file name pattern is <sequence_index>_<frame_index>.jpg, where <sequence_index> identifies the sequence and <frame_index> identifies the frame within that driving scene.
  3. So that the BDD-A model can recognize the data in some intermediate finetuning steps, <sequence_index> must not be padded, while <frame_index> must be left-padded with zeros to at least 5 digits.
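
The three rules can be condensed into a small helper. This is only an illustrative sketch, not code from the BDD-A repository: the names bdda_name, is_bdda_name and BDDA_NAME are hypothetical, and it assumes that "not padded" means no leading zeros and that a sequence index of 0 is acceptable.

```python
import re

# Hypothetical helpers, not part of the BDD-A code base.
# Unpadded sequence index, one underscore, frame index of at least 5 digits, .jpg
BDDA_NAME = re.compile(r"(0|[1-9][0-9]*)_([0-9]{5,})\.jpg")


def bdda_name(sequence_index: int, frame_index: int) -> str:
    """Build '<sequence_index>_<frame_index>.jpg' with the frame index zero-padded to 5 digits."""
    if sequence_index < 0 or frame_index < 0:
        raise ValueError("indices must be non-negative")
    return f"{sequence_index}_{frame_index:05d}.jpg"


def is_bdda_name(file_name: str) -> bool:
    """Return True if the whole file name follows the convention."""
    return BDDA_NAME.fullmatch(file_name) is not None
```

A check like is_bdda_name can be run over a prepared folder before starting the model to catch names that would be skipped or misread.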

Prepare

To apply the naming convention, run the prepare.py script. It copies all images, renamed according to the convention, to output_folder/camera_images. If label JSON files are present, they are converted to gazemap images in output_folder/gazemap_images.

The script also creates a naming.json file, which is required to convert the data back to our naming conventions after running the BDD-A model. If an existing naming file is passed as an optional argument, the script appends to that file.

usage: prepare.py [-h] [-o OUTPUT_DIR] [-s SUFFIX]
                  [--image_topics IMAGE_TOPICS [IMAGE_TOPICS ...]] [-n NAMING]
                  input_dirs [input_dirs ...]

Prepare the input data according to the bdda naming conventions

positional arguments:
  input_dirs            Path to the directories with driving scenarios.

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to the directory where the prepared files will be
                        put. (default: bdda/data)
  -s SUFFIX, --suffix SUFFIX
                        Suffix of the image files. (default: .png)
  --image_topics IMAGE_TOPICS [IMAGE_TOPICS ...]
                        All image topics that should be prepared
                        (default: ['front_left', 'front', 'front_right',
                        'rear_left', 'rear', 'rear_right'])
  -n NAMING, --naming NAMING
                        Naming file required for converting back from the bdda
                        naming convention. This file is generated by this
                        tool. If this file is provided, the naming data will be
                        appended to this file. (default: None)
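
As an illustration of how the options above combine, the following sketch assembles a prepare.py call from Python. The helper names and the example paths are hypothetical, and it assumes the call is made from the directory that contains prepare.py. Only flags listed in the usage text are used.

```python
import subprocess
import sys


def build_prepare_command(input_dirs, output_dir=None, suffix=None,
                          naming=None, image_topics=None):
    """Assemble the argument list for prepare.py (hypothetical helper)."""
    if not input_dirs:
        raise ValueError("at least one input directory is required")
    # Positional directories come first so that the variable-length
    # --image_topics list cannot absorb them.
    command = [sys.executable, "prepare.py", *input_dirs]
    if output_dir is not None:
        command += ["-o", output_dir]
    if suffix is not None:
        command += ["-s", suffix]
    if naming is not None:
        command += ["-n", naming]
    if image_topics:
        command += ["--image_topics", *image_topics]
    return command


def run_prepare(command):
    """Run the assembled command and raise if prepare.py exits with an error."""
    subprocess.run(command, check=True)


# Example paths are placeholders, not paths from the repository.
example = build_prepare_command(["recordings/scenario_a"],
                                output_dir="bdda/data",
                                image_topics=["front", "rear"])
print(" ".join(example[1:]))
```

Because --image_topics accepts one or more values, placing it directly before the input directories on the command line would cause the directories to be read as topics. Listing the directories first avoids this.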

Using the Model

Make sure to set up the environment as described above.

Prediction

This section describes how to generate gazemaps for input images.

  1. Move the prepared input images into data/inference.
  2. Run ./predict.sh <path_to_model> (if no path is provided, the bdda model is used). Note: this currently does not work for finetuned models (see this issue). For those, use the Prediction Workaround below.
  3. The output is written to <path_to_model>/prediction_iter*.

Prediction Workaround

  1. Generate fake gazemaps with python3 generate_fake_gazemaps.py.
  2. Run the model test code.
  3. The output will be generated in driver_attention_prediction/logs/<model_name>/prediction_iter*.

Training

This section describes how to finetune the BDD-A model with new data.

  1. Move the prepared input images and gazemaps into data/training and a small subset into data/validation.
  2. Run ./train.sh.
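
One possible way to choose the validation subset in step 1 is to hold out whole sequences, so that neighbouring frames of one driving scene do not end up on both sides. The sketch below rests on assumptions this document does not confirm: that validation data should be disjoint from training data, that a fraction of about 10% is appropriate, and that file names follow the <sequence_index>_<frame_index>.jpg pattern. The function name is hypothetical. It only returns name lists and leaves the actual file moves to the user.

```python
import random
from collections import defaultdict


def split_by_sequence(file_names, validation_fraction=0.1, seed=0):
    """Split file names into (training, validation) lists, keeping sequences intact."""
    by_sequence = defaultdict(list)
    for name in file_names:
        # The sequence index is everything before the single underscore.
        sequence_index = name.split("_", 1)[0]
        by_sequence[sequence_index].append(name)

    sequences = sorted(by_sequence)
    random.Random(seed).shuffle(sequences)  # seeded for a reproducible split

    # Hold out at least one sequence, unless there is only one sequence in total.
    n_val = 0
    if len(sequences) > 1:
        n_val = max(1, round(len(sequences) * validation_fraction))

    validation = sorted(n for s in sequences[:n_val] for n in by_sequence[s])
    training = sorted(n for s in sequences[n_val:] for n in by_sequence[s])
    return training, validation
```

The same split should be applied to the camera images and to their gazemaps so that every image keeps its label.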

Test

This section describes how to test the finetuned model with test data.

  1. Move the prepared input images and gazemaps into data/testing.
  2. Run ./test.sh <path_to_model> (if no path is provided the bdda model will be used).

Reformat

To continue with our pipeline, the data must be renamed back to our naming conventions. This requires the naming.json file generated by the prepare step, which is used to restore the correct names. Each scenario, with all of its views, is placed in its own subfolder of the provided output folder.

usage: reformat_gaze_maps.py [-h] [-o OUTPUT_DIR] [-s SUFFIX] input_dir naming

Reformat the predicted gaze maps into the pipeline naming convention

positional arguments:
  input_dir             Path to the predicted gazemaps
  naming                Path to the naming.json

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to the output directory where the gazemaps will
                        be put according to its scenarios (default: bdda)
  -s SUFFIX, --suffix SUFFIX
                        Suffix of the image files. (default: .png)
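
The structure of naming.json is not described in this document, so the lookup itself is not sketched here. The sketch below only shows how the sequence and frame index might be recovered from a predicted file name before such a lookup. It assumes that predicted gazemaps keep the <sequence_index>_<frame_index> stem, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Tuple


def parse_bdda_name(file_name: str) -> Tuple[int, int]:
    """Return (sequence_index, frame_index) from a BDD-A style file name."""
    stem = Path(file_name).stem
    # Unpacking raises ValueError unless the stem contains exactly one underscore.
    sequence_part, frame_part = stem.split("_")
    if not (sequence_part.isdigit() and frame_part.isdigit()):
        raise ValueError(f"not a BDD-A style name: {file_name}")
    return int(sequence_part), int(frame_part)
```

For example, parse_bdda_name("3_00042.jpg") yields the pair (3, 42), which could then serve as a key into whatever mapping naming.json provides.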