Dataset preparation

MOT 2017 Dataset

MOT is the main dataset for training and evaluating the TraDeS tracker.

We have packed the dataset preprocessing code as a script:

    cd $TraDeS_ROOT/src/tools/
    sh get_mot_17.sh

The script includes:

  • Download and unzip the dataset from the MOT17 website.

  • Convert it into COCO format using tools/convert_mot_to_coco.py.

  • Create the half-half train/val split described in the paper (a minimal sketch of the split idea follows the directory tree below).

  • The output data structure should be:

    ${TraDeS_ROOT}
    |-- data
    `-- |-- mot17
        `-- |--- train
            |   |--- MOT17-02-FRCNN
            |   |    |--- img1
            |   |    |--- gt
            |   |    |   |--- gt.txt
            |   |    |   |--- gt_train_half.txt
            |   |    |   |--- gt_val_half.txt
            |   |    |--- det
            |   |    |   |--- det.txt
            |   |    |   |--- det_train_half.txt
            |   |    |   |--- det_val_half.txt
            |   |--- ...
            |--- test
            |   |--- MOT17-01-FRCNN
            |   |--- ...
            `--- annotations
                |--- train_half.json
                |--- val_half.json
                |--- train.json
                `--- test.json
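
The half-half split keeps the two halves of each training sequence apart: the first half of each sequence's frames is used for training and the second half for validation, which is what produces the gt_train_half.txt and gt_val_half.txt files shown above. Below is a minimal Python sketch of that idea for a single sequence's gt.txt; the real logic lives in tools/convert_mot_to_coco.py, and the sequence length used here (600 frames) is only a placeholder.

    # Minimal sketch of the half-half split for one sequence (illustration only).
    # The actual split is produced by tools/convert_mot_to_coco.py.
    seq_len = 600                 # placeholder; each MOT17 sequence has its own length
    half = seq_len // 2

    with open("gt/gt.txt") as src, \
         open("gt/gt_train_half.txt", "w") as train_out, \
         open("gt/gt_val_half.txt", "w") as val_out:
        for line in src:
            frame_id = int(line.split(",")[0])   # first MOT column is the frame index
            (train_out if frame_id <= half else val_out).write(line)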
    

nuScenes Dataset

nuScenes is used for training and evaluating 3D object tracking.

  • Download the dataset from the nuScenes website. You only need the "Keyframe blobs" (image data only), plus the maps and all metadata, which the nuScenes API requires. On the current version of the website, the data is under Full dataset (v1.0) -- Trainval: download "Metadata" and "Keyframe blobs only for part * [US]". The estimated download size is around 60 GB or less.

  • Unzip, rename, and place (or symlink) the data as below. You will need to merge folders from different zip files.

    ${TraDeS_ROOT}
    |-- data
    `-- |-- nuscenes
        `-- |-- v1.0-trainval
            |   |-- samples
            |   |   |-- CAM_BACK
            |   |   |   |-- xxx.jpg
            |   |   |-- CAM_BACK_LEFT
            |   |   |-- CAM_BACK_RIGHT
            |   |   |-- CAM_FRONT
            |   |   |-- CAM_FRONT_LEFT
            |   |   |-- CAM_FRONT_RIGHT
            |   |-- maps
            |   `-- v1.0-trainval_meta
            `-- annotations
                |-- train.json
                |-- val.json
                `-- test.json
    
    
  • Run python tools/convert_nuScenes.py to convert the annotations into COCO format. It will create train.json, val.json, and test.json under data/nuscenes/annotations. The nuScenes API is required to run this preprocessing step.
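
After the conversion, a quick sanity check can confirm that the generated files load as valid COCO-format annotations. This is a minimal sketch, not part of the TraDeS tools; it assumes pycocotools is installed and that the files were written to data/nuscenes/annotations as described above.

    # Sanity-check the converted nuScenes annotations (illustration only).
    from pycocotools.coco import COCO

    ann_file = "data/nuscenes/annotations/val.json"    # path from the layout above
    coco = COCO(ann_file)                              # parses the COCO-format json
    print("images:     ", len(coco.getImgIds()))
    print("annotations:", len(coco.getAnnIds()))
    print("categories: ", [c["name"] for c in coco.loadCats(coco.getCatIds())])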

CrowdHuman Dataset

CrowdHuman is used for pretraining the MOT model. Only the training set is used.

  • Download the dataset from its website.

  • Unzip and place (or symlink) the data as below. You will need to merge folders from different zip files.

    ${TraDeS_ROOT}
    |-- data
    `-- |-- crowdhuman
        |-- |-- CrowdHuman_train
        |   |   |-- Images
        |-- |-- CrowdHuman_val
        |   |   |-- Images
        |-- |-- annotation_train.odgt
        |-- |-- annotation_val.odgt
    
  • Run python tools/convert_crowdhuman_to_coco.py to convert the annotations into COCO format. It will create train.json and val.json under data/crowdhuman/annotations.
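
Each .odgt file stores one JSON record per line (one record per image), with an "ID" field and a "gtboxes" list; convert_crowdhuman_to_coco.py reads these records and turns them into COCO annotations. The snippet below is a minimal sketch of reading such a file, not the conversion itself; the field names follow the CrowdHuman annotation format.

    # Minimal sketch of reading a CrowdHuman .odgt file (one JSON record per line).
    import json

    def load_odgt(path):
        with open(path) as f:
            return [json.loads(line) for line in f]

    records = load_odgt("data/crowdhuman/annotation_train.odgt")
    print("images:", len(records))
    # Each ground-truth box carries a full-body box "fbox" as [x, y, w, h].
    print("boxes in first image:", [box["fbox"] for box in records[0]["gtboxes"]])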

Youtube-VIS Dataset

Youtube-VIS is used for training and evaluating video instance segmentation.

  • COCO pretrained model: coco_seg.pth

  • The data structure should be:

${TraDeS_ROOT}
|-- data
`-- |-- youtube_vis
    |-- |-- train/
    |-- |-- val/
    |-- |-- annotations/
    |-- |-- |-- train.json
    |-- |-- |-- val.json
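
Before training, it can help to verify that the layout above is in place. This is a minimal sketch under the assumptions of the directory tree shown; the paths are relative to ${TraDeS_ROOT}.

    # Check that the expected Youtube-VIS layout exists (illustration only).
    import os

    root = "data/youtube_vis"
    expected = ["train", "val", "annotations/train.json", "annotations/val.json"]
    for rel in expected:
        path = os.path.join(root, rel)
        status = "ok" if os.path.exists(path) else "MISSING"
        print(f"{status:8s} {path}")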

References

Please cite the corresponding references if you use the datasets.

  @article{MOT16,
    title = {{MOT}16: {A} Benchmark for Multi-Object Tracking},
    shorttitle = {MOT16},
    url = {http://arxiv.org/abs/1603.00831},
    journal = {arXiv:1603.00831 [cs]},
    author = {Milan, A. and Leal-Taix\'{e}, L. and Reid, I. and Roth, S. and Schindler, K.},
    month = mar,
    year = {2016},
    note = {arXiv: 1603.00831},
    keywords = {Computer Science - Computer Vision and Pattern Recognition}
  }

  @article{shao2018crowdhuman,
    title={Crowdhuman: A benchmark for detecting human in a crowd},
    author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
    journal={arXiv:1805.00123},
    year={2018}
  }

  @inproceedings{nuscenes2019,
    title={{nuScenes}: A multimodal dataset for autonomous driving},
    author={Holger Caesar and Varun Bankiti and Alex H. Lang and Sourabh Vora and Venice Erin Liong and Qiang Xu and Anush Krishnan and Yu Pan and Giancarlo Baldan and Oscar Beijbom},
    booktitle={CVPR},
    year={2020}
  }

  @inproceedings{yang2019video,
    title={Video instance segmentation},
    author={Yang, Linjie and Fan, Yuchen and Xu, Ning},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
    pages={5188--5197},
    year={2019}
  }