DATASET

The following is adapted from Scene-Graph-Benchmark.

Download the VG images part1 (9 GB) and part2 (5 GB), and extract them to the directory datasets/vg/VG_100K. To use a different directory, set it in DATASETS['VG_stanford_filtered']['img_dir'] of maskrcnn_benchmark/config/paths_catalog.py.
Download the scene graphs and extract them to datasets/vg/VG-SGG-with-attri.h5. To store the file elsewhere, edit the path in DATASETS['VG_stanford_filtered_with_attribute']['roidb_file'] of maskrcnn_benchmark/config/paths_catalog.py.
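As a rough illustration of the two overrides described above, the sketch below models only the two DATASETS entries named in this section. The dict is a stand-in: the real DATASETS in paths_catalog.py may hold more keys or resolve paths relative to a data root. The helper with_custom_vg_paths and the path /data/visual_genome/VG_100K are hypothetical, introduced only for this example.

```python
import copy

# Illustrative stand-in for the two entries named above; the real DATASETS
# dict in paths_catalog.py may contain more keys or resolve paths differently.
DATASETS = {
    "VG_stanford_filtered": {
        "img_dir": "datasets/vg/VG_100K",
    },
    "VG_stanford_filtered_with_attribute": {
        "roidb_file": "datasets/vg/VG-SGG-with-attri.h5",
    },
}


def with_custom_vg_paths(datasets, img_dir=None, roidb_file=None):
    """Return a copy of `datasets` with the VG paths replaced where given.

    Hypothetical helper: it is not part of the repository.
    """
    updated = copy.deepcopy(datasets)
    if img_dir is not None:
        updated["VG_stanford_filtered"]["img_dir"] = img_dir
    if roidb_file is not None:
        updated["VG_stanford_filtered_with_attribute"]["roidb_file"] = roidb_file
    return updated


# Example: point the image directory at a hypothetical external location.
custom = with_custom_vg_paths(DATASETS, img_dir="/data/visual_genome/VG_100K")
print(custom["VG_stanford_filtered"]["img_dir"])
```

In practice the same effect is obtained by editing the two values directly in paths_catalog.py; the copy-based helper only makes the change explicit without touching the defaults.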

Download the GQA images Full (20.3 GB) and extract them to the directory datasets/gqa/images. To use a different directory, set it in DATASETS['VG_stanford_filtered']['img_dir'] of maskrcnn_benchmark/config/paths_catalog.py.
To obtain a representative split like VG150, we manually cleaned up a substantial fraction of the annotations that were of poor quality or had ambiguous meanings, and then selected the Top-200 object classes and the Top-100 predicate classes by frequency, thus establishing the GQA200 split. You can download the annotation files from this link; put all three files in datasets/gqa/.
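Before training, it can help to confirm that everything landed in the default locations given in this section. The following is a minimal sketch, assuming the default paths are used unchanged and the script is run from the repository root. The function missing_dataset_paths is a hypothetical helper, not part of the repository, and it does not check the names of the three GQA annotation files, since they are not listed here.

```python
from pathlib import Path

# Default locations named in this section, relative to the repository root.
# "dir" entries must be directories; "file" entries must be regular files.
EXPECTED = {
    "VG images": ("datasets/vg/VG_100K", "dir"),
    "VG scene graphs": ("datasets/vg/VG-SGG-with-attri.h5", "file"),
    "GQA images": ("datasets/gqa/images", "dir"),
    "GQA annotation directory": ("datasets/gqa", "dir"),
}


def missing_dataset_paths(root="."):
    """Return a description of every expected path that is absent under `root`."""
    root = Path(root)
    missing = []
    for label, (relative, kind) in EXPECTED.items():
        path = root / relative
        present = path.is_dir() if kind == "dir" else path.is_file()
        if not present:
            missing.append(f"{label}: {path}")
    return missing


if __name__ == "__main__":
    problems = missing_dataset_paths()
    if problems:
        for entry in problems:
            print("missing", entry)
    else:
        print("all expected dataset paths found")
```

If the paths in paths_catalog.py were changed, the entries in EXPECTED would need to be adjusted to match.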
