The following is adapted from Scene-Graph-Benchmark.
- Download the VG images part1 (9 GB) and part2 (5 GB). Extract these images to the directory `datasets/vg/VG_100K`. If you want to use another directory, set it in `DATASETS['VG_stanford_filtered']['img_dir']` of `maskrcnn_benchmark/config/paths_catalog.py`.
- Download the scene graphs and extract them to `datasets/vg/VG-SGG-with-attri.h5`, or edit the path in `DATASETS['VG_stanford_filtered_with_attribute']['roidb_file']` of `maskrcnn_benchmark/config/paths_catalog.py`.
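
The path overrides described above can be sketched as follows. This is a minimal illustration, not the actual contents of `paths_catalog.py`: the dictionary below is a stand-in that contains only the keys named in the text, the helper `override_path` and the path `/data/vg/VG_100K` are hypothetical, and whether the real file stores paths relative to a data root is an assumption left open here.

```python
import os

# Stand-in for the DATASETS dictionary in
# maskrcnn_benchmark/config/paths_catalog.py (assumption: only the keys
# mentioned in the text are shown; the real file may hold more entries
# and may store paths relative to a data root).
DATASETS = {
    "VG_stanford_filtered": {
        "img_dir": "datasets/vg/VG_100K",
    },
    "VG_stanford_filtered_with_attribute": {
        "roidb_file": "datasets/vg/VG-SGG-with-attri.h5",
    },
}


def override_path(datasets, name, key, new_path):
    """Hypothetical helper: return a copy of `datasets` with one path replaced."""
    updated = {k: dict(v) for k, v in datasets.items()}
    updated[name][key] = os.path.expanduser(new_path)
    return updated


# Example: point the VG image directory at a custom location.
custom = override_path(DATASETS, "VG_stanford_filtered", "img_dir", "/data/vg/VG_100K")
print(custom["VG_stanford_filtered"]["img_dir"])
```

In practice you would edit the corresponding entries in `paths_catalog.py` directly; the helper only makes explicit which keys are affected.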
- Download the GQA images Full (20.3 GB). Extract these images to the directory `datasets/gqa/images`. If you want to use another directory, set it in `DATASETS['VG_stanford_filtered']['img_dir']` of `maskrcnn_benchmark/config/paths_catalog.py`.
- To obtain a representative split like VG150, we manually cleaned up a substantial fraction of annotations that are of poor quality or ambiguous in meaning, and then selected the Top-200 object classes and the Top-100 predicate classes by frequency, thus establishing the GQA200 split. You can download the annotation files from this link, and put all three files in `datasets/gqa/`.
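
Once everything is downloaded, a quick check of the default layout can catch misplaced files early. The sketch below assumes the default locations listed above and is run from the repository root; the function name `missing_paths` is hypothetical, and the three GQA200 annotation files are not checked because their names are not given here.

```python
from pathlib import Path

# Default locations named in the steps above. The GQA200 annotation
# files are omitted because their file names are not specified.
EXPECTED = [
    "datasets/vg/VG_100K",
    "datasets/vg/VG-SGG-with-attri.h5",
    "datasets/gqa/images",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


# Report anything that is not where the default configuration expects it.
for path in missing_paths():
    print(f"missing: {path}")
```

If you changed any path in `paths_catalog.py`, adjust `EXPECTED` to match.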