This repository contains code for the paper "Fine-Grained Predicates Learning for Scene Graph Generation (CVPR 2022)". This code is based on Scene-Graph-Benchmark.pytorch.
News: An extended version of FGPL is provided in Adaptive Fine-Grained Predicates Learning for Scene Graph Generation.
The performance of current Scene Graph Generation (SGG) models is severely hampered by some hard-to-distinguish predicates, e.g., "woman-on/standing on/walking on-beach" or "woman-near/looking at/in front of-child". While general SGG models are prone to predict head predicates and existing re-balancing strategies prefer tail categories, none of them can appropriately handle these hard-to-distinguish predicates. To tackle this issue, inspired by fine-grained image classification, which focuses on differentiating among hard-to-distinguish object classes, we propose a method named Fine-Grained Predicates Learning (FGPL), which aims at differentiating among hard-to-distinguish predicates for the Scene Graph Generation task. Specifically, we first introduce a Predicate Lattice that helps SGG models figure out fine-grained predicate pairs. Then, utilizing the Predicate Lattice, we propose a Category Discriminating Loss and an Entity Discriminating Loss, which both contribute to distinguishing fine-grained predicates while maintaining learned discriminatory power over recognizable ones. The proposed model-agnostic strategy significantly boosts the performance of three benchmark models (Transformer, VCTree, and Motif) by 22.8%, 24.1% and 21.7% of Mean Recall (mR@100) on the Predicate Classification sub-task, respectively. Our model also outperforms state-of-the-art methods by a large margin (i.e., 6.1%, 4.6%, and 3.2% of Mean Recall (mR@100)) on the Visual Genome dataset.
Within our Fine-Grained Predicates Learning (FGPL) framework, shown below, we first construct a Predicate Lattice concerning context information to understand ubiquitous correlations among predicates. Then, utilizing the Predicate Lattice, we develop a Category Discriminating Loss and an Entity Discriminating Loss which help SGG models differentiate hard-to-distinguish predicates.
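As a rough illustration of what "correlations among predicates" in a given context could look like, the toy sketch below counts how often a ground-truth predicate is confused with another predicate for the same subject-object pair. This is not the Predicate Lattice construction from the paper or this repository: the function name `confusable_pairs`, the confusion-rate criterion, the `threshold` value, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def confusable_pairs(samples, threshold=0.5):
    """Return {(context, gt, predicted): confusion_rate} for frequent confusions.

    samples: iterable of (subject, object, gt_predicate, predicted_predicate).
    The confusion-rate criterion and the threshold are illustrative assumptions.
    """
    totals = Counter()                 # occurrences of each (context, gt predicate)
    confusions = defaultdict(Counter)  # wrong predictions per (context, gt predicate)
    for subj, obj, gt, pred in samples:
        context = (subj, obj)
        totals[(context, gt)] += 1
        if pred != gt:
            confusions[(context, gt)][pred] += 1

    pairs = {}
    for (context, gt), wrong in confusions.items():
        for pred, count in wrong.items():
            rate = count / totals[(context, gt)]
            if rate >= threshold:
                pairs[(context, gt, pred)] = rate
    return pairs


# Hypothetical predictions from some pre-trained SGG model (made-up data).
TOY_SAMPLES = [
    ("woman", "beach", "standing on", "on"),
    ("woman", "beach", "standing on", "on"),
    ("woman", "beach", "standing on", "standing on"),
    ("woman", "beach", "walking on", "on"),
    ("woman", "beach", "on", "on"),
    ("woman", "child", "looking at", "near"),
    ("woman", "child", "looking at", "looking at"),
    ("woman", "child", "looking at", "looking at"),
    ("woman", "child", "looking at", "looking at"),
    ("woman", "child", "looking at", "looking at"),
]

if __name__ == "__main__":
    for key, rate in sorted(confusable_pairs(TOY_SAMPLES).items()):
        print(key, round(rate, 3))
```

On this toy data, "standing on" and "walking on" are flagged as frequently confused with "on" in the woman-beach context, while "looking at" versus "near" falls below the assumed threshold.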
All our experiments were conducted on a single NVIDIA GeForce RTX 3090. If you want to run the code on your own device, make sure to follow the distributed training instructions in Scene-Graph-Benchmark.pytorch.
Follow DATASET.md for dataset preprocessing instructions.
Follow the instructions to install and use the code. We also provide scripts for training models with FGPL in scripts/885train_[motif/trans/vctree].sh
(https://github.com/XinyuLyu/FGPL/tree/master/scripts). The key commands in the training script should be set up as follows:
python ./tools/relation_train_net.py \
--config-file "configs/e2e_relation_X_101_32_8_FPN_1x_transformer_FGPL.yaml"/"configs/e2e_relation_X_101_32_8_FPN_1x_motif_FGPL.yaml"/"configs/e2e_relation_X_101_32_8_FPN_1x_vctree_FGPL.yaml" \
MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
MODEL.ROI_RELATION_HEAD.PREDICTOR TransformerPredictor/MotifPredictor/VCTreePredictor \
.
.
.
(This is for FGPL) MODEL.ROI_RELATION_HEAD.USE_EXTRA_LOSS True \
(This is for CDL) MODEL.ROI_RELATION_HEAD.USE_LOGITS_REWEIGHT True \
(These are parameters for CDL) MODEL.ROI_RELATION_HEAD.MITIGATION_FACTOR_HYPER 1.5 \
MODEL.ROI_RELATION_HEAD.COMPENSATION_FACTOR_HYPRT 2.0 \
(This is for EDL) MODEL.ROI_RELATION_HEAD.USE_CONTRA_LOSS True \
(This is for EDL) MODEL.ROI_RELATION_HEAD.USE_CONTRA_BCE True \
(These are parameters for EDL) MODEL.ROI_RELATION_HEAD.CONTRA_DISTANCE_LOSS_VALUE 0.6 \
MODEL.ROI_RELATION_HEAD.CONTRA_DISTANCE_LOSS_COF 0.1 \
MODEL.ROI_RELATION_HEAD.CANDIDATE_NUMBER 5 \
OUTPUT_DIR ./checkpoints/${MODEL_NAME};
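Because the listing above packs three alternatives into one command, it can help to assemble the arguments for a single model programmatically. The following is a minimal sketch under stated assumptions: the helper `build_train_command`, its `extra_overrides` parameter, and the `<model>-FGPL` output directory name are hypothetical and not part of this repository. The config paths, option keys, and values are copied verbatim from the listing. The options elided there are not filled in and must be supplied by the caller.

```python
import shlex

# Config files and predictor names copied from the command listing above.
MODELS = {
    "transformer": (
        "configs/e2e_relation_X_101_32_8_FPN_1x_transformer_FGPL.yaml",
        "TransformerPredictor",
    ),
    "motif": (
        "configs/e2e_relation_X_101_32_8_FPN_1x_motif_FGPL.yaml",
        "MotifPredictor",
    ),
    "vctree": (
        "configs/e2e_relation_X_101_32_8_FPN_1x_vctree_FGPL.yaml",
        "VCTreePredictor",
    ),
}

HEAD = "MODEL.ROI_RELATION_HEAD."

# FGPL-related overrides with the values shown in the listing.
FGPL_OVERRIDES = [
    ("USE_EXTRA_LOSS", "True"),              # FGPL
    ("USE_LOGITS_REWEIGHT", "True"),         # CDL
    ("MITIGATION_FACTOR_HYPER", "1.5"),      # CDL parameter
    ("COMPENSATION_FACTOR_HYPRT", "2.0"),    # CDL parameter (key spelled as in the listing)
    ("USE_CONTRA_LOSS", "True"),             # EDL
    ("USE_CONTRA_BCE", "True"),              # EDL
    ("CONTRA_DISTANCE_LOSS_VALUE", "0.6"),   # EDL parameter
    ("CONTRA_DISTANCE_LOSS_COF", "0.1"),     # EDL parameter
    ("CANDIDATE_NUMBER", "5"),               # EDL parameter
]


def build_train_command(model, extra_overrides=(), output_root="./checkpoints"):
    """Build the argument list for one model ('transformer', 'motif' or 'vctree')."""
    config_file, predictor = MODELS[model]
    cmd = ["python", "./tools/relation_train_net.py", "--config-file", config_file]
    cmd += [HEAD + "USE_GT_BOX", "True"]
    cmd += [HEAD + "USE_GT_OBJECT_LABEL", "True"]
    cmd += [HEAD + "PREDICTOR", predictor]
    cmd += list(extra_overrides)  # options elided in the listing go here
    for key, value in FGPL_OVERRIDES:
        cmd += [HEAD + key, value]
    # Hypothetical directory name standing in for ${MODEL_NAME}.
    cmd += ["OUTPUT_DIR", f"{output_root}/{model}-FGPL"]
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection; nothing is executed.
    print(" ".join(shlex.quote(part) for part in build_train_command("motif")))
```

The sketch only builds and prints the command, so the result can be checked against the provided training scripts before anything is launched.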
The trained models (Transformer-FGPL, Motif-FGPL, VCTree-FGPL) on Predcls/SGCLs/SGDet are released below. We provide test.sh
to directly reproduce the results in our paper. Remember to set MODEL.WEIGHT
to the checkpoints we provide and choose the corresponding dataset split in DATASETS.TEST.
Predcls | SGCLs | SGDet |
---|---|---|
Motif-FGPL-Predcls | Motif-FGPL-SGCLS | Motif-FGPL-SGDet |
Transformer-FGPL-Predcls | Transformer-FGPL-SGCLS | Transformer-FGPL-SGDet |
VCTree-FGPL-Predcls | VCTree-FGPL-SGCLS | VCTree-FGPL-SGDet |
Feel free to contact me (xinyulyu68@gmail.com) if you have any questions!
The code is built on Scene-Graph-Benchmark.pytorch and SGG-G2S. Thanks for their great work!
@inproceedings{sgg:FPGL,
author = {Xinyu Lyu and
Lianli Gao and
Yuyu Guo and
Zhou Zhao and
Hao Huang and
Heng Tao Shen and
Jingkuan Song},
title = {Fine-Grained Predicates Learning for Scene Graph Generation},
booktitle = {CVPR},
year = {2022}
}