TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection (AAAI 2024 Paper)

by Hao Sun*¹, Mingyao Zhou*¹, Wenjing Chen†², Wei Xie†¹

¹ Central China Normal University, ² Hubei University of Technology, * Equal contribution, † Corresponding authors.

Prerequisites

0. Clone this repository

git clone https://github.com/mingyao1120/TR-DETR.git
cd TR-DETR

1. Prepare datasets

If any dataset link becomes invalid, you can refer to Hugging Face for alternative resources.

QVHighlights

Download the official feature files for the QVHighlights dataset from Moment-DETR.

Download moment_detr_features.tar.gz (8GB) and extract it under the ../features directory.
You can modify the data directory by changing the feat_root parameter in the shell scripts located in the tr_detr/scripts/ directory.

tar -xf path/to/moment_detr_features.tar.gz
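If you prefer to do the extraction from Python, the step above can be sketched as follows. This is a minimal sketch, not part of the repository: the function name extract_features and the default target ../features are illustrative assumptions, and the archive's internal layout is not checked.

```python
import tarfile
from pathlib import Path


def extract_features(archive_path, dest_dir="../features"):
    """Extract a feature archive into dest_dir and return its top-level entry names.

    Hypothetical helper for illustration; only use it on archives you trust.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target directory if needed
    with tarfile.open(str(archive_path), "r:*") as tar:  # "r:*" auto-detects compression
        # Collect the first path component of every member, e.g. "features"
        top_level = sorted({member.name.split("/")[0] for member in tar.getmembers()})
        tar.extractall(str(dest))
    return top_level
```

Calling extract_features("path/to/moment_detr_features.tar.gz") returns the top-level names in the archive, which helps confirm where the features ended up before you set feat_root.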

TVSum

Download the feature files for the TVSum dataset from UMT.

Download TVSum (69.1MB) and either extract it under the ../features/tvsum/ directory or modify the feat_root parameter in the TVSum shell scripts located in the tr_detr/scripts/tvsum/ directory.
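Before editing feat_root, it can help to see what each script currently sets. The sketch below scans a scripts directory for that setting. It assumes feat_root is assigned with a plain shell assignment such as feat_root=../features, which this README does not confirm; find_feat_root is a hypothetical helper, not a function from the repository.

```python
import re
from pathlib import Path

# Assumption: each script sets the value with a line like `feat_root=../features`.
FEAT_ROOT_PATTERN = re.compile(r"^\s*feat_root=(\S+)")


def find_feat_root(scripts_dir):
    """Return {script path: feat_root value} for every .sh file that assigns feat_root."""
    found = {}
    for script in sorted(Path(scripts_dir).rglob("*.sh")):
        for line in script.read_text().splitlines():
            match = FEAT_ROOT_PATTERN.match(line)
            if match:
                found[str(script)] = match.group(1)
                break  # keep only the first assignment in each script
    return found
```

For example, find_feat_root("tr_detr/scripts") lists the value for the QVHighlights and TVSum scripts together, since the search descends into subdirectories.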

2. Install dependencies

Python version 3.7 is required. Install dependencies using:

pip install -r requirements.txt

Note: requirements.txt currently lists some libraries that may not be needed; these will be removed in a future update. For an Anaconda setup, refer to the official Moment-DETR GitHub repository.
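Since Python 3.7 is required, a quick interpreter check before installing can save a failed setup. The following is a minimal sketch; the helper name is_required_python and the constant REQUIRED are illustrative, not part of the repository.

```python
import sys

REQUIRED = (3, 7)  # the Python version this README asks for


def is_required_python(version_info=None):
    """Return True when the major.minor version matches REQUIRED.

    Defaults to the running interpreter when no version is passed.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == REQUIRED
```

Running is_required_python() inside the environment you plan to use tells you whether it matches before you run pip install.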

QVHighlights

Training

You can train the model using only video features or both video and audio features:

bash tr_detr/scripts/train.sh   # Only video
bash tr_detr/scripts/train_audio.sh   # Video + audio

The best validation accuracy is achieved at the last epoch.

Inference, Evaluation, and CodaLab Submission

After training, you can generate hl_val_submission.jsonl and hl_test_submission.jsonl for validation and test sets by running:

bash tr_detr/scripts/inference.sh results/{direc}/model_best.ckpt 'val'
bash tr_detr/scripts/inference.sh results/{direc}/model_best.ckpt 'test'

Replace {direc} with the name of the directory under results/ that contains your saved checkpoint. For more details on submission, see standalone_eval/README.md.
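If you have several training runs, the value for {direc} can be looked up with a short script. This sketch assumes the layout results/{direc}/model_best.ckpt shown in the commands above; find_checkpoints is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def find_checkpoints(results_root="results", name="model_best.ckpt"):
    """Return checkpoint paths found one level below results_root, newest first."""
    candidates = Path(results_root).glob("*/" + name)
    # Sort by modification time so the most recent run comes first
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
```

The first element of find_checkpoints() is the most recently written checkpoint, and its parent directory name is the value to substitute for {direc}.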

TVSum

Training

Similar to QVHighlights, you can train the model on the TVSum dataset:

bash tr_detr/scripts/tvsum/train_tvsum.sh   # Only video
bash tr_detr/scripts/tvsum/train_tvsum_audio.sh   # Video + audio

The best results are saved in results_[domain_name]/best_metric.jsonl.
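To compare domains after training, the per-domain files can be gathered in one pass. The sketch below assumes each best_metric.jsonl is a JSON Lines file (suggested by the extension, not confirmed here) and that the results_[domain_name] directories sit directly under the directory you pass in. It makes no assumption about which metric fields the records contain. Both function names are hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of records, skipping blank lines."""
    with open(str(path), "r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def collect_best_metrics(root="."):
    """Map each domain name to the records in results_<domain>/best_metric.jsonl."""
    metrics = {}
    for path in sorted(Path(root).glob("results_*/best_metric.jsonl")):
        domain = path.parent.name[len("results_"):]  # strip the "results_" prefix
        metrics[domain] = load_jsonl(path)
    return metrics
```

Calling collect_best_metrics() from the directory that holds the results_* folders returns a dictionary keyed by domain name, which you can print or post-process as needed.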

Citation

If you find this repository useful, please cite our work:

@inproceedings{sun_zhou2024tr,
  title={{TR-DETR}: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection},
  author={Sun, Hao and Zhou, Mingyao and Chen, Wenjing and Xie, Wei},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={5},
  pages={4998--5007},
  year={2024}
}

License

The annotation files and parts of the implementation are borrowed from Moment-DETR and QD-DETR. Consequently, our code is also released under the MIT License.
