This is the official implementation of AE TextSpotter, which introduces linguistic information to eliminate the ambiguity in text detection. This code is based on MMDetection v1.0rc1.
Python 3.6+
PyTorch 1.1.0
torchvision 0.2.1
pytorch_transformers 1.1.0
mmcv 0.2.13
Polygon3
opencv-python 4.4.0
Please refer to MMDetection v1.0rc1 for installation.
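Since the versions above are pinned quite tightly, it can help to check the environment before building. The following is a minimal sketch, not part of this repository: it assumes Python 3.8+ (for `importlib.metadata`) and that the packages were installed under their usual PyPI distribution names. `PINNED`, `installed_version`, and `check_pins` are hypothetical names introduced for illustration. Polygon3 is omitted because no version is pinned for it.

```python
from importlib import metadata

# Pins copied from the requirements list above.
# The keys are assumed PyPI distribution names, not names defined by this repo.
PINNED = {
    "torch": "1.1.0",
    "torchvision": "0.2.1",
    "pytorch_transformers": "1.1.0",
    "mmcv": "0.2.13",
    "opencv-python": "4.4.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches(found, expected):
    """True if `found` equals the pin or extends it (e.g. 4.4.0.46, 1.1.0+cu100)."""
    return found == expected or found.startswith((expected + ".", expected + "+"))


def check_pins(pinned=PINNED, lookup=installed_version):
    """Return {name: (expected, found)} for every package that misses its pin."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found is None or not matches(found, expected):
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package was found at the expected version. The `lookup` parameter exists only so the check can be exercised without the packages installed.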
Step1: Download the dataset from ICDAR 2019 ReCTS.
Step2: Arrange the folder "data/ReCTS" as follows:
data/ReCTS/
├── train
│ ├── img
│ ├── gt
├── test
│ ├── img
In the folder "data/ReCTS/", the files "TDA_ReCTS_train_list.txt" and "TDA_ReCTS_val_list.txt" are downloaded from TDA-ReCTS. The other json files can be generated by running "python tools/rects_prepare_data.py".
Step3: Download bert-base-chinese.zip and unzip it in the root of this repository.
unzip bert-base-chinese.zip
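Before training, the layout described in Step2 can be checked with a few lines of Python. This is a minimal sketch under the assumption that the folder and list-file names are exactly those shown above. `missing_entries` is a hypothetical helper, not a script shipped with this repository, and it does not check the json files produced by tools/rects_prepare_data.py because their names are not listed here.

```python
from pathlib import Path

# Sub-folders and list files named in the data preparation steps above.
EXPECTED_DIRS = ["train/img", "train/gt", "test/img"]
EXPECTED_FILES = ["TDA_ReCTS_train_list.txt", "TDA_ReCTS_val_list.txt"]


def missing_entries(root="data/ReCTS"):
    """Return the expected folders and files that do not exist under `root`."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    problems = missing_entries()
    if problems:
        print("Missing under data/ReCTS:", ", ".join(problems))
    else:
        print("data/ReCTS layout looks complete.")
```

Run it from the root of the repository so that the relative path "data/ReCTS" resolves correctly.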
Step1:
tools/rects_dist_train.sh local_configs/rects_ae_textspotter_r50_1x.py 8
Step2:
tools/rects_dist_train.sh local_configs/rects_ae_textspotter_lm_r50_1x.py 8
TDA-ReCTS
tools/rects_dist_test.sh local_configs/rects_ae_textspotter_lm_r50_1x.py work_dirs/rects_ae_textspotter_lm_r50_1x/latest.pth 8 --json_out results.json
ICDAR 2019 ReCTS Task 4: End-to-End Text Spotting
tools/rects_dist_test.sh local_configs/rects_ae_textspotter_lm_r50_1x_test.py work_dirs/rects_ae_textspotter_lm_r50_1x/latest.pth 8 --json_out results_test.json
python tools/rects_trans2submit.py
python tools/rects_test.py local_configs/rects_ae_textspotter_lm_r50_1x.py work_dirs/rects_ae_textspotter_lm_r50_1x/latest.pth --show
The training list, validation list, and evaluation script used by this code come from TDA-ReCTS.
python tools/rects_eval.py
The output of the evaluation script should be:
[Best F-Measure] p: 84.94, r: 78.10, f: 81.37, 1-ned: 51.02, best_score_th: 0.569
[Best 1-NED] p: 86.68, r: 76.09, f: 81.04, 1-ned: 51.51, best_score_th: 0.626
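To compare runs programmatically, the two summary lines above can be parsed into numbers. The sketch below assumes the script prints lines in exactly the format shown. `parse_eval_line` and `f_measure` are hypothetical helpers, not part of tools/rects_eval.py. The F-measure is recomputed as the harmonic mean of precision and recall as a sanity check; because the printed values are rounded, it may differ from the printed `f` in the last digit.

```python
import re

# Matches lines such as:
# [Best F-Measure] p: 84.94, r: 78.10, f: 81.37, 1-ned: 51.02, best_score_th: 0.569
LINE_RE = re.compile(
    r"\[(?P<tag>[^\]]+)\]\s*"
    r"p:\s*(?P<p>[\d.]+),\s*r:\s*(?P<r>[\d.]+),\s*f:\s*(?P<f>[\d.]+),\s*"
    r"1-ned:\s*(?P<ned>[\d.]+),\s*best_score_th:\s*(?P<th>[\d.]+)"
)


def parse_eval_line(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    result = {"tag": m.group("tag")}
    for key in ("p", "r", "f", "ned", "th"):
        result[key] = float(m.group(key))
    return result


def f_measure(p, r):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    line = ("[Best F-Measure] p: 84.94, r: 78.10, f: 81.37, "
            "1-ned: 51.02, best_score_th: 0.569")
    parsed = parse_eval_line(line)
    print(parsed)
    print("recomputed f:", round(f_measure(parsed["p"], parsed["r"]), 2))
```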
Method | Precision (%) | Recall (%) | F-measure (%) | 1-NED (%) | Model |
---|---|---|---|---|---|
AE TextSpotter | 84.94 | 78.10 | 81.37 | 51.51 | Google Drive |
AE TextSpotter (Paper) | 84.78 | 78.28 | 81.39 | 51.32 | - |
Method | Precision (%) | Recall (%) | F-measure (%) | 1-NED (%) | Model |
---|---|---|---|---|---|
AE TextSpotter | 93.38 | 89.98 | 91.65 | 71.83 | Same as TDA-ReCTS |
AE TextSpotter (Paper) | 92.60 | 91.01 | 91.80 | 71.81 | - |
This project is released under the Apache 2.0 license.
If you use this work in your research, please cite our paper:
@inproceedings{wenhai2020ae,
title={AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting},
author={Wang, Wenhai and Liu, Xuebo and Ji, Xiaozhong and Xie, Enze and Liang, Ding and Yang, ZhiBo and Lu, Tong and Shen, Chunhua and Luo, Ping},
booktitle={European Conference on Computer Vision (ECCV)},
year={2020}
}
PAN (ICCV 2019): https://github.com/whai362/pan_pp.pytorch
PSENet (CVPR 2019): https://github.com/whai362/PSENet