feat: Benchmark of crnn_mobilenet_v3 small #523

Merged
merged 8 commits into from
Oct 4, 2021
42 changes: 23 additions & 19 deletions docs/source/using_models.rst
@@ -100,8 +100,8 @@ For a comprehensive comparison, we have compiled a detailed benchmark on publicl
* - crnn_mobilenet_v3_small
- (32, 128, 3)
- 2.1M
--
--
+- 86.21
+- 90.56
-
* - crnn_mobilenet_v3_large
- (32, 128, 3)
@@ -171,6 +171,8 @@ For a comprehensive comparison, we have compiled a detailed benchmark on publicl
+----------------------------------------+------------+---------------+---------+------------+---------------+---------+
| db_resnet50 + sar_resnet31 | 71.25 | 76.29 | 0.27 | 84.50 | **81.96** | 0.83 |
+----------------------------------------+------------+---------------+---------+------------+---------------+---------+
+| db_resnet50 + crnn_mobilenet_v3_small | 69.85 | 74.80 | | 80.85 | 78.42 | 0.83 |
++----------------------------------------+------------+---------------+---------+------------+---------------+---------+
| db_mobilenet_v3_large + crnn_vgg16_bn | 67.73 | 71.73 | | 71.65 | 59.03 | |
+----------------------------------------+------------+---------------+---------+------------+---------------+---------+
| Gvision text detection | 59.50 | 62.50 | | 75.30 | 70.00 | |
@@ -190,23 +192,25 @@ FPS (Frames per second) is computed after a warmup phase of 100 tensors (where t
Since you may be looking for specific use cases, we also performed this benchmark on private datasets with various document types below. Unfortunately, we are not able to share those at the moment since they contain sensitive information.


-+----------------------------------------------+----------------------------+----------------------------+----------------------------+----------------------------+
-|                                              | Receipts                   | Invoices                   | IDs                        | US Tax Forms               |
-+==============================================+============+===============+============+===============+============+===============+============+===============+
-| **Architecture**                             | **Recall** | **Precision** | **Recall** | **Precision** | **Recall** | **Precision** | **Recall** | **Precision** |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
-| db_resnet50 + crnn_vgg16_bn (ours)           | 78.70      | 81.12         | 65.80      | 70.70         | 50.25      | 51.78         | 79.08      | 92.83         |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
-| db_resnet50 + master (ours)                  | **79.00**  | **81.42**     | 65.57      | 69.86         | 51.34      | 52.90         | 78.86      | 92.57         |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
-| db_resnet50 + sar_resnet31 (ours)            | 78.94      | 81.37         | 65.89      | **70.79**     | **51.78**  | **53.35**     | 79.04      | 92.78         |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
-| db_mobilenet_v3_large + crnn_vgg16_bn (ours) | 78.36      | 74.93         | 63.04      | 68.41         | 39.36      | 41.75         | 72.14      | 89.97         |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
-| Gvision doc. text detection                  | 68.91      | 59.89         | 63.20      | 52.85         | 43.70      | 29.21         | 69.79      | 65.68         |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
-| AWS textract                                 | 75.77      | 77.70         | **70.47**  | 69.13         | 46.39      | 43.32         | **84.31**  | **98.11**     |
-+----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
++----------------------------------------------+----------------------------+----------------------------+----------------------------+----------------------------+----------------------------+----------------------------+
+|                                              | Receipts                   | Invoices                   | IDs                        | US Tax Forms               | Resumes                    | Road Fines                 |
++==============================================+============+===============+============+===============+============+===============+============+===============+============+===============+============+===============+
+| **Architecture**                             | **Recall** | **Precision** | **Recall** | **Precision** | **Recall** | **Precision** | **Recall** | **Precision** | **Recall** | **Precision** | **Recall** | **Precision** |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| db_resnet50 + crnn_vgg16_bn (ours)           | 78.70      | 81.12         | 65.80      | 70.70         | 50.25      | 51.78         | 79.08      | 92.83         |            |               |            |               |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| db_resnet50 + master (ours)                  | **79.00**  | **81.42**     | 65.57      | 69.86         | 51.34      | 52.90         | 78.86      | 92.57         |            |               |            |               |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| db_resnet50 + sar_resnet31 (ours)            | 78.94      | 81.37         | 65.89      | **70.79**     | **51.78**  | **53.35**     | 79.04      | 92.78         |            |               |            |               |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| db_resnet50 + crnn_mobilenet_v3_small (ours) | 76.81      | 79.15         | 64.89      | 69.61         | 45.03      | 46.38         | 78.96      | 92.11         | 85.91      | 87.20         | 84.85      | 85.86         |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| db_mobilenet_v3_large + crnn_vgg16_bn (ours) | 78.36      | 74.93         | 63.04      | 68.41         | 39.36      | 41.75         | 72.14      | 89.97         |            |               |            |               |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| Gvision doc. text detection                  | 68.91      | 59.89         | 63.20      | 52.85         | 43.70      | 29.21         | 69.79      | 65.68         |            |               |            |               |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+
+| AWS textract                                 | 75.77      | 77.70         | **70.47**  | 69.13         | 46.39      | 43.32         | **84.31**  | **98.11**     |            |               |            |               |
++----------------------------------------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+------------+---------------+


Two-stage approaches