[datasets/benchmark] Improve research use #855

felixdittrich92 · 2022-03-15T21:30:05Z

🚀 The feature

This request can be split into three parts:

- Ensure that every already integrated dataset that provides the necessary information (boxes and text labels) can also be used for the recognition task, by cropping the boxes and pairing each crop with its label.
- Add a section for datasets to the documentation (like the one for models), also split into detection / recognition.
- Integrate a benchmarking script into references/, or directly into the training scripts for detection and recognition, that follows current papers and commonly used benchmarking splits.
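
To make the first point concrete, here is a minimal sketch of turning detection-style annotations into recognition samples. It is not the library's implementation: the function name `crop_words` is hypothetical, and it assumes boxes are given as absolute pixel coordinates in `(xmin, ymin, xmax, ymax)` order on an `H x W x C` NumPy image.

```python
import numpy as np


def crop_words(image, boxes, labels):
    """Return a list of (crop, label) pairs for the recognition task.

    Assumptions (hypothetical, not taken from the library):
    - image is an H x W x C array
    - boxes are absolute pixel coordinates (xmin, ymin, xmax, ymax)
    - labels[i] is the transcription of boxes[i]
    """
    if len(boxes) != len(labels):
        raise ValueError("each box needs exactly one text label")

    height, width = image.shape[:2]
    samples = []
    for (xmin, ymin, xmax, ymax), label in zip(boxes, labels):
        # Clip the box to the image so slightly oversized boxes still work
        x0, y0 = max(int(xmin), 0), max(int(ymin), 0)
        x1, y1 = min(int(xmax), width), min(int(ymax), height)
        # Skip degenerate boxes that would yield an empty crop
        if x1 <= x0 or y1 <= y0:
            continue
        samples.append((image[y0:y1, x0:x1].copy(), label))
    return samples
```

A dataset that already stores boxes and labels per image could expose such pairs behind a flag, so the same download serves both detection and recognition.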

detection:
TODO
train: COCO-Text/ ... ?
val: IC03/IC13/...?

recognition:
train: MJSynth/SynthText
val: SVHN/SVT/IIIT5K/IC03/IC13 (+ FUNSD/CORD)
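
As a rough illustration of what a recognition benchmark over several validation sets could report, the sketch below computes exact-match word accuracy per dataset. Everything in it is an assumption for illustration: `exact_match`, `benchmark`, and the `recognize` callable are hypothetical names, and exact match is only one commonly reported metric, not necessarily the one the script would use.

```python
def exact_match(predictions, targets, case_sensitive=True):
    """Fraction of predictions that equal their target string exactly."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    if not case_sensitive:
        predictions = [p.lower() for p in predictions]
        targets = [t.lower() for t in targets]
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


def benchmark(recognize, datasets):
    """Evaluate one model on several validation sets.

    recognize: hypothetical callable mapping one crop to a predicted string
    datasets:  dict mapping a dataset name to a list of (crop, label) pairs
    """
    report = {}
    for name, samples in datasets.items():
        predictions = [recognize(crop) for crop, _ in samples]
        targets = [label for _, label in samples]
        report[name] = exact_match(predictions, targets)
    return report
```

Reporting one score per dataset, rather than a single pooled number, keeps the results comparable with tables in papers that list each benchmark separately.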

Motivation, pitch

It would be great to be able to compare against other implementations or other OCR applications for research purposes.
This would make the library and its implemented models more transparent and easier to compare with others.
As a final point, I have to add that it's just great to see when an implementation reaches better benchmark results than others 😅

Additional context

Any feedback or suggestion is very welcome 💯

felixdittrich92 · 2022-06-16T08:11:34Z

closed with #933

felixdittrich92 added the type: enhancement Improvement label Mar 15, 2022

fg-mindee added the module: datasets Related to doctr.datasets label Mar 18, 2022

felixdittrich92 mentioned this issue Mar 23, 2022

[datasets][PoC] Enable dataset usage for recognition task #867

Closed

2 tasks

felixdittrich92 mentioned this issue Apr 13, 2022

[feature] Part 2 from use datasets for recognition #891

Merged

felixdittrich92 mentioned this issue Apr 29, 2022

[docu]: add documentation for datasets #905

Merged

felixdittrich92 closed this as completed Jun 16, 2022
