Arabic Handwritten Document Data-Listing Solution

How To Install

conda create -n ahdoc python=3.10
conda activate ahdoc

conda install ipython pip

# clone repo
git clone https://github.com/Hedrax/AHDoc.git
cd AHDoc/

# python dependencies (run from inside the cloned repo)
pip install -r requirements.txt

Data

Model Best-Weights

Data Preparation

Text-Detection

Datasets must follow the custom data format provided above.
utils.py provides the functions needed to convert annotations from Pascal-style JSON and YOLO-style .txt formats.
utils.py also provides functions that list image names, find the corresponding label files, and write the listing to a .txt file.
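As an illustration of the two steps above, here is a minimal sketch of a YOLO-to-pixel-box conversion and an image/label listing. It assumes the standard YOLO line layout (class, centre x, centre y, width, height, all normalised to the image size). The function names, the tab-separated output, and the default extensions are hypothetical and are not the functions in utils.py.

```python
from pathlib import Path


def yolo_to_corners(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, cx, cy, w, h = line.split()
    # YOLO stores the box centre and size as fractions of the image size.
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return int(class_id), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def list_pairs(image_dir, label_dir, out_file, image_ext=".png", label_ext=".txt"):
    """Write one 'image_path<TAB>label_path' line per image that has a label file."""
    pairs = []
    for image in sorted(Path(image_dir).glob(f"*{image_ext}")):
        label = Path(label_dir) / (image.stem + label_ext)
        if label.exists():  # images without a label file are skipped
            pairs.append(f"{image}\t{label}")
    text = "\n".join(pairs) + ("\n" if pairs else "")
    Path(out_file).write_text(text, encoding="utf-8")
    return len(pairs)
```

The exact listing format expected by the training code should be checked against the custom data format before using a listing produced this way.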

OCR Engine

Datasets must keep all images, in PNG format, in a single directory together with a labels.txt file. Each label must have the form imageNameWithoutExtension_groundTruth.
utils.py has all functions needed for dataset preparation
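The label format above can be read with a few lines of Python. This is a sketch under two assumptions that the text does not state: labels.txt holds one label per line, and the image name itself contains no underscore, so the first underscore separates the name from the ground truth. The helper names are hypothetical and are not taken from utils.py.

```python
from pathlib import Path


def parse_label_line(line):
    """Split 'imageName_groundTruth' at the first underscore (an assumption)."""
    name, sep, text = line.rstrip("\n").partition("_")
    if not sep or not name:
        raise ValueError(f"malformed label line: {line!r}")
    return name, text


def load_labels(dataset_dir):
    """Return {image_path: ground_truth} for every label whose PNG file exists."""
    dataset_dir = Path(dataset_dir)
    labels = {}
    with open(dataset_dir / "labels.txt", encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # ignore blank lines
            name, text = parse_label_line(line)
            image = dataset_dir / f"{name}.png"
            if image.exists():
                labels[image] = text
    return labels
```

Reading the file as UTF-8 matters here because the ground truth is Arabic text.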

Train

Follow instructions in train.py
All training configurations are saved in config.py

Test

Place the weights files downloaded from the reference above in ./Text Detection/weights/ and ./OCR Engine/weights/.
Follow the instructions in inference.py (and, for the OCR Engine, in evaluation.py).
All test configurations are saved in config.py.
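Before running inference, it can help to confirm that the weights are where the scripts expect them. The following is a small sketch of such a check. The two directory paths come from the step above; the function name is hypothetical, and the check only looks for the presence of any file, since the weight file names are not given here.

```python
from pathlib import Path

# Directories named in the Test instructions, relative to the repo root.
WEIGHT_DIRS = ("Text Detection/weights", "OCR Engine/weights")


def missing_weight_dirs(repo_root="."):
    """Return the weight directories that are absent or contain no files."""
    missing = []
    for rel in WEIGHT_DIRS:
        folder = Path(repo_root) / rel
        if not folder.is_dir() or not any(p.is_file() for p in folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_weight_dirs():
        print(f"no weights found in ./{rel}/")
```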

Results

Text-Detection

We compare the best weights of the universal model against our model on our custom Arabic handwritten evaluation data.

Weights	Precision (%)	Recall (%)	F-measure (%)
Universal Model	61.53	34.60	41.33
Our-Model	81.66	78.82	79.07
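For reference, F-measure is conventionally the harmonic mean of precision and recall. The sketch below shows that formula only. How the values in the table were aggregated is not stated here; if the metrics are averaged per image, the reported F-measure need not equal the harmonic mean of the reported precision and recall.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when nothing was detected
    return 2 * precision * recall / (precision + recall)
```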

OCR Module Performance Results

Our results on the test set of 18 fonts:

#	Number of Words	Solid CRR (%)	Solid WRR (%)	Salted CRR (%)	Salted WRR (%)	Bolded CRR (%)	Bolded WRR (%)	Notes
1	1	94.28	70.05	91.85	57.08	77.25	17.81	Tested on 7-Character Words
2	1	94.24	54.06	91.42	50.94	90.19	46.04	-
3	2	89.81	39.38	87.11	34.84	86.75	33.85	-
4	3	89.64	35.63	88.23	37.39	87.79	35.59	-
5	4	82.23	28.59	80.41	24.61	80.25	24.61	-
6	5	73.17	20.25	71.88	17.225	70.25	16.62	-
7	6	66.01	18.78	64.95	13.48	63.50	14.39	-
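CRR and WRR are not defined in this README. A common convention, assumed here, is that CRR (character recognition rate) is one minus the total character edit distance divided by the total number of ground-truth characters, and WRR (word recognition rate) is the share of samples predicted exactly. The sketch below follows that convention; it is not taken from evaluation.py, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def crr_wrr(pairs):
    """pairs: iterable of (ground_truth, prediction). Returns (CRR %, WRR %)."""
    pairs = list(pairs)
    total_chars = sum(len(gt) for gt, _ in pairs)
    errors = sum(edit_distance(gt, pred) for gt, pred in pairs)
    exact = sum(gt == pred for gt, pred in pairs)
    crr = 100.0 * (1 - errors / total_chars) if total_chars else 0.0
    wrr = 100.0 * exact / len(pairs) if pairs else 0.0
    return crr, wrr
```

Under this convention a single wrong character lowers CRR only slightly but makes the whole sample count as a miss for WRR, which is consistent with WRR falling much faster than CRR as the number of words per sample grows.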

