Requires Python >= 3.10.
python -m pip install -r requirements.txt
pip install .
python main.py --path "pdf_path"
You can also load a model directly by its full name and then call it (via its `__call__` method) with a PDF path or an image path.
- Layer

      from pdfdet import uni_model

      model = uni_model(name='yolov8m_cdla')
      layer = model(path="image_path")
      content = layer.to_json()
      """
      content format:
      {
          "image": numpy.ndarray,
          "boxes": [{"box": [x1, y1, x2, y2], "label": str, "score": float}, ...],
      }
      """
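- Document

      from pdfdet import uni_model
      import cv2

      model = uni_model(name='yolov8m_cdla')
      doc = model(path="pdf_path")
      # Sort the layer keys numerically rather than as strings
      layers = sorted(doc.layers, key=lambda x: int(x))
      for i in layers:
          layer = getattr(doc, i)
          # Render the layer's detections, then resize for display
          im = layer.imshow()
          im = cv2.resize(im, (640, 640))
          cv2.imshow("im", im)
          cv2.waitKey(0)  # wait for a key press before showing the next layer
      cv2.destroyAllWindows()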
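The `key=lambda x: int(x)` in the Document example suggests that the layer keys are numeric strings; that is an assumption here, and the keys below are made up for illustration. Under that assumption, this small sketch shows why the numeric key matters: a plain string sort would place "10" before "2".

```python
# Hypothetical layer keys; the real ones come from doc.layers.
layer_keys = ["0", "1", "10", "2", "11"]

# Plain string sort compares character by character.
lexicographic = sorted(layer_keys)

# Converting each key to int restores numeric order.
numeric = sorted(layer_keys, key=int)

print(lexicographic)
print(numeric)
```

Passing `key=int` is equivalent to the lambda used in the example.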
# batch predict
python tools/batch_process.py --model "model_name" --src "image_root" --save "res_root"
# visualize the results
python tools/visualize.py "image_path" "res_path"
# evaluate on the CDLA dataset (labelme format)
python tools/eval_map50.py "gt_root" "res_root"
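For readers unfamiliar with the metrics such an evaluation reports, here is a minimal sketch of precision and recall at an IoU threshold of 0.5. It is not the code in `tools/eval_map50.py`, whose internals are not shown here; the greedy matching rule, the function names, and the sample boxes are all assumptions chosen for illustration. Boxes follow the `[x1, y1, x2, y2]` convention used above.

```python
def iou(a: list[float], b: list[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds: list[dict], gts: list[dict], iou_thr: float = 0.5):
    """Greedy matching: highest score first, same label, each ground truth used once."""
    matched: set[int] = set()
    tp = 0
    for p in sorted(preds, key=lambda d: d["score"], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, g in enumerate(gts):
            if j in matched or g["label"] != p["label"]:
                continue
            v = iou(p["box"], g["box"])
            if v > best_iou:
                best_iou, best_j = v, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall


# Made-up ground truth and predictions for a single image.
gts = [
    {"box": [0, 0, 100, 100], "label": "Text"},
    {"box": [0, 200, 100, 300], "label": "Table"},
]
preds = [
    {"box": [0, 0, 100, 90], "label": "Text", "score": 0.9},      # IoU 0.9: match
    {"box": [0, 250, 100, 350], "label": "Table", "score": 0.8},  # IoU 1/3: miss
    {"box": [300, 300, 400, 400], "label": "Text", "score": 0.4}, # no overlap
]

p, r = precision_recall(preds, gts)
print(f"precision={p:.3f} recall={r:.3f}")
```

mAP50 goes one step further by averaging precision over recall levels for each class, and mAP50:95 repeats that over IoU thresholds from 0.5 to 0.95; this sketch covers only the matching step at a single threshold.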
Model | Source | Associated Dataset | Optional Models |
---|---|---|---|
paddle_pub | PaddlePaddle | PubLayNet (English) | |
paddle_cdla | PaddlePaddle | CDLA (Chinese) | |
cnstd_yolov7 | CNSTD | CDLA | |
yolov8l_doc | Hugging Face | DocLayNet (English, German, French, Japanese) | yolov8n_doc, yolov8s_doc |
yolov8m_cdla | layout_analysis | CDLA | yolov8n_cdla |

Note: Labels and annotation strategies vary across datasets, so visual comparison should be the primary way to evaluate effectiveness.

Model | mAP50 | mAP50:95 | Precision | Recall |
---|---|---|---|---|
paddle_cdla | 0.9675 | 0.8359 | 0.9602 | 0.9347 |
cnstd_yolov7 | 0.9058 | 0.6662 | 0.9543 | 0.8321 |
yolov8m_cdla | 0.9436 | 0.8086 | 0.9449 | 0.8980 |

Model | mAP50:95 | Precision | Recall |
---|---|---|---|
paddle_cdla | 0.5717 | 0.5853 | 0.6248 |
cnstd_yolov7 | 0.5034 | 0.6278 | 0.5651 |
yolov8m_cdla | 0.4783 | 0.5266 | 0.5922 |