======= Header metadata =======

Evaluation on 969 random PDF files out of 982 PDF (ratio 1.0).

======= Strict Matching ======= (exact matches)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

abstract             77.27      9.37       9.08       9.22       969
authors              91.62      67.43      66.53      66.98      968
first_author         97.65      91.94      90.8       91.36      967
title                95.9       85.9       83.59      84.73      969

all (micro avg.)     90.61      63.82      62.48      63.14      3873
all (macro avg.)     90.61      63.66      62.5       63.07      3873

======== Soft Matching ======== (ignoring punctuation, case and space characters mismatches)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

abstract             80.42      22.36      21.67      22.01      969
authors              91.67      67.64      66.74      67.19      968
first_author         97.65      91.94      90.8       91.36      967
title                98.01      94.59      92.05      93.31      969

all (micro avg.)     91.94      69.25      67.8       68.52      3873
all (macro avg.)     91.94      69.13      67.81      68.47      3873

==== Levenshtein Matching ===== (Minimum Levenshtein distance at 0.8)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

abstract             86.46      47.28      45.82      46.54      969
authors              96.31      86.49      85.33      85.91      968
first_author         97.73      92.25      91.11      91.68      967
title                98.48      96.5       93.91      95.19      969

all (micro avg.)     94.74      80.72      79.03      79.87      3873
all (macro avg.)     94.74      80.63      79.04      79.83      3873

= Ratcliff/Obershelp Matching = (Minimum Ratcliff/Obershelp similarity at 0.95)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

abstract             85.71      44.2       42.83      43.5       969
authors              93.5       75.08      74.07      74.57      968
first_author         97.65      91.94      90.8       91.36      967
title                98.45      96.39      93.81      95.08      969

all (micro avg.)     93.83      76.98      75.37      76.16      3873
all (macro avg.)     93.83      76.9       75.38      76.13      3873

===== Instance-level results =====

Total expected instances:   969
Total correct instances:    58  (strict)
Total correct instances:    160 (soft)
Total correct instances:    350 (Levenshtein)
Total correct instances:    295 (RatcliffObershelp)

Instance-level recall:      5.99  (strict)
Instance-level recall:      16.51 (soft)
Instance-level recall:      36.12 (Levenshtein)
Instance-level recall:      30.44 (RatcliffObershelp)
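A note on the four matching levels used throughout this report: "strict" requires exact string equality, "soft" ignores punctuation, case and space mismatches, "Levenshtein" accepts pairs whose normalized edit similarity reaches at least 0.8, and "Ratcliff/Obershelp" requires a similarity of at least 0.95. The sketch below illustrates how such predicates can be implemented in Python; the helper names and the exact normalization are illustrative assumptions, and GROBID's own Java evaluation code remains authoritative.

    import re
    from difflib import SequenceMatcher

    def soft_normalize(s: str) -> str:
        # "Soft" matching: drop punctuation and whitespace, fold case.
        return re.sub(r"[\W_]+", "", s).lower()

    def levenshtein_ratio(a: str, b: str) -> float:
        # Classic dynamic-programming edit distance, turned into a
        # 0..1 similarity by normalizing with the longer string length.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            cur = [i] + [0] * len(b)
            for j, cb in enumerate(b, start=1):
                cur[j] = min(prev[j] + 1,                  # deletion
                             cur[j - 1] + 1,               # insertion
                             prev[j - 1] + (ca != cb))     # substitution
            prev = cur
        return 1 - prev[len(b)] / max(len(a), len(b), 1)

    def field_matches(expected: str, predicted: str, level: str) -> bool:
        if level == "strict":
            return expected == predicted
        if level == "soft":
            return soft_normalize(expected) == soft_normalize(predicted)
        if level == "levenshtein":
            return levenshtein_ratio(expected, predicted) >= 0.8
        if level == "ratcliff":
            # difflib's ratio() computes a Ratcliff/Obershelp-style similarity.
            return SequenceMatcher(None, expected, predicted).ratio() >= 0.95
        raise ValueError(f"unknown matching level: {level}")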
======= Citation metadata =======

Evaluation on 969 random PDF files out of 982 PDF (ratio 1.0).

======= Strict Matching ======= (exact matches)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

authors              96.99      79.38      78.3       78.84      61728
date                 99.39      95.85      94.16      94.99      62107
first_author         99.22      94.74      93.42      94.08      61728
inTitle              99.38      95.77      94.82      95.3       61677
issue                99.86      2.08       75         4.05       16
page                 99.36      96.26      95.32      95.78      52105
title                98.58      90.3       90.84      90.57      60559
volume               99.65      97.85      98.29      98.07      59595

all (micro avg.)     99.05      92.67      92.07      92.37      419515
all (macro avg.)     99.05      81.53      90.02      81.46      419515

======== Soft Matching ======== (ignoring punctuation, case and space characters mismatches)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

authors              97.01      79.51      78.43      78.97      61728
date                 99.39      95.85      94.16      94.99      62107
first_author         99.23      94.82      93.5       94.16      61728
inTitle              99.45      96.25      95.29      95.77      61677
issue                99.86      2.08       75         4.05       16
page                 99.36      96.26      95.32      95.78      52105
title                99.4       95.94      96.52      96.23      60559
volume               99.65      97.85      98.29      98.07      59595

all (micro avg.)     99.17      93.59      92.98      93.29      419515
all (macro avg.)     99.17      82.32      90.81      82.25      419515

==== Levenshtein Matching ===== (Minimum Levenshtein distance at 0.8)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

authors              99.01      93.28      92.01      92.64      61728
date                 99.39      95.85      94.16      94.99      62107
first_author         99.3       95.27      93.94      94.6       61728
inTitle              99.5       96.56      95.6       96.08      61677
issue                99.86      2.08       75         4.05       16
page                 99.36      96.26      95.32      95.78      52105
title                99.65      97.66      98.24      97.95      60559
volume               99.65      97.85      98.29      98.07      59595

all (micro avg.)     99.47      95.97      95.34      95.65      419515
all (macro avg.)     99.47      84.35      92.82      84.27      419515

= Ratcliff/Obershelp Matching = (Minimum Ratcliff/Obershelp similarity at 0.95)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

authors              98.05      86.69      85.51      86.1       61728
date                 99.39      95.85      94.16      94.99      62107
first_author         99.22      94.76      93.44      94.09      61728
inTitle              99.45      96.24      95.28      95.76      61677
issue                99.86      2.08       75         4.05       16
page                 99.36      96.26      95.32      95.78      52105
title                99.63      97.51      98.1       97.81      60559
volume               99.65      97.85      98.29      98.07      59595

all (micro avg.)     99.33      94.86      94.24      94.55      419515
all (macro avg.)     99.33      83.4       91.89      83.33      419515

===== Instance-level results =====

Total expected instances:   62109
Total extracted instances:  62910

Total correct instances:    41373 (strict)
Total correct instances:    44134 (soft)
Total correct instances:    51595 (Levenshtein)
Total correct instances:    48283 (RatcliffObershelp)

Instance-level precision:   65.77 (strict)
Instance-level precision:   70.15 (soft)
Instance-level precision:   82.01 (Levenshtein)
Instance-level precision:   76.75 (RatcliffObershelp)

Instance-level recall:      66.61 (strict)
Instance-level recall:      71.06 (soft)
Instance-level recall:      83.07 (Levenshtein)
Instance-level recall:      77.74 (RatcliffObershelp)

Instance-level f-score:     66.19 (strict)
Instance-level f-score:     70.6  (soft)
Instance-level f-score:     82.54 (Levenshtein)
Instance-level f-score:     77.24 (RatcliffObershelp)

Matching 1 : 57278
Matching 2 : 980
Matching 3 : 1211
Matching 4 : 357
Total matches : 59826

======= Citation context resolution =======

Total expected references:                 62109  - 64.1 references per article
Total predicted references:                62910  - 64.92 references per article

Total expected citation contexts:          106379 - 109.78 citation contexts per article
Total predicted citation contexts:         97185  - 100.29 citation contexts per article

Total correct predicted citation contexts: 93525  - 96.52 citation contexts per article
Total wrong predicted citation contexts:   3660 (wrong callout matching, callout missing in NLM, or matching with a bib. ref. not aligned with a bib. ref. in NLM)

Precision citation contexts: 96.23
Recall citation contexts:    87.92
F-score citation contexts:   91.89
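The headline citation-context scores follow directly from the raw counts above; a quick sanity check in Python (variable names are ours, chosen for illustration):

    # Counts taken verbatim from the "Citation context resolution" block.
    expected  = 106379  # citation contexts in the gold standard (NLM)
    predicted = 97185   # citation contexts produced by GROBID
    correct   = 93525   # predicted contexts that agree with the gold standard

    precision = 100 * correct / predicted                       # -> 96.23
    recall    = 100 * correct / expected                        # -> 87.92
    f_score   = 2 * precision * recall / (precision + recall)   # -> 91.89
    print(f"P={precision:.2f}  R={recall:.2f}  F1={f_score:.2f}")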
======= Fulltext structures =======

Evaluation on 969 random PDF files out of 982 PDF (ratio 1.0).

======= Strict Matching ======= (exact matches)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

availability_stmt    99.76      29.69      26.48      27.99      574
figure_title         87.41      0.02       0.01       0.01       31073
funding_stmt         98.58      5.24       23.84      8.59       906
reference_citation   70.69      55.41      55.64      55.53      106306
reference_figure     81.63      56.95      49.99      53.25      67647
reference_table      99.59      69.03      74.84      71.82      2254
section_title        97.43      85.35      74.07      79.31      21462
table_title          99.23      0.46       0.16       0.24       1832

all (micro avg.)     91.79      54.89      47.8       51.1       232054
all (macro avg.)     91.79      37.77      38.13      37.09      232054

======== Soft Matching ======== (ignoring punctuation, case and space characters mismatches)

===== Field-level results =====

label                accuracy   precision  recall     f1         support

availability_stmt    99.75      38.87      34.67      36.65      574
figure_title         89.02      49         15.16      23.16      31073
funding_stmt         98.35      5.24       23.84      8.59       906
reference_citation   93.3       91.05      91.42      91.23      106306
reference_figure     78.82      57.24      50.24      53.51      67647
reference_table      99.53      69.11      74.93      71.9       2254
section_title        97.15      86.23      74.83      80.13      21462
table_title          99.5       81.02      28.66      42.34      1832

all (micro avg.)     94.43      76.49      66.61      71.21      232054
all (macro avg.)     94.43      59.72      49.22      50.94      232054

===== Document-level ratio results =====

label                accuracy   precision  recall     f1         support

availability_stmt    83.39      96.24      89.2       92.59      574

all (micro avg.)     83.39      96.24      89.2       92.59      574
all (macro avg.)     83.39      96.24      89.2       92.59      574

====================================================================================

Evaluation report in markdown format saved under /home/lfoppiano/grobid/grobid-trainer/../grobid-home/tmp/report.md

:grobid-trainer:jatsEval (Thread[Daemon worker,5,main]) completed. Took 30 mins 38.449 secs.
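A final note on reading the field-level tables: the micro average pools counts over all fields, so high-support fields dominate, while the macro average is an unweighted mean of the per-field scores, so a weak low-support field (e.g. "issue", support 16, precision 2.08 in the citation tables) drags the macro numbers well below the micro ones. A toy sketch of the assumed computation, with hypothetical counts rather than values from this run:

    # Hypothetical per-field true/false positive counts, for illustration only.
    fields = [
        {"tp": 900, "fp": 50},  # a high-support, high-precision field
        {"tp": 1,   "fp": 47},  # a low-support, low-precision field
    ]

    def micro_precision(fields):
        # Pool counts first, then divide: big fields dominate the result.
        tp = sum(f["tp"] for f in fields)
        fp = sum(f["fp"] for f in fields)
        return 100 * tp / (tp + fp)

    def macro_precision(fields):
        # Average the per-field precisions: every field weighs the same.
        scores = [100 * f["tp"] / (f["tp"] + f["fp"]) for f in fields]
        return sum(scores) / len(scores)

    print(round(micro_precision(fields), 2))  # 90.28 - dominated by the big field
    print(round(macro_precision(fields), 2))  # 48.41 - pulled down by the small one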