Evaluation

Matthew edited this page Jan 22, 2017 · 61 revisions

Unlexicalised tree linearisation grammars

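| Language | Coverage (test) | BLEU score (test) | BLEU score (train) |
|----------|----------------:|------------------:|-------------------:|
| Basque   | 0.7718 | 0.3640 | 0.7244 |
| Catalan  | 0.8515 | 0.5123 | 0.7737 |
| Chinese  | 0.7631 | 0.4735 | 0.9435 |
| Czech    | 0.9021 | 0.4374 | 0.6475 |
| English  | 0.7501 | 0.5233 | 0.8755 |
| Finnish  | 0.8100 | 0.4498 | 0.7978 |
| German   | 0.7292 | 0.3995 | 0.7843 |
| Latin    | 0.6907 | 0.1529 | 0.5657 |
| Spanish  | 0.8119 | 0.4517 | 0.7237 |
| Turkish  | 0.7364 | 0.4323 | 0.8394 |

| Language | Coverage (projectivised test) | BLEU score (projectivised test) | BLEU score (projectivised train) |
|----------|------------------------------:|--------------------------------:|---------------------------------:|
| Basque   | 0.7453 | 0.3359 | 0.8607 |
| Catalan  | 0.8334 | 0.3057 | 0.8247 |
| Chinese  | 0.7443 | 0.3962 | 0.9522 |
| Czech    | 0.8440 | 0.3053 | 0.7400 |
| English  | 0.7534 | 0.4276 | 0.9331 |
| Finnish  | 0.7753 | 0.3544 | 0.8641 |
| German   | 0.7091 | 0.3313 | 0.8433 |
| Latin    | 0.7111 | 0.1425 | 0.7552 |
| Spanish  | 0.8099 | 0.2809 | 0.7572 |
| Turkish  | 0.7411 | 0.4384 | 0.8922 |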
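
For readers unfamiliar with the metric, the following is a minimal sketch of corpus-level BLEU with a single reference per sentence, up to 4-grams and without smoothing. The page does not say which BLEU implementation or settings produced the scores above, so the function name `corpus_bleu`, the `max_n=4` default, and the lack of smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing (assumed settings)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipped counts: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a zero match count at any order gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


reference = "the cat sat on the mat".split()
hypothesis = "the cat sat on mat the".split()
score = corpus_bleu([hypothesis], [reference])
```

In a linearisation setting the hypothesis is normally a reordering of the reference tokens, so the two lengths are equal and the brevity penalty is 1. The score then depends only on how many reference n-grams survive the reordering.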

Greedy trigram model (inspired by Bohnet et al., 2012)

| Language | BLEU |
|----------|-------:|
| Basque   | 0.3309 |
| Catalan  | 0.2509 |
| Chinese  | 0.2686 |
| Czech    | 0.3107 |
| German   | 0.3569 |
| English  | 0.3739 |
| Spanish  | 0.2985 |
| Finnish  | 0.2733 |
| Turkish  | 0.2694 |
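
The page does not describe how the greedy trigram model works. As one plausible reading only, the sketch below orders a bag of words by repeatedly appending the remaining word with the highest trigram count given the two previously placed words. The names `train_trigrams`, `greedy_linearise` and the `<s>` padding symbol are hypothetical, and the use of raw counts with alphabetical tie-breaking is an assumption, not necessarily what was evaluated here or what Bohnet et al. (2012) propose.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start padding symbol


def train_trigrams(sentences):
    """Count trigrams over tokenised sentences padded with two start symbols."""
    counts = Counter()
    for tokens in sentences:
        padded = [START, START] + list(tokens)
        for i in range(len(padded) - 2):
            counts[tuple(padded[i:i + 3])] += 1
    return counts


def greedy_linearise(bag, counts):
    """Order a bag of words by repeatedly appending the best-scoring next word."""
    remaining = sorted(bag)  # sorted so that ties are broken deterministically
    output = [START, START]
    while remaining:
        context = tuple(output[-2:])
        # max() returns the first maximal element, so ties go to the alphabetically first word.
        best = max(remaining, key=lambda w: counts[context + (w,)])
        remaining.remove(best)
        output.append(best)
    return output[2:]


counts = train_trigrams([["the", "dog", "barks"], ["the", "cat", "sleeps"]])
ordered = greedy_linearise(["barks", "the", "dog"], counts)
```

A greedy search of this kind commits to each word as it is placed and never backtracks, which keeps it cheap but means an early wrong choice cannot be repaired later.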

Treebank statistics

Basque

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|--------:|
| Size                 | 1799    | 5396    |
| Minimum              | 3.0000  | 3.0000  |
| Maximum              | 39.0000 | 64.0000 |
| Range                | 36.0000 | 61.0000 |
| Median               | 13.0000 | 12.0000 |
| First quartile       | 9.0000  | 9.0000  |
| Third quartile       | 17.0000 | 17.0000 |
| Inter-quartile range | 8.0000  | 8.0000  |
| Mean                 | 13.5486 | 13.5237 |
| Standard deviation   | 6.2603  | 6.4147  |
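
The per-treebank tables in this section all report the same set of descriptive statistics. The sketch below shows how such a summary could be computed from a list of sentence lengths using Python's standard `statistics` module. The function name `length_statistics` is hypothetical, and two choices are assumptions: the quartile convention (`method="inclusive"`) and the sample (n - 1) standard deviation. The page does not state which conventions were used, so this code may not reproduce the published figures exactly.

```python
import statistics


def length_statistics(lengths):
    """Summarise a list of sentence lengths (tokens per sentence)."""
    # Quartile convention is an assumption; other conventions give slightly different values.
    q1, median, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
    return {
        "Size": len(lengths),
        "Minimum": min(lengths),
        "Maximum": max(lengths),
        "Range": max(lengths) - min(lengths),
        "Median": median,
        "First quartile": q1,
        "Third quartile": q3,
        "Inter-quartile range": q3 - q1,
        "Mean": statistics.mean(lengths),
        # Sample standard deviation (n - 1 denominator); also an assumption.
        "Standard deviation": statistics.stdev(lengths),
    }


summary = length_statistics([3, 5, 7, 9, 11])
```

The range and inter-quartile range are derived values (maximum minus minimum, and third quartile minus first quartile), which is a quick consistency check on any row of the tables.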

Catalan

| Corpus sentence length statistic | Test | Train |
|----------------------------------|---------:|---------:|
| Size                 | 834      | 5810     |
| Minimum              | 2.0000   | 2.0000   |
| Maximum              | 103.0000 | 151.0000 |
| Range                | 101.0000 | 149.0000 |
| Median               | 23.0000  | 22.0000  |
| First quartile       | 15.5000  | 14.0000  |
| Third quartile       | 32.0000  | 32.0000  |
| Inter-quartile range | 16.5000  | 18.0000  |
| Mean                 | 24.8261  | 24.6737  |
| Standard deviation   | 13.3924  | 14.1311  |

Chinese

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|---------:|
| Size                 | 500     | 3997     |
| Minimum              | 7.0000  | 4.0000   |
| Maximum              | 97.0000 | 111.0000 |
| Range                | 90.0000 | 107.0000 |
| Median               | 21.5000 | 22.0000  |
| First quartile       | 15.0000 | 16.0000  |
| Third quartile       | 30.0000 | 31.0000  |
| Inter-quartile range | 15.0000 | 15.0000  |
| Mean                 | 24.0240 | 24.6705  |
| Standard deviation   | 11.9059 | 12.3384  |

Czech

| Corpus sentence length statistic | Test | Train |
|----------------------------------|---------:|---------:|
| Size                 | 9835     | 66500    |
| Minimum              | 1.0000   | 1.0000   |
| Maximum              | 132.0000 | 194.0000 |
| Range                | 131.0000 | 193.0000 |
| Median               | 15.0000  | 15.0000  |
| First quartile       | 9.0000   | 9.0000   |
| Third quartile       | 23.0000  | 23.0000  |
| Inter-quartile range | 14.0000  | 14.0000  |
| Mean                 | 16.7784  | 16.8329  |
| Standard deviation   | 10.7504  | 10.9095  |

English

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|---------:|
| Size                 | 2077    | 12543    |
| Minimum              | 1.0000  | 1.0000   |
| Maximum              | 81.0000 | 159.0000 |
| Range                | 80.0000 | 158.0000 |
| Median               | 9.0000  | 14.0000  |
| First quartile       | 4.0000  | 7.0000   |
| Third quartile       | 17.0000 | 23.0000  |
| Inter-quartile range | 13.0000 | 16.0000  |
| Mean                 | 12.0828 | 16.3108  |
| Standard deviation   | 10.6050 | 12.4016  |

Finnish

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|---------:|
| Size                 | 648     | 12217    |
| Minimum              | 1.0000  | 1.0000   |
| Maximum              | 98.0000 | 238.0000 |
| Range                | 97.0000 | 237.0000 |
| Median               | 12.0000 | 12.0000  |
| First quartile       | 8.0000  | 7.0000   |
| Third quartile       | 18.0000 | 17.0000  |
| Inter-quartile range | 10.0000 | 10.0000  |
| Mean                 | 14.1049 | 13.3192  |
| Standard deviation   | 9.7217  | 9.4880   |

German

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|---------:|
| Size                 | 746     | 10289    |
| Minimum              | 1.0000  | 2.0000   |
| Maximum              | 49.0000 | 106.0000 |
| Range                | 48.0000 | 104.0000 |
| Median               | 13.0000 | 16.0000  |
| First quartile       | 8.0000  | 11.0000  |
| Third quartile       | 20.0000 | 22.0000  |
| Inter-quartile range | 12.0000 | 11.0000  |
| Mean                 | 14.8029 | 17.3008  |
| Standard deviation   | 8.7140  | 9.1374   |

Latin

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|--------:|
| Size                 | 230     | 2660    |
| Minimum              | 5.0000  | 1.0000  |
| Maximum              | 44.0000 | 78.0000 |
| Range                | 39.0000 | 77.0000 |
| Median               | 21.0000 | 12.0000 |
| First quartile       | 16.0000 | 7.0000  |
| Third quartile       | 25.0000 | 18.0000 |
| Inter-quartile range | 9.0000  | 11.0000 |
| Mean                 | 21.0087 | 14.2177 |
| Standard deviation   | 6.5484  | 9.6301  |

Spanish

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|---------:|
| Size                 | 164     | 9080     |
| Minimum              | 3.0000  | 2.0000   |
| Maximum              | 76.0000 | 118.0000 |
| Range                | 73.0000 | 116.0000 |
| Median               | 19.0000 | 20.0000  |
| First quartile       | 12.0000 | 14.0000  |
| Third quartile       | 29.0000 | 29.0000  |
| Inter-quartile range | 17.0000 | 15.0000  |
| Mean                 | 23.0244 | 22.6374  |
| Standard deviation   | 15.3902 | 12.1597  |

Turkish

| Corpus sentence length statistic | Test | Train |
|----------------------------------|--------:|--------:|
| Size                 | 612     | 3022    |
| Minimum              | 1.0000  | 1.0000  |
| Maximum              | 47.0000 | 49.0000 |
| Range                | 46.0000 | 48.0000 |
| Median               | 7.0000  | 7.0000  |
| First quartile       | 4.5000  | 5.0000  |
| Third quartile       | 11.0000 | 11.0000 |
| Inter-quartile range | 6.5000  | 6.0000  |
| Mean                 | 8.5588  | 9.1343  |
| Standard deviation   | 5.8103  | 6.6433  |