I am trying to reproduce BART's results on XSum using 'bart-large-xsum' and a modified examples/summarization/bart/evaluate_cnn.py (max_length=60, min_length=10, beam=6, lenpen=1), but I get lower ROUGE scores than reported.
I first obtained comparable results on CNNDM using 'bart-large-cnndm' and the dataset on s3:
| CNNDM | R-1 | R-2 | R-L |
| --- | --- | --- | --- |
| BART (Lewis et al., 2019) | 44.16 | 21.28 | 40.9 |
| BART (ours) | 44.32 | 21.12 | 41.13 |
I then obtained the raw XSum dataset from the original authors and saved it to test.source and test.target (cased), as for CNNDM. Then I ran evaluate_cnn.py with the new parameters above. Is there anything that I am missing? Thank you!
| XSum | R-1 | R-2 | R-L |
| --- | --- | --- | --- |
| BART (Lewis et al., 2019) | 45.14 | 22.27 | 37.25 |
| BART (ours) | 44.7 | 21.04 | 35.64 |
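For reference, the decoding settings described above might map onto the Hugging Face `generate` API roughly as follows. This is only a sketch, not the modified evaluate_cnn.py itself. The checkpoint name `facebook/bart-large-xsum` is the one mentioned elsewhere in this thread, `summarize` is a hypothetical helper, and the mapping of beam/lenpen to `num_beams`/`length_penalty`, the batch size, and the 1024-token truncation are assumptions. It also assumes a transformers version in which the tokenizer can be called directly on a list of strings.

```python
# Decoding settings from the issue: max_length=60, min_length=10, beam=6, lenpen=1.
# Mapping beam -> num_beams and lenpen -> length_penalty is an assumption.
GEN_KWARGS = {
    "num_beams": 6,
    "length_penalty": 1.0,
    "max_length": 60,
    "min_length": 10,
}


def summarize(articles, model_name="facebook/bart-large-xsum", batch_size=8):
    """Hypothetical helper: return one generated summary per article string."""
    # Imported lazily so the settings above can be inspected without the dependencies.
    import torch
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name).eval()

    summaries = []
    for start in range(0, len(articles), batch_size):
        batch = articles[start:start + batch_size]
        # Truncation to 1024 tokens is an assumption about the input limit.
        enc = tokenizer(batch, return_tensors="pt", padding=True,
                        truncation=True, max_length=1024)
        with torch.no_grad():
            out = model.generate(enc["input_ids"],
                                 attention_mask=enc["attention_mask"],
                                 **GEN_KWARGS)
        summaries.extend(tokenizer.batch_decode(out, skip_special_tokens=True))
    return summaries
```

Calling `summarize(lines_from_test_source)` and scoring the output against test.target would then give numbers to compare with the table above. Any difference in tokenization or ROUGE implementation could still shift the scores.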
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I downloaded data from here and was able to get 45.37 / 22.30 / 37.19 using the facebook/bart-large-xsum model.
Hi @swethmandava, this dataset seems to have a different train/valid/test split from the original dataset. Can you reproduce the scores with the original dataset?
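One way to check whether two copies of the data share the same test split is to compare their reference summaries. The following is a minimal sketch, assuming both copies are stored one example per line, as in test.target. The helpers `normalize`, `split_overlap`, and `read_lines` are hypothetical names introduced for illustration.

```python
def normalize(line):
    """Collapse whitespace and lowercase so trivial formatting differences are ignored."""
    return " ".join(line.split()).lower()


def split_overlap(lines_a, lines_b):
    """Return the fraction of distinct examples in A that also appear in B."""
    set_a = {normalize(line) for line in lines_a if line.strip()}
    set_b = {normalize(line) for line in lines_b if line.strip()}
    if not set_a:
        return 0.0
    return len(set_a & set_b) / len(set_a)


def read_lines(path):
    """Read a one-example-per-line file into a list of strings."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```

For example, `split_overlap(read_lines("original/test.target"), read_lines("downloaded/test.target"))` (both paths are placeholders) returns a value between 0 and 1. A value well below 1.0 would suggest the two test sets are not the same split.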