This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

Non-matching quotation marks in some dev/devtest sets #36

Open
DCSaunders opened this issue Feb 2, 2022 · 0 comments

A lot of sentences in the Flores dev and devtest sets contain two or more consecutive quotation marks.

E.g.:

grep '""' flores101_dataset/dev/*dev

The number of affected sentences seems to vary by language: eng.dev has none, while tel.dev has 72.
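To get a per-file count rather than the raw grep matches, something like the following rough sketch could be used. The function names are made up for illustration, and the `*dev` pattern simply mirrors the glob in the grep command above; it counts lines, assuming one sentence per line.

```python
from pathlib import Path


def count_doubled_quote_lines(lines):
    """Count lines that contain two or more consecutive double quotes."""
    return sum(1 for line in lines if '""' in line)


def count_in_file(path):
    """Apply the count to a UTF-8 text file (assumed: one sentence per line)."""
    with open(path, encoding="utf-8") as f:
        return count_doubled_quote_lines(f)


def report(directory, pattern="*dev"):
    """Map each matching file name to its number of affected lines."""
    return {p.name: count_in_file(p) for p in sorted(Path(directory).glob(pattern))}
```

Calling `report("flores101_dataset/dev")` would then return a dictionary keyed by file name, which makes it easy to see which languages are affected most.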

While users can clean the files themselves and the effect on evaluation is probably not too strong, it seemed worth flagging in case there is ever a dataset update.
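For anyone cleaning the files locally, one possible approach is to collapse each run of consecutive double quotes into a single one. This is only a sketch under that assumption: `collapse_quotes` is a hypothetical helper, and whether a single quote is the right replacement in every sentence has not been checked against the source texts.

```python
import re

# Matches a run of two or more consecutive double-quote characters.
_QUOTE_RUN = re.compile(r'"{2,}')


def collapse_quotes(sentence):
    """Replace each run of 2+ double quotes with one double quote.

    Assumption: the repeated quotes are artifacts, not intended content.
    """
    return _QUOTE_RUN.sub('"', sentence)
```

Lines without repeated quotes pass through unchanged, so the function can be applied to every line of a file without first filtering for affected ones.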
