Feedback from Bob Mesibov #245
Comments
Hm, https://github.com/gnames/gnparser/blob/master/testdata/test_data.md#combination-of-two-uninomials

Name: Aaleniella (Danocythere)
Canonical: Aaleniella subgen. Danocythere

Name: Cordia (Adans.) Kuntze sect. Salimori
Canonical: Cordia sect. Salimori

Name: Calathus (Lindrothius) KURNAKOV 1961
Canonical: Calathus subgen. Lindrothius

Can you add examples that show your cases?
Can you please show examples for |
Looks like I need to add "dem" as an author word: |
@dimus, sorry, I wasn't paying attention to this issue. The "Genus (Subgenus)" and "Author in Author, Year" cases I was thinking of can be found in https://github.com/gnames/gnames/files/12587991/regex_OK_gnparser_no.txt. Both forms get a quality rating of 2. Please also note that in "Eutrochatella babei (Arango y Molina, 1876)", the "y" is part of the author's surname, so the quality 2 warning "Spanish 'y' is used instead of '&'" does not apply.
Thank you @Mesibov for explanation. I do think that I am not sure what to do if Added #251 |
In case of |
I tried to address most of these problems in v1.7.5.
(1) One problem is that gnparser adds quotes when I use the TSV output option. Originals in the Naturalis Mollusca list, followed by the gnparser output:
"""Glyptothauma"" cf ankasana" | """""""Glyptothauma"""" cf ankasana"""
"""Glyptothauma"" cf. ankasana" | """""""Glyptothauma"""" cf. ankasana"""
"""Glyptothauma"" cf. ankasana de Winter, 1996" | """""""Glyptothauma"""" cf. ankasana de Winter, 1996"""
"""Glyptothauma"" sp. 2" | """""""Glyptothauma"""" sp. 2"""
"Sepietta oweniana (D""Orbigny, 1839-1841)" | """Sepietta oweniana (D""""Orbigny, 1839-1841)"""
"Sepiola atlantica D""Orbigny, 1839-1842" | """Sepiola atlantica D""""Orbigny, 1839-1842"""
"""Triphora"" osclausum Rolán & Fernández-Garcés, 1995" | """""""Triphora"""" osclausum Rolán & Fernández-Garcés, 1995"""
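The doubling in these examples is consistent with CSV-style quoting being applied a second time to a field that was already quoted. A minimal Python sketch of that hypothesis follows. It uses only the standard `csv` module and does not call gnparser, so it illustrates the pattern rather than gnparser's actual code path; the helper names `quote_field` and `unquote_field` are made up for this example.

```python
import csv
import io


def quote_field(field: str) -> str:
    """Serialize one field with CSV quoting rules (embedded quotes are doubled)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow([field])
    return buf.getvalue()


def unquote_field(raw: str) -> str:
    """Decode one CSV-quoted field back to its plain value."""
    return next(csv.reader([raw]))[0]


# Raw line as it appears in the Naturalis list (already CSV-quoted).
raw = '"""Glyptothauma"" cf ankasana"'

# Treating the raw line as a literal value and quoting it again
# gives the same string as the gnparser output reported above.
print(quote_field(raw))

# Decoding first recovers the plain name; re-quoting it round-trips to `raw`.
name = unquote_field(raw)
print(name)
print(quote_field(name) == raw)
```

If this is what happens, decoding the input as CSV before parsing (or passing the unquoted names) would avoid the extra layer of quotes.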
(2) Another issue is that "D'Orbigny" in the original is "D’Orbigny" in the gnparser output. Why change the UTF-8 byte 27 (ASCII apostrophe) to e2 80 99 (right single quotation mark)?
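For reference, the two characters and their UTF-8 bytes can be checked with a short Python sketch. The helper `to_ascii_apostrophe` is hypothetical, a possible post-processing step for anyone who needs the original byte back; it is not part of gnparser.

```python
import unicodedata

ascii_apostrophe = "'"       # U+0027, UTF-8 byte 27
curly_apostrophe = "\u2019"  # U+2019, UTF-8 bytes e2 80 99

# Print code point, Unicode name, and UTF-8 bytes for each character.
for ch in (ascii_apostrophe, curly_apostrophe):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex(" "))


def to_ascii_apostrophe(text: str) -> str:
    """Map the right single quotation mark back to the ASCII apostrophe."""
    return text.replace("\u2019", "'")


print(to_ascii_apostrophe("D\u2019Orbigny"))
```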
(3) regex says reject, gnparser says OK (regex_yes_gnparser_no file)
Please see. A lot of these end with "cf/CF" or "ms/MS".
(4) regex says OK, gnparser rejects (regex_OK_gnparser_no file)
Please see. It looks like gnparser doesn't like "Genus (Subgenus)", which I would have thought OK, and worries about "Author in Author, Year". Note also that the Dutch-persons at Naturalis have used "Von dem Busch" rather than "von dem Busch".
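To make the "Genus (Subgenus)" form concrete, here is a rough Python sketch of a pattern that accepts it. This is a hypothetical pattern written for illustration, not the regex used to produce the attached files and not gnparser's grammar. The sample names are the ones quoted from the gnparser test data earlier in this thread.

```python
import re

# Hypothetical pattern: a capitalized genus followed by a capitalized
# subgenus in parentheses. Anything after the closing parenthesis is ignored.
GENUS_SUBGENUS = re.compile(r"^(?P<genus>[A-Z][a-z]+) \((?P<subgenus>[A-Z][a-z]+)\)")


def split_subgenus(name: str):
    """Return (genus, subgenus) if the name starts with 'Genus (Subgenus)', else None."""
    m = GENUS_SUBGENUS.match(name)
    return (m["genus"], m["subgenus"]) if m else None


for name in (
    "Aaleniella (Danocythere)",
    "Calathus (Lindrothius) KURNAKOV 1961",
    "Cordia (Adans.) Kuntze sect. Salimori",
):
    print(name, "->", split_subgenus(name))
```

The third name is rejected only because the abbreviated author "Adans." contains a period. An unabbreviated author name in parentheses would look identical to a subgenus to this pattern, which may be part of why a parser treats the form cautiously.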
regex_OK_gnparser_no.txt
regex_yes_gnparser_OK.txt