Authors are interpreted as subgenera #265

Open
KatjaSchulz opened this issue Jun 19, 2024 · 6 comments

Comments

@KatjaSchulz

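These are all valid/accepted names from the current version of the Catalogue of Life. In each case, gnparser's simple and full canonical forms treat the author in parentheses as a subgenus: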

Plant genera
Nassella (Trin.) É.Desv. – simple: Trin. – full: Nassella subgen. Trin.
Dacrycarpus (Endl.) de Laub. – simple: Endl. – full: Dacrycarpus subgen. Endl.
Lysiphyllum (Benth.) de Wit – simple: Benth. – full: Lysiphyllum subgen. Benth.
Tricholemma (Röser) Röser – simple: Roeser – full: Tricholemma subgen. Roeser
Isogonium (Kützing) de Bary – simple: Kuetzing – full: Isogonium subgen. Kuetzing
Euptilota (Kützing) Kützing, 1849 – simple: Kuetzing – full: Euptilota subgen. Kuetzing
Setiechinopsis (Backeb.) de Haas – simple: Backeb. – full: Setiechinopsis subgen. Backeb.

Chromista genera
Cyclotella (Kützing) de Brebisson – simple: Kuetzing – full: Cyclotella subgen. Kuetzing
Tabularia (Kützing) Williams & Round – simple: Kuetzing – full: Tabularia subgen. Kuetzing
Cyrtolophosis (Schew.) – simple: Schew. – full: Cyrtolophosis subgen. Schew.
Pyrocystis (Schütt) Lemmermann, 1899 – simple: Schuett – full: Pyrocystis subgen. Schuett

Chromista families
Anaulaceae (Schütt) Lemmermann – simple: Schuett – full: Anaulaceae subgen. Schuett
Triceratiaceae (Schütt) Lemmermann – simple: Schuett – full: Triceratiaceae subgen. Schuett
Pyxillaceae (Schütt) Simonsen – simple: Schuett – full: Pyxillaceae subgen. Schuett
Pyrocystaceae (Schütt) Lemmermann, 1899 – simple: Schuett – full: Pyrocystaceae subgen. Schuett
Aulacodiscaceae (Schütt) Lemmermann – simple: Schuett – full: Aulacodiscaceae subgen. Schuett
Stictodiscaceae (Schütt) Simonsen – simple: Schuett – full: Stictodiscaceae subgen. Schuett
Lauderiaceae (Schütt) Lemmermann – simple: Schuett – full: Lauderiaceae subgen. Schuett

Protozoa family
Cyrtolophosidiidae (Schew.) – simple: Schew. – full: Cyrtolophosidiidae subgen. Schew.

@dimus
Member

dimus commented Jul 30, 2024

Thank you, @KatjaSchulz, for catching this. I am not sure yet how to fix it, because in many cases Aus (Bus) does mean Aus subgen. Bus.

Do names like this occur specifically among botanical names?

@KatjaSchulz
Author

Yes, this is a tricky one. All the examples I found were taxa under the botanical code, except for the Cyrtolophosidiidae (Schew.) example which is a really weird one that has since been removed from COL.

One approach to fix this could be a blacklist of strings that can never be interpreted as subgenus names. I think it's pretty safe to put the author strings above on that list. But after digging some more, I also found this name: Sigmoidotropis (Piper) A.Delgado. I don't think there are any subgenera named Piper, but I don't know if I would be comfortable putting that name on the blacklist.
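
A minimal sketch of the blacklist idea, written in Python and independent of gnparser itself: it post-processes a full canonical form like the ones listed above and flags the result when the word after `subgen.` is a known author string. The function name `flag_author_as_subgenus` and the contents of `AUTHOR_BLACKLIST` are hypothetical and chosen only for illustration.

```python
# Hypothetical blacklist: strings that should never be read as subgenus names.
# The entries are author strings taken from the examples in this thread.
AUTHOR_BLACKLIST = {
    "Trin.", "Endl.", "Benth.", "Roeser",
    "Kuetzing", "Backeb.", "Schew.", "Schuett",
}


def flag_author_as_subgenus(full_canonical: str) -> bool:
    """Return True if the text after 'subgen.' is a blacklisted author string."""
    marker = " subgen. "
    if marker not in full_canonical:
        return False
    subgenus = full_canonical.split(marker, 1)[1].strip()
    return subgenus in AUTHOR_BLACKLIST
```

An ambiguous string such as `Piper` would stay off the list, so `Sigmoidotropis subgen. Piper` is not flagged by this check and would still need manual review.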

Another approach would be to add processing of rank information to gnparser. I usually have that information for most names I am trying to parse, and I use it to double-check the gnparser results. I realize that would probably be quite a bit of work to implement.
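
The rank double-check can be sketched as follows, assuming each record carries a rank string from the source data set alongside the full canonical form returned by the parser. The record layout and the function name `conflicts_with_rank` are assumptions for illustration, not part of gnparser.

```python
def conflicts_with_rank(source_rank: str, full_canonical: str) -> bool:
    """Return True when the source rank is genus or family but the parse contains a subgenus."""
    # Pad with spaces so the marker is matched as a whole word.
    parsed_as_subgenus = " subgen. " in f" {full_canonical} "
    return source_rank.lower() in {"genus", "family"} and parsed_as_subgenus
```

A record with rank `genus` and full canonical `Lysiphyllum subgen. Benth.` would be reported as a conflict, while a genuine subgenus record would pass.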

Anyway, here are a few more names I found in the COL 2024 annual archive:

Plant genera:

Hexaphylla (Klokov) P.Caputo & Del Guacchio – simple: Klokov – full: Hexaphylla subgen. Klokov
Parogonum (Haraldson) Desjardins & J. P. Bailey – simple: Haraldson – full: Parogonum subgen. Haraldson
Ericetorum (Jermy) Li Bing Zhang & X. M. Zhou – simple: Jermy – full: Ericetorum subgen. Jermy
Archidasyphyllum (Cabrera) P. L. Ferreira, Saavedra & Groppo – simple: Cabrera – full: Archidasyphyllum subgen. Cabrera
Lamyropsis (Kharadze) Dittrich – simple: Kharadze – full: Lamyropsis subgen. Kharadze
Sigmoidotropis (Piper) A.Delgado – simple: Piper – full: Sigmoidotropis subgen. Piper
Moquiniastrum (Cabrera) G. Sancho – simple: Cabrera – full: Moquiniastrum subgen. Cabrera

Chromista genera:

Hormosira (Endlichter) Meneghini, 1838 – simple: Endlichter – full: Hormosira subgen. Endlichter
Syracolithus (Kamptner) Deflandre in Grassé, 1952 – simple: Kamptner – full: Syracolithus subgen. Kamptner

@dimus
Member

dimus commented Jul 31, 2024

I do have a list of botanical genus authors (https://github.com/gnames/gnparser/blob/master/io/dict/data/genera_auth_icn.txt). If an entry is not ambiguous, I treat author-matching text in parentheses after the genus as authorship in binomials and trinomials. I can extend this rule to uninomials as well.

This is pretty close to your suggestion, @KatjaSchulz, as I understood it.

@dimus
Member

dimus commented Oct 24, 2024

@KatjaSchulz, would the implementation of #267 help with your use case? If all names are botanical, there would be no ambiguity in parsing such names.

@KatjaSchulz
Author

Yes, I think so. Since I am usually running comprehensive data sets through gnparser, it would be a little bit more work to separate names by code, but it would be feasible. There may be lingering problems with some microorganisms, but I think those would be negligible. Thanks!
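
Separating a mixed data set by code could look roughly like the sketch below. The kingdom-to-code mapping is an assumption made for illustration; the code values are ones from the supported list of the `code` option given later in this thread. Kingdoms that mix codes, such as Chromista and Protozoa, are deliberately left unmapped and end up under `None`, to be parsed without a code.

```python
from collections import defaultdict

# Assumed mapping from kingdom to nomenclatural code (illustrative only).
KINGDOM_TO_CODE = {
    "Plantae": "botanical",
    "Fungi": "botanical",
    "Animalia": "zoological",
    "Bacteria": "bacterial",
}


def group_by_code(records):
    """Group (kingdom, name) pairs by code; unmapped kingdoms go under None."""
    groups = defaultdict(list)
    for kingdom, name in records:
        groups[KINGDOM_TO_CODE.get(kingdom)].append(name)
    return dict(groups)
```

Each group can then be sent through the parser with the matching code setting.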

dimus added a commit that referenced this issue Nov 7, 2024
To help with ambiguous cases, for example when it is not
clear whether `Aus (Bus) cus` has a genus author `Bus` (bot.)
or `Bus` is a subgenus of `Aus` (zool.). It also
deprecates the cultivar flag, which becomes another option for
the code flag. The cultivar flag will be kept for
backward compatibility.
dimus added a commit that referenced this issue Nov 8, 2024
dimus added a commit that referenced this issue Nov 8, 2024
@dimus dimus closed this as completed in afaed33 Nov 11, 2024
@dimus
Member

dimus commented Nov 11, 2024

Oops, I did not mean to close this one; reopening.

Some plant names are now recognized, some still have problems, and Chromista authors are not recognized yet.

There is a new option, `code`. It forces names to be parsed under the rules of a given nomenclatural code, for example ICN:

https://parser.globalnames.org/api/Hormosira%20(Endlichter)%20Meneghini,%201838?code=bot

https://parser.globalnames.org/?code=botanical&format=html&names=Syracolithus+%28Kamptner%29+Deflandre+in+Grass%C3%A9%2C+1952&with_details=on

Supported values: bact, bacterial, ICNP, bot, botanical, ICN, cult, cultivar, ICNCP, zoo, zoological, ICZN.
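
For batch work, a request URL in the same shape as the first example above can be assembled with a small helper. This is only a sketch: the helper name `parse_url` is hypothetical, the URL pattern is copied from the example rather than from API documentation, and exact-match validation of the code value is an assumption (the service may accept other spellings).

```python
from urllib.parse import quote

# Values copied from the supported list above.
SUPPORTED_CODES = {
    "bact", "bacterial", "ICNP",
    "bot", "botanical", "ICN",
    "cult", "cultivar", "ICNCP",
    "zoo", "zoological", "ICZN",
}


def parse_url(name: str, code: str) -> str:
    """Build a parser URL for one name, forcing the given nomenclatural code."""
    if code not in SUPPORTED_CODES:
        raise ValueError(f"unsupported code: {code!r}")
    # Keep parentheses and commas unescaped, as in the example URL.
    encoded = quote(name, safe="(),")
    return f"https://parser.globalnames.org/api/{encoded}?code={code}"
```

Calling `parse_url("Hormosira (Endlichter) Meneghini, 1838", "bot")` reproduces the first example URL.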

@dimus dimus reopened this Nov 11, 2024