Why ’ is converted to '?
#247
Comments
I do try to normalize/simplify characters if it does not change semantic meaning. My impression is that The original spelling of the authorship is preserved in JSON format in the verbatim field:
"authorship": {
  "verbatim": "B.D’Orbigny",
  "normalized": "B. D' Orbigny",
  "authors": [
    "B. D' Orbigny"
  ],
  "originalAuth": {
    "authors": [
      "B. D' Orbigny"
    ]
  }
},
It might make sense to keep the verbatim authorship in the csv/tsv output; let me think about it a bit.
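For readers following along, here is a minimal Python sketch of the idea described above: replace the typographic apostrophe with the ASCII one while keeping the input untouched in a separate field. This is not gnparser's actual code; the function name `normalize_authorship` and the mapping table are hypothetical, and the sketch only swaps the apostrophe (it does not reproduce the spacing changes visible in the `normalized` value above).

```python
# Hypothetical illustration only; gnparser itself may do this differently.
# Map the typographic apostrophe (U+2019) to the ASCII apostrophe (U+0027).
APOSTROPHE_MAP = str.maketrans({"\u2019": "'"})


def normalize_authorship(verbatim: str) -> dict:
    """Return both the untouched input and a simplified copy of it."""
    normalized = verbatim.translate(APOSTROPHE_MAP)
    return {"verbatim": verbatim, "normalized": normalized}


result = normalize_authorship("B.D\u2019Orbigny")
print(result["verbatim"])    # still contains U+2019
print(result["normalized"])  # contains U+0027 instead
```

Keeping both fields means a consumer can match on the simplified form and still show the spelling the dataset compilers used.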
@dimus, I've rechecked the original dataset and found that the compilers used both characters: gnparser converted both to an apostrophe in Author, which is OK. I was looking at "D’Orbigny" in the verbatim field and thinking I had entered "D'Orbigny", so my mistake, all is well. In my pseudo-duplicate search the results are fine: Acteocina candei (D’Orbigny, 1841) [3]
From #245
Another issue is that "D'Orbigny" in the original is "D’Orbigny" in the gnparser output. Why change UTF-8 27 to e2 80 99?
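The byte values quoted in the question can be checked with a short Python snippet. It is only a sketch for inspecting the two characters; the variable names are illustrative and nothing here comes from gnparser.

```python
# The two characters discussed in this issue.
ascii_apostrophe = "'"             # U+0027 APOSTROPHE
typographic_apostrophe = "\u2019"  # U+2019 RIGHT SINGLE QUOTATION MARK

# Encode each to UTF-8 and show the bytes as space-separated hex.
ascii_hex = ascii_apostrophe.encode("utf-8").hex(" ")
typographic_hex = typographic_apostrophe.encode("utf-8").hex(" ")

print(ascii_hex)        # 27
print(typographic_hex)  # e2 80 99
```

Running the same `.encode("utf-8").hex(" ")` call on a string copied from the dataset shows which of the two characters it actually contains.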