Bioschemas JSON-LD not properly encoded/escaped #316

Closed
sneumann opened this issue Nov 2, 2021 · 4 comments
@sneumann
Member

sneumann commented Nov 2, 2021

Hi, we got a report from @AlasdairGray that:

In preparation for the BioHackathon next week, we have been harvesting data from as many sites as possible. Whilst harvesting the pages from MassBank, we found 10,326 pages with invalid JSON-LD on them. From the page that I inspected, this was due to the use of quotation marks within a text field with the quotation mark not being properly encoded. For example, you can see the error at the following link to the Schema.org syntax validator
https://validator.schema.org/#url=https%3A%2F%2Fmassbank.eu%2FMassBank%2FRecordDisplay%3Fid%3DMSJ00172

A fix probably requires proper encoding of strings in

Yours, Steffen

@sneumann
Member Author

sneumann commented Nov 2, 2021

A light-weight choice could be
https://stackoverflow.com/a/22756976/2974851
but note comment on slashes (we have both URLs and InChIs containing slashes ...)
https://stackoverflow.com/questions/3020094/how-should-i-escape-strings-in-json/11610833#comment86039244_22756976
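As a rough sketch of the alternative to hand-rolled escaping: letting a JSON library serialise the whole object takes care of quotation marks, and forward slashes in URLs and InChIs can stay unescaped, which is still valid JSON. Python is used only for brevity and this is not MassBank code; the `to_jsonld` helper, the property names and the record values are hypothetical.

```python
import json


def to_jsonld(record: dict) -> str:
    """Serialise a record with the standard library instead of string templating."""
    # json.dumps escapes '"', '\' and control characters; '/' is left as-is.
    return json.dumps(record, ensure_ascii=False)


# Hypothetical values, chosen to contain quotation marks and slashes.
record = {
    "name": 'compound with "quoted" text',
    "url": "https://massbank.eu/MassBank/RecordDisplay?id=MSJ00172",
    "inChI": "InChI=1S/CH4/h1H4",
}

encoded = to_jsonld(record)
decoded = json.loads(encoded)  # parses back without a syntax error
```

If the output is embedded in an HTML script element, text such as `</script>` inside a value would need separate handling; that is outside what this sketch covers.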

@tsufz
Member

tsufz commented Nov 3, 2021

Ah, good to know. I checked the crawlers, and they also complain:

Parsing error: Missing ',' or '}'

Example:
image

@tsufz
Member

tsufz commented Nov 3, 2021

The second error is a bad escape sequence in the SMILES string:

image
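A small reproduction of this kind of error, as a sketch: SMILES uses a backslash for bond stereochemistry, and a backslash copied verbatim into a JSON string is read as the start of an escape sequence. The SMILES below is a hypothetical stand-in, not the one from the screenshot, and Python is used only for illustration.

```python
import json

# The SMILES text F/C=C\F, which contains one literal backslash.
smiles = "F/C=C\\F"

# Naive string templating copies the backslash verbatim into the JSON text.
naive = '{"smiles": "' + smiles + '"}'
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False  # "\F" is not a legal JSON escape sequence

# A serialiser doubles the backslash, so the value round-trips unchanged.
safe = json.dumps({"smiles": smiles})
roundtrip = json.loads(safe)["smiles"]
```

Here `naive_ok` ends up `False`, while `roundtrip` equals the original SMILES string.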

@sneumann
Member Author

sneumann commented Dec 2, 2022

With the proper serialisation this can now be closed. Thanks Rene, yours, Steffen

@sneumann sneumann closed this as completed Dec 2, 2022