Hi!
I have a resource that may be helpful: https://fatcat.wiki
This is a catalog of papers, with metadata imported daily from Crossref and DataCite (and most of PubMed and arXiv), and with the papers themselves crawled into the Internet Archive web corpus.
PDFs can be downloaded directly from the web links or from the Wayback Machine. If you know of alternative sources (URLs) for PDFs, you can submit them via "save paper now". The entire project is free software (open source), there is a documented API, users can register and edit metadata immediately, bulk data is available, etc. On the back-end I have processed copies of all PDFs in XML form via GROBID, which may be useful for text and data mining.
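To give a rough idea of what mining the GROBID output could look like, here is a minimal sketch in Python. It assumes the processed files are standard GROBID TEI XML (namespace `http://www.tei-c.org/ns/1.0`); the sample document and the `extract_summary` helper are made up for illustration, and real files may be structured differently.

```python
import xml.etree.ElementTree as ET

# GROBID emits TEI XML; every element lives in this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, hand-written sample in the general shape of GROBID output.
# It is NOT taken from Fatcat; real files carry far more detail.
SAMPLE_TEI = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a" type="main">An Example Paper</title>
      </titleStmt>
    </fileDesc>
    <profileDesc>
      <abstract>
        <div><p>First abstract paragraph.</p><p>Second abstract paragraph.</p></div>
      </abstract>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <div><head>Introduction</head><p>Body text goes here.</p></div>
    </body>
  </text>
</TEI>
"""


def _clean(element):
    """Join all text under an element and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def extract_summary(tei_xml):
    """Pull the title, abstract paragraphs, and section headings from TEI XML.

    Hypothetical helper for illustration; returns a plain dict.
    """
    root = ET.fromstring(tei_xml)
    title_el = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    return {
        "title": _clean(title_el) if title_el is not None else None,
        "abstract": [_clean(p) for p in root.findall(".//tei:abstract//tei:p", TEI_NS)],
        "headings": [_clean(h) for h in root.findall(".//tei:body//tei:head", TEI_NS)],
    }


if __name__ == "__main__":
    print(extract_summary(SAMPLE_TEI))
```

Only the standard library is used here, so the same function can be mapped over a directory of TEI files without extra dependencies.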