
resource: fatcat.wiki #11

Open
bnewbold opened this issue Mar 16, 2020 · 0 comments


Hi!
I have a resource that may be helpful: https://fatcat.wiki
This is a catalog of papers. Metadata is imported daily from Crossref and DataCite (plus most of PubMed and arXiv), and papers are crawled into the Internet Archive web corpus.
PDFs can be downloaded directly from the web links or from the Wayback Machine. If you know of alternative sources (URLs) for PDFs, you can submit them via "save paper now". The entire project is free software (open source). There is a documented API, users can register and edit metadata immediately, bulk data is available, etc. On the back end, I have copies of all PDFs processed into XML form via GROBID, which may be useful for text and data mining.
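
As a rough illustration of the API and Wayback Machine access mentioned above, here is a minimal Python sketch that only builds the request URLs and makes no network calls. The API base URL, the `release/lookup` path, and the `doi` and `expand` parameters are assumptions on my part and should be checked against the fatcat API documentation; the helper names and the example DOI are hypothetical.

```python
from urllib.parse import urlencode

# Assumed API base URL; verify against the fatcat API documentation.
FATCAT_API = "https://api.fatcat.wiki/v0"


def release_lookup_url(doi: str) -> str:
    """Build a URL that looks up a paper ("release") by DOI.

    The endpoint path and the `expand=files` parameter are assumptions.
    """
    query = urlencode({"doi": doi, "expand": "files"})
    return f"{FATCAT_API}/release/lookup?{query}"


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build a Wayback Machine replay URL for a captured page or PDF.

    `timestamp` is a capture time in YYYYMMDDhhmmss form.
    """
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


if __name__ == "__main__":
    # Placeholder DOI and URL, for illustration only.
    print(release_lookup_url("10.1000/xyz123"))
    print(wayback_url("https://example.org/paper.pdf", "20200316000000"))
```

The resulting URLs can be fetched with any HTTP client. If the assumptions hold, the lookup response would be JSON describing the paper along with its known file locations.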
