A curated list of biomedical knowledge graphs and of resources for their construction.
This repository is inspired by awesome lists and follows the style guide of the awesome manifesto.
The goal of this repository is to provide an overview of knowledge graphs in the domain of biomedicine and of resources for their construction. This is achieved in four complementary ways:
- A survey presents a broad overview of academic and commercial projects that provide biomedical knowledge graphs or associated resources.
- A curated list contains an opinionated selection of some of these projects that I find worth highlighting for various reasons.
- A collection of notebooks delivers an in-depth inspection of a handful of projects by performing an exploratory data analysis on their knowledge graphs.
- A Python package named "kgw" provides workflows for downloading selected biomedical knowledge graphs from different web data repositories, converting them into desired target formats (SQLite, SQL, CSV, JSONL, GraphML, MeTTa) and analyzing their contents (statistics, schema visualizations).
I hope this work serves you well! If you have a suggestion, notice an error, or just want to drop a message, please don't hesitate to contact me. Direct contributions via a pull request are also highly welcome.
A PDF report and accompanying website present a comprehensive overview of available biomedical knowledge graphs and of resources for their construction.
A curated list presents and characterizes a carefully selected subset of the survey's entries in the style of an awesome list.
The following Jupyter notebooks provide detailed inspections of five projects, with previews of their knowledge graph schemata:
The Python 3 package kgw and its documentation enable simple retrieval and conversion of several biomedical knowledge graphs. It is a clean reimplementation of functionality that was explored here previously in Jupyter notebooks. For example, instead of using a CSV file as intermediate format, a file-based SQLite database was chosen for faster and more flexible querying of the knowledge graph contents. This simplifies all downstream conversions and analyses. In future, a greater variety of knowledge graphs may be covered: each new graph only needs an extraction function to fetch it from its web data repository and a transformation function to convert it into the shared SQLite representation. From there on, all existing conversion functions can be reused immediately. Contributions to the package are welcome and encouraged!
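The extract, transform, convert design described above can be illustrated with a minimal sketch using only the Python standard library. This is not the kgw API: the function names (`extract_toy_graph`, `transform_to_sqlite`, `edges_to_csv`, `edges_to_jsonl`), the two-table schema and the toy triples are all assumptions made for illustration, and an in-memory database stands in for the file-based one.

```python
import csv
import io
import json
import sqlite3


def extract_toy_graph():
    """Hypothetical extraction step: a real one would download from a web repository."""
    return [
        ("gene:A", "associated_with", "disease:X"),
        ("drug:B", "treats", "disease:X"),
    ]


def transform_to_sqlite(triples, conn):
    """Hypothetical transformation step: load triples into a shared schema (assumed)."""
    conn.execute("CREATE TABLE IF NOT EXISTS nodes (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS edges (source TEXT, predicate TEXT, target TEXT)"
    )
    for subj, pred, obj in triples:
        # Each node is stored once, even if it appears in several edges.
        conn.executemany(
            "INSERT OR IGNORE INTO nodes (id) VALUES (?)", [(subj,), (obj,)]
        )
        conn.execute("INSERT INTO edges VALUES (?, ?, ?)", (subj, pred, obj))
    conn.commit()


def edges_to_csv(conn):
    """Generic converter: works for any graph stored in the shared schema."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source", "predicate", "target"])
    writer.writerows(
        conn.execute("SELECT source, predicate, target FROM edges ORDER BY rowid")
    )
    return buf.getvalue()


def edges_to_jsonl(conn):
    """Generic converter: one JSON object per edge, one edge per line."""
    rows = conn.execute("SELECT source, predicate, target FROM edges ORDER BY rowid")
    return "\n".join(
        json.dumps({"source": s, "predicate": p, "target": t}) for s, p, t in rows
    )


# In-memory database for the sketch; a file path would give a file-based database.
conn = sqlite3.connect(":memory:")
transform_to_sqlite(extract_toy_graph(), conn)
csv_text = edges_to_csv(conn)
jsonl_text = edges_to_jsonl(conn)
```

In this sketch, supporting another knowledge graph would mean writing only a new extraction function and, if its raw format differs, a new transformation function. The two converters read from the shared tables and need no changes.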