Commit

Permalink
docs: updated missing docs
CS76 committed Jul 10, 2023
1 parent 98d88f0 commit d4e7d28
Showing 5 changed files with 25 additions and 6 deletions.
18 changes: 17 additions & 1 deletion docs/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,20 @@ outline: deep

# Architecture

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Cheminformatics toolkits are built on different programming languages: RDKit on C++ and Python, CDK on Java, and OpenBabel on C++, to name a few. Each toolkit therefore requires its own environment setup before it can be integrated. Packaging any or all of these toolkits behind a unified API would make them easier to use and to integrate into existing workflows and applications. Bundling the toolkits directly with application code, by contrast, invites compatibility issues and can become a maintenance burden. To address these issues, we turned to two common software development techniques: containerization and microservices.

<p align="center">
<img align="center" src="/architecture.png" alt="Logo" width="90%">
</p>

Microservices, also known as microservice architecture, is a software development approach in which an application is built as a collection of small services that can be deployed, scaled, and maintained independently. Each microservice performs a specific business function and communicates with other microservices through well-defined APIs. Containers are lightweight, isolated environments that package applications together with their dependencies, giving them a consistent and reproducible execution environment across development, testing, and production systems. Docker is a leading platform for containerization, providing a comprehensive set of tools and services for creating and managing containers. CPM is containerized using Docker and is distributed publicly via Docker Hub, a cloud-based registry provided by Docker that allows developers to store, share, and distribute Docker images.

REST (Representational State Transfer) APIs are widely used in application development because they are simple to use, scale well, and perform well. They are also platform- and language-independent, flexible, extensible, and compatible with a wide range of integrations. We chose FastAPI, a modern, high-performance web framework for building APIs with Python, which makes it possible to create robust and scalable APIs quickly. Our REST API is built on the OpenAPI Specification 3.1.0. OpenAPI, formerly known as Swagger, is an open standard for defining, documenting, and designing RESTful APIs; it describes the endpoints, request/response payloads, authentication methods, and other details of an API in a machine-readable format. Building on it provides standard documentation, promotes interoperability, enables code generation, simplifies validation, and integrates with various tools and libraries.
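To make the "machine-readable format" concrete, the following is a minimal sketch of what an OpenAPI 3.1.0 description looks like, written as a plain Python dictionary. The endpoint path `/convert/inchi`, the `smiles` query parameter, and the title are assumptions chosen for illustration and are not taken from the CPM API; in practice FastAPI generates a description of this kind from the route definitions.

```python
import json


def build_openapi_document(title: str, version: str) -> dict:
    """Return a minimal OpenAPI 3.1.0 description with a single GET endpoint.

    The path and parameter below are hypothetical examples, not CPM endpoints.
    """
    return {
        "openapi": "3.1.0",
        "info": {"title": title, "version": version},
        "paths": {
            "/convert/inchi": {  # hypothetical endpoint
                "get": {
                    "summary": "Convert a SMILES string to InChI",
                    "parameters": [
                        {
                            "name": "smiles",  # hypothetical parameter
                            "in": "query",
                            "required": True,
                            "schema": {"type": "string"},
                        }
                    ],
                    "responses": {
                        "200": {
                            "description": "InChI string for the input structure",
                            "content": {
                                "text/plain": {"schema": {"type": "string"}}
                            },
                        }
                    },
                }
            }
        },
    }


document = build_openapi_document("Example cheminformatics API", "1.0.0")
print(json.dumps(document, indent=2))
```

Because the description is ordinary structured data, client generators, validators, and documentation tools can all consume the same file.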

The Cheminformatics Micro Service project uses the containerized microservices approach to package chemistry toolkits and state-of-the-art deep learning tools, providing functionality such as format conversion, OCSR, and chemical data standardisation through a standard REST API. It comes pre-packaged with the toolkits RDKit, CDK, and OpenBabel and with the deep learning tools DECIMER and STOUT for handling chemical data: OCSR, format conversions, and descriptor calculation. This enables efficient handling of large data volumes and supports the development of cheminformatics applications that are scalable and interoperable.

Combining FastAPI with Docker will also simplify the deployment process, making it easier to distribute and run your API in various environments. Moreover, the microservice architecture can help improve the maintainability and flexibility of cheminformatics applications. Changes to one microservice can be made without affecting the other services, which reduces the risk of introducing bugs or errors. It also allows developers to modify or update individual services without having to rewrite the entire application.

It's important to note that the cheminformatics toolkits distributed with CPM are all packaged in a single container. The usual practice is that a container should do one thing, which would mean packaging every cheminformatics toolkit as a separate microservice. We consciously chose to go against that practice to avoid the unnecessary complexity of orchestrating multiple containers. Because the containers are stateless, they can still be scaled horizontally as needed.
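The single-container design can be pictured as one stateless handler that dispatches to whichever toolkit a request names. The sketch below is only an illustration under that assumption: the backend functions are placeholders that return tagged strings, not calls into RDKit, CDK, or OpenBabel, and `handle_request` is a hypothetical name rather than part of the CPM code.

```python
from typing import Callable, Dict


# Placeholder backends. In a real service each would call into the toolkit
# installed in the same container; here they only tag the input.
def _rdkit_stub(smiles: str) -> str:
    return f"rdkit:{smiles}"


def _cdk_stub(smiles: str) -> str:
    return f"cdk:{smiles}"


def _openbabel_stub(smiles: str) -> str:
    return f"openbabel:{smiles}"


# One registry inside one process: no cross-container calls are required.
TOOLKITS: Dict[str, Callable[[str], str]] = {
    "rdkit": _rdkit_stub,
    "cdk": _cdk_stub,
    "openbabel": _openbabel_stub,
}


def handle_request(toolkit: str, smiles: str) -> str:
    """Stateless handler: the result depends only on the request arguments."""
    try:
        backend = TOOLKITS[toolkit.lower()]
    except KeyError:
        raise ValueError(f"unknown toolkit: {toolkit!r}") from None
    return backend(smiles)


print(handle_request("RDKit", "CCO"))
```

Since the handler keeps no state between calls, any number of identical container replicas can sit behind a load balancer and serve the same requests.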

The CPM Dockerfile, a docker-compose YAML file, and other deployment scripts are available in the GitHub repository, so anyone can orchestrate their own deployment and manage multiple Docker containers as a single unit. Helm charts are also available for deploying the CPM Docker container and its dependencies to a Kubernetes cluster. Two popular open-source tools, Prometheus and Grafana, are implemented for logging, monitoring, and visualizing usage statistics in a standalone or distributed system. Prometheus is a monitoring and alerting tool that collects and stores time-series metrics from various targets in real time; its flexible query language and data model allow you to aggregate, analyze, and alert on metrics data. Grafana is a data visualization tool that works with Prometheus and other data sources and provides dynamic, customizable dashboards that display metrics in real time.
5 changes: 2 additions & 3 deletions docs/introduction.md
Expand Up @@ -8,12 +8,11 @@ outline: deep

<br/>

# Cheminformatics Python Microservice

<div style="text-align: justify;">

This collection of vital and invaluable microservices is specifically crafted for cheminformatics support, accessible through API calls. Primarily optimized for SMILES-based inputs, these microservices facilitate tasks such as translating between various machine-readable representations, obtaining Natural Product (NP) likeliness scores, visualizing chemical structures, and generating descriptors. Additionally, within this microservice suite, there is an instance of [STOUT](https://github.com/Kohulan/Smiles-TO-iUpac-Translator) and another instance of [DECIMER](https://github.com/Kohulan/DECIMER-Image_Transformer), both of which are deep learning models utilized for IUPAC name generation and Optical Chemical Structure Recognition[(OCSR)](https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00465-0), respectively.
The Cheminformatics Python Microservice offers a collection of versatile functions accessible via REST endpoints that can handle chemical data and perform various cheminformatics tasks. These tasks include generating chemical structure depictions, 3D conformers, descriptors, IUPAC names, and converting machine-readable formats. Researchers and developers can effectively access open-source cheminformatics toolkits through these microservices and extend them easily based on their use case.

The microservices are packaged as a Docker image (container), which enables straightforward deployment and scalability and makes them suitable for both academic research and industry applications. Because of their modular nature, these microservices can be customized and combined to meet various needs in cheminformatics research and chemical data analysis.
</div>

<div style="text-align: justify;">
2 changes: 1 addition & 1 deletion docs/public-api.md
Expand Up @@ -4,4 +4,4 @@ outline: deep

# API - Public Instance

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
A public CPM server is deployed so that anyone can integrate its functionality into their tools and workflows. The server is accessible at https://api.naturalproducts.net and hosts the latest release of CPM, deployed via GitHub Actions. Bugs and feature requests are tracked through the issue tracker on GitHub.
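As a rough sketch of how a client might address the public instance, the helper below builds a request URL from the base address given above. Only the base URL comes from this page; the path `/hypothetical/endpoint` and the `smiles` parameter are placeholders, so consult the server's OpenAPI documentation for the real routes. No network request is made here.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://api.naturalproducts.net"  # public instance named above


def build_request_url(path: str, **params: str) -> str:
    """Join the base URL, a path, and URL-encoded query parameters."""
    url = urljoin(BASE_URL + "/", path.lstrip("/"))
    query = urlencode(params)  # percent-encodes characters such as '=' and '#'
    return f"{url}?{query}" if query else url


# Hypothetical path and parameter name, for illustration only.
url = build_request_url("/hypothetical/endpoint", smiles="CCO")
print(url)
```

Encoding the query string matters for SMILES input in particular, because characters such as `=`, `#`, and `+` have special meaning in URLs. The resulting URL could then be passed to `urllib.request.urlopen` or any other HTTP client.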
Binary file added docs/public/architecture.png
6 changes: 5 additions & 1 deletion docs/versions.md
Expand Up @@ -6,10 +6,14 @@ outline: deep

<div style="text-align: justify;">

The Cheminformatics Python Microservices framework is developed using Python and [FastAPI](https://fastapi.tiangolo.com/), leveraging the [RDKit](https://www.rdkit.org/) and [Chemistry Development Kit (CDK)](https://cdk.github.io/) libraries in the backend. CDK is accessed through [jpype](https://jpype.readthedocs.io/en/latest/index.html), allowing the implementation of functions that can be seamlessly utilized within the Python environment. For detailed information about the specific versions employed, please refer to the list provided below.
To ensure that results can be reproduced accurately, a robust tool versioning system is essential. Best practices for research data management recommend documenting the versions of the software and its components. This is especially important for tools like CPM, which involve multiple dependencies. CPM uses multi-level versioning to document the API and the underlying software dependencies, so that users see a small number of meaningful version numbers rather than a confusing list.

The CPM codebase is released twice a year (biannual release cycle), and each release documents the underlying toolkits, tools, and environment dependencies. The REST API has its own release cycle that runs parallel to the software release cycle, so changes and enhancements can be introduced without breaking existing client applications. However, REST API releases are based only on changes in REST communication; the API version is not updated if the REST endpoints do not change. In principle, researchers can therefore update the underlying cheminformatics toolkits as new releases become available without updating their own code bases, since the REST API remains the same.
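The relationship between the two version levels can be sketched as a small data structure. This is an illustrative model only: `ReleaseManifest`, `client_needs_update`, and all version strings below are hypothetical placeholders, not CPM's actual release metadata.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ReleaseManifest:
    """One release: a software version, an API version, and toolkit versions."""

    software_version: str
    api_version: str
    toolkits: Dict[str, str] = field(default_factory=dict)


def client_needs_update(old: ReleaseManifest, new: ReleaseManifest) -> bool:
    """Clients depend only on the REST API, so only its version matters to them."""
    return old.api_version != new.api_version


# Placeholder version strings, for illustration only.
spring = ReleaseManifest("1.0.0", "v1", {"toolkit-a": "A.1", "toolkit-b": "B.1"})
autumn = ReleaseManifest("1.1.0", "v1", {"toolkit-a": "A.2", "toolkit-b": "B.1"})

print(client_needs_update(spring, autumn))
```

In this example a toolkit is upgraded and the software version moves, but the API version stays the same, so existing clients continue to work unchanged.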

The Cheminformatics Python Microservices framework is developed using Python and [FastAPI](https://fastapi.tiangolo.com/), leveraging the [RDKit](https://www.rdkit.org/), [OpenBabel](http://openbabel.org/wiki/Main_Page) and [Chemistry Development Kit (CDK)](https://cdk.github.io/) libraries in the backend. CDK is accessed through [jpype](https://jpype.readthedocs.io/en/latest/index.html), allowing the implementation of functions that can be seamlessly utilized within the Python environment. For detailed information about the specific versions employed, please refer to the list provided below.
</div>

<br/>
<p align="center">
<b> Cheminformatics Python Microservice: V1.0.0 </b>
</p><br/>
