Feat/talk2scholars docker update #118

ansh-info · 2025-02-28T23:42:36Z

For authors

Both built and tested locally

Description

This pull request introduces Docker support for the talk2scholars module, aligning it with the existing talk2biomodels setup. The changes include:

Added a Dockerfile for talk2scholars
- Uses python:3.12-slim as the base image
- Installs required dependencies, including g++ and build-essential for compiling pcst_fast
- Exposes port 8501 for Streamlit
- Passes API keys as environment variables at runtime instead of storing them in the Dockerfile, which prevents key exposure

Additional features requested by @gurdeep330 (see the conversation thread below)

Enhanced GitHub Actions (ci.yml) for improved CI/CD
- Added trigger to automatically run after the RELEASE workflow completes
  - This ensures Docker images are built whenever a new version is released
  - Maintains backward compatibility with existing triggers (push, PR, manual)
- Implemented module-specific build jobs with path filtering
  - Separate build jobs for talk2biomodels and talk2scholars
  - Only builds Docker images for modules that have been modified
  - Uses dorny/paths-filter to detect changes in specific directories
- Added automatic versioning for Docker images
  - Uses semantic versioning from git tags for release builds
  - Adds both version tags and latest tags to all images
  - Includes git commit SHA in image tags for traceability

Updated README.md
- Documented setup instructions for both talk2biomodels and talk2scholars
- Described how to run containers persistently in detached (-d) mode
- Provided example commands for pulling and running the images

Context and Motivation

The goal of this update is to streamline the deployment process for both talk2biomodels and talk2scholars, ensuring a consistent and scalable setup. By integrating talk2scholars into the CI/CD pipeline, the module can now be built and deployed automatically alongside talk2biomodels.

The enhanced CI workflow improves efficiency by only building Docker images for modules that have changed and automatically handling versioning, reducing maintenance overhead and ensuring consistency across releases.

Dependencies

Environment variables (OPENAI_API_KEY, ZOTERO_API_KEY, ZOTERO_USER_ID, NVIDIA_API_KEY) must be set when running the container.
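
One way to honour this requirement is to check the variables before starting the container. The sketch below is an illustration under assumptions, not the README's actual commands: the helper names `missing_vars` and `run_talk2scholars` are hypothetical, the image name is taken from the workflow discussed later in this thread, and the `latest` tag and the 8501 port mapping assume the defaults described above.

```shell
#!/bin/sh
# Variables the PR description lists as required at container start.
REQUIRED_VARS="OPENAI_API_KEY ZOTERO_API_KEY ZOTERO_USER_ID NVIDIA_API_KEY"

# Print the name of every required variable that is unset or empty.
missing_vars() {
  for name in $REQUIRED_VARS; do
    eval "value=\${$name:-}"   # indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Hypothetical wrapper: refuse to start if a key is missing, otherwise
# forward each key from the host environment at runtime, so nothing is
# baked into the image.
run_talk2scholars() {
  missing=$(missing_vars)
  if [ -n "$missing" ]; then
    echo "Missing environment variables:" $missing >&2
    return 1
  fi
  docker run -d \
    -p 8501:8501 \
    -e OPENAI_API_KEY \
    -e ZOTERO_API_KEY \
    -e ZOTERO_USER_ID \
    -e NVIDIA_API_KEY \
    virtualpatientengine/talk2scholars:latest
}
```

Passing `-e NAME` without a value makes `docker run` copy that variable from the calling shell, so the keys never appear in the image or in the command line itself.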

Fixes # (issue) Mention the issue number.

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests you conducted to verify your changes. These may involve creating new test scripts or updating existing ones.

Added new test(s) in the tests folder
Added new function(s) to an existing test(s) (e.g.: tests/testX.py)
No new tests added (Please explain the rationale in this case)

Checklist

My code follows the style guidelines mentioned in the Code/DevOps guides
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (e.g. MkDocs)
My changes generate no new warnings
I have added or updated tests (in the tests folder) that prove my fix is effective or that my feature works
New and existing tests pass locally with my changes
Any dependent changes have been merged and published in downstream modules

For reviewers

Checklist pre-approval

Is there enough documentation?
If a new feature has been added, or a bug fixed, has a test been added to confirm good behavior?
Does the test(s) successfully test edge/corner cases?
Does the PR pass the tests? (if the repository has continuous integration)

Checklist post-approval

Does this PR merge develop into main? If so, please make sure to add a prefix (feat/fix/chore) and/or a suffix BREAKING CHANGE (if it's a major release) to your commit message.
Does this PR close an issue? If so, please make sure to descriptively close this issue when the PR is merged.

Checklist post-merge

When you approve of the PR, merge and close it (Read this article to know about different merge methods on GitHub)
Did this PR merge develop into main and is it supposed to run an automated release workflow (if applicable)? If so, please make sure to check under the "Actions" tab to see if the workflow has been initiated, and return later to verify that it has completed successfully.

gurdeep330

Thx @ansh-info for setting this up. Looks very good 👍

I don't have any comments but I have a request in the workflow file. Let me know if it (or a subset) would be possible in this PR? Thx

gurdeep330 · 2025-03-01T10:20:33Z

.github/workflows/ci.yml

Would it be possible to:

Have this workflow automatically run after the RELEASE workflow was successful.

Separate it into different jobs based on the agent, and invoke the jobs corresponding to the changes made. For example, if the changes were made to T2S, then only the docker job of T2S is invoked.

Automatically increment the version (and not use v1 always) :-)

ansh-info

Hi @gurdeep330, thank you for the review. Yes, we can add these changes, but I would like your review first.
Please review the proposal below. For the last point we can use GitHub tags (fetched from the GitHub environment variable); please let me know whether this approach will work.

1. Trigger This Workflow After the RELEASE Workflow Completes

Solution: Modify the workflow trigger to run after release.yml completes successfully using workflow_run:

Update ci.yml:

    on:
      workflow_run:
        workflows: ["RELEASE"]
        types:
          - completed
      workflow_dispatch:

2. Separate Jobs by Agent and Run Only for Relevant Changes

Solution:
Modify the workflow to conditionally run jobs based on changes in talk2biomodels or talk2scholars.

Update ci.yml:

    jobs:
      detect-changes:
        runs-on: ubuntu-latest
        outputs:
          build_t2b: ${{ steps.filter.outputs.talk2biomodels }}
          build_t2s: ${{ steps.filter.outputs.talk2scholars }}
        steps:
          - name: Checkout Repository
            uses: actions/checkout@v4
          - name: Detect Changes
            id: filter
            uses: dorny/paths-filter@v2
            with:
              filters: |
                talk2biomodels:
                  - 'aiagents4pharma/talk2biomodels/**'
                talk2scholars:
                  - 'aiagents4pharma/talk2scholars/**'

      docker-build-t2b:
        needs: detect-changes
        if: needs.detect-changes.outputs.build_t2b == 'true'
        runs-on: ubuntu-latest
        steps:
          - name: Checkout Repository
            uses: actions/checkout@v4
          - name: Login to Docker Hub
            uses: docker/login-action@v3
            with:
              username: ${{ vars.DOCKERHUB_USERNAME }}
              password: ${{ secrets.DOCKERHUB_TOKEN }}
          - name: Build and Push Talk2Biomodels
            uses: docker/build-push-action@v6
            with:
              file: aiagents4pharma/talk2biomodels/Dockerfile
              push: true
              tags: virtualpatientengine/talk2biomodels:latest

      docker-build-t2s:
        needs: detect-changes
        if: needs.detect-changes.outputs.build_t2s == 'true'
        runs-on: ubuntu-latest
        steps:
          - name: Checkout Repository
            uses: actions/checkout@v4
          - name: Login to Docker Hub
            uses: docker/login-action@v3
            with:
              username: ${{ vars.DOCKERHUB_USERNAME }}
              password: ${{ secrets.DOCKERHUB_TOKEN }}
          - name: Build and Push Talk2Scholars
            uses: docker/build-push-action@v6
            with:
              file: aiagents4pharma/talk2scholars/Dockerfile
              push: true
              tags: virtualpatientengine/talk2scholars:latest
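
To make the path-based gating concrete, here is a rough shell approximation of the decision that the `detect-changes` job delegates to `dorny/paths-filter`. It is not how the action is implemented; the function name `module_changed` and the idea of feeding it a `git diff --name-only` listing are assumptions made purely for illustration.

```shell
#!/bin/sh
# module_changed MODULE
# Reads changed file paths from stdin (one per line) and prints "true" if
# any of them lies under aiagents4pharma/MODULE/, otherwise "false".
# This mirrors the string outputs the build jobs compare against 'true'.
module_changed() {
  if grep -q "^aiagents4pharma/$1/"; then
    echo true
  else
    echo false
  fi
}

# Example wiring (assumption: BASE and HEAD are the commit refs to compare):
#   git diff --name-only "$BASE" "$HEAD" | module_changed talk2scholars
```

With this shape, a change that only touches `aiagents4pharma/talk2scholars/` yields `true` for talk2scholars and `false` for talk2biomodels, so only one of the two build jobs would run.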

3. Automatically Increment the Version Instead of Always Using v1

Solution:
Modify the workflow to auto-increment the version using the latest GitHub tag.

Update the Build and Push steps in both jobs:

    # Note: git describe needs tags in the clone; actions/checkout fetches
    # them only with full history (for example fetch-depth: 0).
    - name: Get Version
      id: version
      run: echo "VERSION=$(git describe --tags --abbrev=0)" >> "$GITHUB_ENV"
    - name: Build and Push
      uses: docker/build-push-action@v6
      with:
        file: aiagents4pharma/talk2biomodels/Dockerfile
        push: true
        tags: |
          virtualpatientengine/talk2biomodels:${{ env.VERSION }}
          virtualpatientengine/talk2biomodels:latest

Uses the latest Git tag as the version number instead of hardcoding v1.

gurdeep330

Yes, this looks fine to me. But could you double-check these changes under different conditions by running them in your forked repo? Thx

ansh-info · 2025-03-01T12:42:58Z

Hi @gurdeep330, please check the screenshot. The first point you mentioned is completed:
"1. Have this workflow automatically run after the RELEASE workflow was successful."

Let me explain what's happening:

The workflow is correctly executing and successfully building the Docker image
It correctly determined the version tag as "v1.10.2" from the repository (my fork)
It's correctly attempting to push to Docker Hub using the account "virtualpatientengine"

The error "denied: requested access to the resource is denied" means that my GitHub Actions workflow doesn't have permission to push to that Docker Hub repository. This is expected in a fork scenario.

ansh-info · 2025-03-01T12:54:43Z

@gurdeep330 Please check the screenshot. I was able to achieve the second point you mentioned as well:
"2. Separate it into different jobs based on the agent, and invoke the jobs corresponding to the changes made. For example, if the changes were made to T2S, then only the docker job of T2S is invoked."

Previously, with the matrix strategy, when the first job failed (talk2biomodels), the second job (talk2scholars) was automatically canceled. This was because the matrix jobs were part of a single logical job.
Now, with our changes:

Both modules are running as separate, independent jobs
Both jobs ran to completion (even though both failed at the same point)
Each job has its own build summary and record
Both jobs failed with the same error message about Docker Hub access permissions

This confirms that:

The path filtering for detecting changes is working (both modules were detected as changed)
The separation of jobs is working correctly (they run independently)
The versioning is being properly applied to both jobs

The only error happening now is still the Docker Hub permission issue, which is expected in my fork scenario and doesn't indicate any problem with the workflow structure itself.

ansh-info · 2025-03-01T13:08:43Z

@gurdeep330 Your third point is also completed. Please check the attached screenshots:
"3. Automatically increment the version (and not use v1 always) :-)"

We have a version check workflow added now

Checks for the latest tag from RELEASE workflow

Build summary: both images were built

Step 3 replaces the static version with an automatically incremented one. Since our project already uses semantic-release in the RELEASE workflow, we leveraged that work to ensure the Docker images use the correctly incremented versions.

I've implemented step 3 by adding automatic version incrementing as follows:

Key Improvements:

Centralized Version Generation:
- Added a new version job that computes version information once
- Other jobs depend on this job and use its outputs
- This ensures consistent versioning across all Docker images
Smart Version Types:
- For releases triggered by the RELEASE workflow, uses the proper semantic version from git tags
- For development builds (PRs, manual runs), creates a development version like v1.2.3-dev.5 where 5 is the number of commits since the last tag
- Also generates a short Git SHA for an additional tag option
Multiple Docker Tags:
- Each image now gets tagged with:
  - The specific version number (v1.2.3 or v1.2.3-dev.5)
  - The Git short SHA for easy traceability (a1b2c3d)
  - The latest tag for convenience
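
The version rules above can be sketched in a few lines of shell. This is an illustration under stated assumptions rather than the workflow's actual step: the function names, the `workflow_run` event check used to recognise release-triggered builds, and the `v0.0.0` fallback for repositories without tags are all hypothetical choices. Only the `git` commands themselves are standard.

```shell
#!/bin/sh
# format_version EVENT TAG COMMITS_SINCE_TAG
# Release-triggered builds keep the plain tag; every other build gets a
# -dev.N suffix, where N is the number of commits since that tag.
format_version() {
  if [ "$1" = "workflow_run" ]; then
    echo "$2"
  else
    echo "$2-dev.$3"
  fi
}

# image_tags REPOSITORY VERSION SHORT_SHA
# Prints the three tags each image receives, one per line.
image_tags() {
  echo "$1:$2"
  echo "$1:$3"
  echo "$1:latest"
}

# compute_tags EVENT REPOSITORY
# Gathers the inputs from git; needs a clone with full history and tags.
compute_tags() {
  if tag=$(git describe --tags --abbrev=0 2>/dev/null); then
    count=$(git rev-list --count "$tag..HEAD")
  else
    tag="v0.0.0"                          # fallback value is an assumption
    count=$(git rev-list --count HEAD)
  fi
  sha=$(git rev-parse --short HEAD)
  image_tags "$2" "$(format_version "$1" "$tag" "$count")" "$sha"
}
```

Keeping `format_version` free of git calls means the naming rule can be checked on its own, while `compute_tags` is the only piece that depends on the state of the checkout.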

This approach gives you several benefits:

Images built from release workflows get clean semantic version tags
Images built from PRs or development builds get clearly marked development version tags
Using the Git SHA as a tag makes it easy to trace builds back to specific commits
The version calculation is centralized, ensuring consistent versioning across modules

This implementation completes the third and final step requested in the PR review. The workflow now:

Runs after the RELEASE workflow
Separates jobs based on which modules have changed
Automatically increments version numbers appropriately

Let me know if you'd like any further refinements to this implementation!

ansh-info · 2025-03-01T17:30:13Z

Hello @gurdeep330, just to make sure, I changed the username to my GitHub credentials on my test fork/test branch, and everything ran successfully. Please check the screenshots below (this was just a test; I have since removed the images from my Docker Hub 😊):

Github Actions Overview

Docker Hub preview

gurdeep330 · 2025-03-02T09:01:57Z

Thanks very much @ansh-info This is very nicely done 💯 I'll run a few tests after the merge is complete. Great work 🥇

You just need an approval from @dmccloskey :-)

dmccloskey

Very nice work with the CI and DockerHub build and push 👍.

github-actions · 2025-03-02T16:41:24Z

🎉 This PR is included in version 1.24.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

ansh-info added 11 commits February 28, 2025 22:02

fix: updated same as README

29ec9ef

fix: date time error while diplaying papers from zotero

f729c96

chores: updated index for talk2scholars dockerfile

27c503f

chores: updated readme for talk2scholars dockerfile

7013379

feat: added dockerfile for talk2scholars

43c1e9f

fix: updated docker file for better support

98bb654

feat: matrix strategy for parallel deployement

6383f88

chores: updated readme for talk2scholars dockerfile

56dce7d

chores: updated readme for talk2scholars dockerfile

893d107

chores: updated readme for talk2scholars dockerfile

2c947d8

chores: updated readme for talk2scholars dockerfile

031265a

gurdeep330 reviewed Mar 1, 2025

gurdeep330 assigned ansh-info Mar 1, 2025

gurdeep330 requested a review from dmccloskey March 1, 2025 10:23

gurdeep330 added the enhancement (New feature or request) and Talk2Scholars labels Mar 1, 2025

ansh-info added 14 commits March 1, 2025 11:49

Merge branch 'VirtualPatientEngine:main' into feat/talk2scholars-docker-update

38912c4

feat: Detects changes and runs only the necessary build jobs

d378d04

feat: Detects changes and runs only the necessary build jobs

173a5c5

fix: Fix Docker login: Use vars for DOCKERHUB_USERNAME

f1603eb

fix: Debug Git Tag

42e53d6

fix: git tags

8c63498

fix: Auto-increments version

7bd5934

fix: retry with with method

14af285

fix: Now uses the latest tag directly from semantic-release

6520ecf

fix: If no tag is found, it provides a meaningful fallback version

100462f

fix: Moved version generation directly into each build job

8e15d12

fix: Using a hardcoded v0.1.0-test version for both modules

c92940d

fix: tags error fix

33a2fef

feat: Make CI workflow run after a RELEASE workflow

ea8e9eb

ansh-info added 3 commits March 1, 2025 13:26

fix: A step that gets the latest git tag using git describe --tags --abbrev=0

8e0d0d9

fix: Removed the invalid condition from the workflow trigger

e4d53d5

fix: I've simplified the workflow to just use the basic workflow_run trigger without any additional conditions.

56effa9

feat: Uses the dorny/paths-filter@v2 action to detect file changes in specific directories

158c2d1

feat: adding automatic version incrementing

8db74ed

ansh-info requested a review from gurdeep330 March 1, 2025 13:09

gurdeep330 approved these changes Mar 2, 2025

dmccloskey approved these changes Mar 2, 2025

dmccloskey merged commit 9f662ed into VirtualPatientEngine:main Mar 2, 2025
8 of 16 checks passed

github-actions bot added the released label Mar 2, 2025

ansh-info deleted the feat/talk2scholars-docker-update branch March 2, 2025 18:30
