feat: Install subworkflows with modules from different remotes #3083

Open
wants to merge 78 commits into
base: dev
78 commits
9825d89
tests: Add test case for cross-organization subwf
jvfe Jun 21, 2024
568363a
tests: Check module on module list instead
jvfe Jun 28, 2024
534da92
chore: Dummy commit to test action
jvfe Jul 1, 2024
286ab86
Revert "chore: Dummy commit to test action"
jvfe Jul 1, 2024
423517e
tests: Change installed subworkflow
jvfe Jul 2, 2024
6816cc1
Merge pull request #6 from sanger-tol/cross-org-test
jvfe Jul 8, 2024
af816fd
feat: Read from meta.yml in get_components_to_install
jvfe Jul 8, 2024
d0e6031
refact: Change module return type to always be dict
jvfe Jul 15, 2024
1737c88
fix: Add correct return type
jvfe Jul 15, 2024
f7aaea2
fix: Use any for values
jvfe Jul 15, 2024
d1c490d
test: Fix module key in across orgs
jvfe Jul 15, 2024
6fa5cf4
refact: Reset modules repo when git_remote not defined
jvfe Jul 15, 2024
d903282
refact: Copy parent attribute
jvfe Jul 15, 2024
7f095fa
refact: Keep old strategy as fallback
jvfe Jul 15, 2024
012e9d6
refact: Check component not in subwf list
jvfe Jul 15, 2024
d77fccb
refact: Change return type to optional
jvfe Jul 24, 2024
faa7faf
refact: Change way of handling dicts in modulesjson
jvfe Jul 24, 2024
0f08bbe
refact: Handle dicts in meta_yml lint
jvfe Jul 24, 2024
6d052de
fix: Pass branch in install too
jvfe Jul 28, 2024
c6ce67f
fix: Check if module in meta.yml is imported
jvfe Aug 5, 2024
792f68b
Merge pull request #9 from sanger-tol/fix/duplicated-downloads
jvfe Aug 5, 2024
7d23f15
Merge branch 'dev' into fix/1927
jvfe Aug 5, 2024
6f9a60e
Merge branch 'dev' into review-round-1
jvfe Aug 10, 2024
fca6f27
refact: Define org path based on git remote
jvfe Aug 10, 2024
092bf91
feat: Allow defining branches in meta.yml
jvfe Aug 10, 2024
208d796
fix: Add empty branch in other dicts
jvfe Aug 10, 2024
71c43be
refact: Rework logic to use subwfs as well
jvfe Aug 12, 2024
8bcf737
refact: Support subwf dict in recreate deps
jvfe Aug 12, 2024
efde7b7
fix: Change modules to subwfs
jvfe Aug 12, 2024
8e25587
fix: Use name value in recreate deps
jvfe Aug 12, 2024
4947913
Revert "fix: Use name value in recreate deps"
jvfe Aug 12, 2024
4888d32
fix: Use subworkflow name in recreate deps
jvfe Aug 12, 2024
aa265d8
fix: Use sw_name in appends too
jvfe Aug 12, 2024
ab0f398
Merge branch 'dev' into fix/1927
jvfe Aug 12, 2024
33c36cf
Merge branch 'fix/1927' into review-round-1
jvfe Aug 12, 2024
9356cd3
Merge branch 'feat/use-subwf' into review-round-1
jvfe Aug 12, 2024
a764c76
fix: Only add module if it's in main.nf too
jvfe Aug 12, 2024
869d816
fix: Handle incomplete meta.yml
jvfe Aug 13, 2024
c3ac1b8
refact: Remove isinstance check in lint/meta.yml
jvfe Aug 13, 2024
4eb95e6
style: Format meta_yml.py
jvfe Aug 13, 2024
e9bf238
refact: Remove isinstance check in components/install.py
jvfe Aug 13, 2024
c8b8a83
Revert "refact: Remove isinstance check in components/install.py"
jvfe Aug 13, 2024
eba8391
refact: Remove isinstance check in recreate_deps
jvfe Aug 13, 2024
e3f7b3f
Merge pull request #10 from sanger-tol/review-round-1
jvfe Aug 13, 2024
ddb3dd0
Merge branch 'dev' into merge-dev
jvfe Aug 13, 2024
634ff8f
Merge pull request #11 from sanger-tol/merge-dev
jvfe Aug 13, 2024
c72b94a
refact: Change function structure to use dicts not lists
jvfe Aug 15, 2024
bb31d78
fix: Change access key in inner dicts
jvfe Aug 15, 2024
7f4cce8
refact: Use component_name every time
jvfe Aug 16, 2024
e342678
refact: Don't default to master
jvfe Aug 16, 2024
50216c4
refact: Move instance check up
jvfe Aug 20, 2024
0e83d9a
Refactoring of component_utils.py
muffato Aug 20, 2024
f8c8251
Revert "Refactoring of component_utils.py"
jvfe Aug 20, 2024
313853e
refact: Use get in components/install.py
jvfe Aug 21, 2024
e8e4da0
refact: Avoid redefining keys when possible
jvfe Aug 21, 2024
f783d58
refact: Use get in modules_json
jvfe Aug 21, 2024
eb59308
refact: Use dictionary input in check_up_to_date
jvfe Aug 21, 2024
be4d78f
Use the power of get to skip if tests
muffato Aug 21, 2024
3c493b9
Merge pull request #13 from sanger-tol/refact/change_structure_levera…
jvfe Aug 21, 2024
b13a406
refact: Raise error if org_path not found
jvfe Aug 22, 2024
e96341e
style: Run ruff format
jvfe Aug 22, 2024
2b9a573
refact: Change UserWarning message
jvfe Aug 22, 2024
e55c74d
style: Run ruff format
jvfe Aug 22, 2024
4b9fea5
Merge pull request #12 from sanger-tol/refact/change_structure
jvfe Aug 22, 2024
ea7d1ca
Merge remote-tracking branch 'upstream/dev' into merge-dev2
jvfe Aug 23, 2024
a40d5d9
Merge remote-tracking branch 'upstream/dev' into fix/1927
jvfe Sep 2, 2024
2491207
refact: Use walrus operator in meta.yml check
jvfe Sep 6, 2024
a870787
refact: Use a union type for component arg
jvfe Sep 6, 2024
7ad4a8c
Merge remote-tracking branch 'upstream/dev' into second-review
jvfe Sep 6, 2024
f3259bd
Merge pull request #14 from sanger-tol/second-review
jvfe Sep 9, 2024
1a8c3be
docs: Add comment for clarification about the feature
jvfe Sep 13, 2024
864eb7f
refact: Add suggestions from the third round of reviews (#15)
jvfe Dec 4, 2024
c0b51d7
Merge dev third rev (#20)
jvfe Dec 4, 2024
c9d9fa4
Merge branch 'dev' into fix/1927
jvfe Dec 4, 2024
1ecb3ec
refact: Use same org_path as modulesrepo first (#22)
jvfe Dec 16, 2024
fd98f95
Squashed commit of the following:
jvfe Dec 16, 2024
f7ee052
Merge branch 'dev' into fix/1927
jvfe Dec 16, 2024
ec41bf2
test: Use correct org name in remove (#23)
jvfe Dec 16, 2024
2 changes: 1 addition & 1 deletion nf_core/__main__.py
@@ -54,7 +54,7 @@
subworkflows_test,
subworkflows_update,
)
from nf_core.components.components_utils import NF_CORE_MODULES_REMOTE
from nf_core.components.constants import NF_CORE_MODULES_REMOTE
from nf_core.pipelines.download import DownloadError
from nf_core.utils import check_if_outdated, nfcore_logo, rich_force_colors, setup_nfcore_dir

55 changes: 40 additions & 15 deletions nf_core/components/components_utils.py
@@ -1,24 +1,18 @@
import logging
import re
from pathlib import Path
from typing import TYPE_CHECKING, List, Optional, Tuple, Union
from typing import Dict, List, Optional, Tuple, Union

import questionary
import requests
import rich.prompt

if TYPE_CHECKING:
from nf_core.modules.modules_repo import ModulesRepo
import yaml

import nf_core.utils
from nf_core.modules.modules_repo import ModulesRepo

log = logging.getLogger(__name__)

# Constants for the nf-core/modules repo used throughout the module files
NF_CORE_MODULES_NAME = "nf-core"
NF_CORE_MODULES_REMOTE = "https://github.com/nf-core/modules.git"
NF_CORE_MODULES_DEFAULT_BRANCH = "master"


def get_repo_info(directory: Path, use_prompt: Optional[bool] = True) -> Tuple[Path, Optional[str], str]:
"""
@@ -143,12 +137,15 @@ def prompt_component_version_sha(
return git_sha


def get_components_to_install(subworkflow_dir: Union[str, Path]) -> Tuple[List[str], List[str]]:
def get_components_to_install(
subworkflow_dir: Union[str, Path],
) -> Tuple[List[Dict[str, Optional[str]]], List[Dict[str, Optional[str]]]]:
"""
Parse the subworkflow main.nf file to retrieve all imported modules and subworkflows.
"""
modules = []
subworkflows = []
modules: Dict[str, Dict[str, Optional[str]]] = {}
subworkflows: Dict[str, Dict[str, Optional[str]]] = {}

with open(Path(subworkflow_dir, "main.nf")) as fh:
for line in fh:
regex = re.compile(
@@ -159,10 +156,38 @@ def get_components_to_install(subworkflow_dir: Union[str, Path]) -> Tuple[List[s
name, link = match.groups()
if link.startswith("../../../"):
name_split = name.lower().split("_")
modules.append("/".join(name_split))
component_name = "/".join(name_split)
component_dict: Dict[str, Optional[str]] = {
"name": component_name,
}
modules[component_name] = component_dict
elif link.startswith("../"):
subworkflows.append(name.lower())
return modules, subworkflows
component_name = name.lower()
component_dict = {"name": component_name}
subworkflows[component_name] = component_dict

if (sw_meta := Path(subworkflow_dir, "meta.yml")).exists():
with open(sw_meta) as fh:
meta = yaml.safe_load(fh)
if "components" in meta:
components = meta["components"]
for component in components:
if isinstance(component, dict):
component_name = list(component.keys())[0].lower()
branch = component[component_name].get("branch")
git_remote = component[component_name]["git_remote"]
modules_repo = ModulesRepo(git_remote, branch=branch)
current_comp_dict = subworkflows if component_name in subworkflows else modules

component_dict = {
"org_path": modules_repo.repo_path,
"git_remote": git_remote,
"branch": branch,
}

current_comp_dict[component_name].update(component_dict)

return list(modules.values()), list(subworkflows.values())

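The meta.yml handling in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the PR's code: `merge_meta_components` is a hypothetical helper, the `meta` dict stands in for the result of `yaml.safe_load`, the remote URL is made up, and the `org_path` lookup through `ModulesRepo` is left out. Unlike the hunk, the sketch skips entries that do not appear in main.nf and tolerates a missing `git_remote`.

```python
from typing import Dict, List, Optional, Tuple

ComponentInfo = Dict[str, Optional[str]]


def merge_meta_components(
    modules: Dict[str, ComponentInfo],
    subworkflows: Dict[str, ComponentInfo],
    meta: dict,
) -> Tuple[List[ComponentInfo], List[ComponentInfo]]:
    """Attach git_remote/branch from meta.yml entries to components found in main.nf."""
    for entry in meta.get("components", []):
        if not isinstance(entry, dict):
            # Plain string entries carry no remote information.
            continue
        key = next(iter(entry))
        name = key.lower()
        details = entry[key] or {}
        target = subworkflows if name in subworkflows else modules
        if name not in target:
            # Listed in meta.yml but not imported in main.nf: skip it.
            continue
        target[name].update(
            {"git_remote": details.get("git_remote"), "branch": details.get("branch")}
        )
    return list(modules.values()), list(subworkflows.values())


# Hypothetical input: two modules parsed from main.nf, one with a custom remote.
modules = {"fastqc": {"name": "fastqc"}, "samtools/sort": {"name": "samtools/sort"}}
subworkflows: Dict[str, ComponentInfo] = {}
meta = {
    "components": [
        "samtools/sort",
        {"fastqc": {"git_remote": "https://example.org/other/modules.git", "branch": "main"}},
    ]
}
mods, subwfs = merge_meta_components(modules, subworkflows, meta)
```

Components without a dict entry keep only their `name` key, so callers can fall back to the default remote with `.get("git_remote", ...)`.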

def get_biotools_id(tool_name) -> str:
4 changes: 4 additions & 0 deletions nf_core/components/constants.py
@@ -0,0 +1,4 @@
# Constants for the nf-core/modules repo used throughout the module files
NF_CORE_MODULES_NAME = "nf-core"
NF_CORE_MODULES_REMOTE = "https://github.com/nf-core/modules.git"
NF_CORE_MODULES_DEFAULT_BRANCH = "master"
2 changes: 1 addition & 1 deletion nf_core/components/info.py
@@ -15,7 +15,7 @@

import nf_core.utils
from nf_core.components.components_command import ComponentCommand
from nf_core.components.components_utils import NF_CORE_MODULES_REMOTE
from nf_core.components.constants import NF_CORE_MODULES_REMOTE
from nf_core.modules.modules_json import ModulesJson

log = logging.getLogger(__name__)
46 changes: 35 additions & 11 deletions nf_core/components/install.py
@@ -1,7 +1,7 @@
import logging
import os
from pathlib import Path
from typing import List, Optional, Union
from typing import Dict, List, Optional, Union

import questionary
from rich import print
@@ -15,11 +15,14 @@
import nf_core.utils
from nf_core.components.components_command import ComponentCommand
from nf_core.components.components_utils import (
NF_CORE_MODULES_NAME,
get_components_to_install,
prompt_component_version_sha,
)
from nf_core.components.constants import (
NF_CORE_MODULES_NAME,
)
from nf_core.modules.modules_json import ModulesJson
from nf_core.modules.modules_repo import ModulesRepo

log = logging.getLogger(__name__)

@@ -38,15 +41,33 @@ def __init__(
installed_by: Optional[List[str]] = None,
):
super().__init__(component_type, pipeline_dir, remote_url, branch, no_pull)
self.current_remote = ModulesRepo(remote_url, branch)
self.branch = branch
self.force = force
self.prompt = prompt
self.sha = sha
self.current_sha = sha
if installed_by is not None:
self.installed_by = installed_by
else:
self.installed_by = [self.component_type]

def install(self, component: str, silent: bool = False) -> bool:
def install(self, component: Union[str, Dict[str, str]], silent: bool = False) -> bool:
if isinstance(component, dict):
# Override modules_repo when the component to install is a dependency from a subworkflow.
remote_url = component.get("git_remote", self.current_remote.remote_url)
branch = component.get("branch", self.branch)
self.modules_repo = ModulesRepo(remote_url, branch)
component = component["name"]

if self.current_remote is None:
self.current_remote = self.modules_repo

if self.current_remote.remote_url == self.modules_repo.remote_url and self.sha is not None:
self.current_sha = self.sha
else:
self.current_sha = None

if self.repo_type == "modules":
log.error(f"You cannot install a {component} in a clone of nf-core/modules")
return False
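The remote override and SHA scoping added at the top of `install` can be summarised with a small pure function. This is a sketch under assumptions: `resolve_install_target` is a hypothetical name, and it returns plain values instead of building a `ModulesRepo` as the PR does. The rule it illustrates is that a user-supplied SHA is only applied when the component resolves to the same remote the command was started with.

```python
from typing import Dict, Optional, Tuple, Union


def resolve_install_target(
    component: Union[str, Dict[str, str]],
    default_remote: str,
    default_branch: Optional[str],
    requested_sha: Optional[str],
) -> Tuple[str, str, Optional[str], Optional[str]]:
    """Return (name, remote, branch, sha) for one component to install."""
    if isinstance(component, dict):
        # Dependency of a subworkflow: it may point at a different remote.
        remote = component.get("git_remote", default_remote)
        branch = component.get("branch", default_branch)
        name = component["name"]
    else:
        name, remote, branch = component, default_remote, default_branch
    # A commit SHA is only meaningful for the repository it was requested for.
    sha = requested_sha if remote == default_remote else None
    return name, remote, branch, sha


DEFAULT = "https://github.com/nf-core/modules.git"
same_remote = resolve_install_target("fastqc", DEFAULT, "master", "abc123")
other_remote = resolve_install_target(
    {"name": "fastqc", "git_remote": "https://example.org/other/modules.git"},
    DEFAULT,
    "master",
    "abc123",
)
```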
@@ -70,8 +91,8 @@ def install(self, component: str, silent: bool = False) -> bool:
return False

# Verify SHA
if not self.modules_repo.verify_sha(self.prompt, self.sha):
err_msg = f"SHA '{self.sha}' is not a valid commit SHA for the repository '{self.modules_repo.remote_url}'"
if not self.modules_repo.verify_sha(self.prompt, self.current_sha):
err_msg = f"SHA '{self.current_sha}' is not a valid commit SHA for the repository '{self.modules_repo.remote_url}'"
log.error(err_msg)
return False

@@ -114,7 +135,7 @@ def install(self, component: str, silent: bool = False) -> bool:
modules_json.update(self.component_type, self.modules_repo, component, current_version, self.installed_by)
return False
try:
version = self.get_version(component, self.sha, self.prompt, current_version, self.modules_repo)
version = self.get_version(component, self.current_sha, self.prompt, current_version, self.modules_repo)
except UserWarning as e:
log.error(e)
return False
@@ -199,15 +220,17 @@ def collect_and_verify_name(
if component is None:
component = questionary.autocomplete(
f"{'Tool' if self.component_type == 'modules' else 'Subworkflow'} name:",
choices=sorted(modules_repo.get_avail_components(self.component_type, commit=self.sha)),
choices=sorted(modules_repo.get_avail_components(self.component_type, commit=self.current_sha)),
style=nf_core.utils.nfcore_question_style,
).unsafe_ask()

if component is None:
return ""

# Check that the supplied name is an available module/subworkflow
if component and component not in modules_repo.get_avail_components(self.component_type, commit=self.sha):
if component and component not in modules_repo.get_avail_components(
self.component_type, commit=self.current_sha
):
log.error(f"{self.component_type[:-1].title()} '{component}' not found in available {self.component_type}")
print(
Panel(
@@ -223,9 +246,10 @@ def collect_and_verify_name(

raise ValueError

if not modules_repo.component_exists(component, self.component_type, commit=self.sha):
warn_msg = f"{self.component_type[:-1].title()} '{component}' not found in remote '{modules_repo.remote_url}' ({modules_repo.branch})"
log.warning(warn_msg)
if self.current_remote.remote_url == modules_repo.remote_url:
if not modules_repo.component_exists(component, self.component_type, commit=self.current_sha):
warn_msg = f"{self.component_type[:-1].title()} '{component}' not found in remote '{modules_repo.remote_url}' ({modules_repo.branch})"
log.warning(warn_msg)

return component

67 changes: 51 additions & 16 deletions nf_core/components/update.py
@@ -41,6 +41,8 @@ def __init__(
limit_output=False,
):
super().__init__(component_type, pipeline_dir, remote_url, branch, no_pull)
self.current_remote = ModulesRepo(remote_url, branch)
self.branch = branch
self.force = force
self.prompt = prompt
self.sha = sha
@@ -92,6 +94,13 @@ def update(self, component=None, silent=False, updated=None, check_diff_exist=Tr
Returns:
(bool): True if the update was successful, False otherwise.
"""
if isinstance(component, dict):
# Override modules_repo when the component to install is a dependency from a subworkflow.
remote_url = component.get("git_remote", self.current_remote.remote_url)
branch = component.get("branch", self.branch)
self.modules_repo = ModulesRepo(remote_url, branch)
component = component["name"]

self.component = component
if updated is None:
updated = []
@@ -882,7 +891,17 @@ def get_components_to_update(self, component):
if self.component_type == "modules":
# All subworkflow names in the installed_by section of a module are subworkflows using this module
# We need to update them too
subworkflows_to_update = [subworkflow for subworkflow in installed_by if subworkflow != self.component_type]
git_remote = self.current_remote.remote_url
for subworkflow in installed_by:
if subworkflow != component:
for remote_url, content in mods_json["repos"].items():
if all_subworkflows := content.get("subworkflows"):
for _, details in all_subworkflows.items():
if subworkflow in details:
git_remote = remote_url
if subworkflow != self.component_type:
subworkflows_to_update.append({"name": subworkflow, "git_remote": git_remote})

elif self.component_type == "subworkflows":
for repo, repo_content in mods_json["repos"].items():
for component_type, dir_content in repo_content.items():
@@ -893,9 +912,9 @@ def get_components_to_update(self, component):
# We need to update it too
if component in comp_content["installed_by"]:
if component_type == "modules":
modules_to_update.append(comp)
modules_to_update.append({"name": comp, "git_remote": repo, "org_path": dir})
elif component_type == "subworkflows":
subworkflows_to_update.append(comp)
subworkflows_to_update.append({"name": comp, "git_remote": repo, "org_path": dir})

return modules_to_update, subworkflows_to_update
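The remote lookup in the `modules` branch of this hunk walks the `repos` section of modules.json. The sketch below assumes the layout implied by the loop (`repos -> remote URL -> "subworkflows" -> org path -> component name`); `find_subworkflow_remote`, the sample data, and the second remote URL are hypothetical. Unlike the hunk, it returns on the first match and falls back to the default per call, so one lookup cannot carry a remote over to the next.

```python
def find_subworkflow_remote(mods_json: dict, subworkflow: str, default_remote: str) -> str:
    """Return the remote URL under which a subworkflow is recorded in modules.json."""
    for remote_url, content in mods_json.get("repos", {}).items():
        for _org_path, installed in content.get("subworkflows", {}).items():
            if subworkflow in installed:
                return remote_url
    # Not recorded anywhere: assume it lives in the default remote.
    return default_remote


# Hypothetical modules.json excerpt with two remotes.
mods_json = {
    "repos": {
        "https://github.com/nf-core/modules.git": {
            "modules": {"nf-core": {"fastqc": {"installed_by": ["modules"]}}},
        },
        "https://example.org/other/modules.git": {
            "subworkflows": {"other": {"my_subwf": {"installed_by": ["subworkflows"]}}},
        },
    }
}
DEFAULT = "https://github.com/nf-core/modules.git"
found = find_subworkflow_remote(mods_json, "my_subwf", DEFAULT)
missing = find_subworkflow_remote(mods_json, "unknown_subwf", DEFAULT)
```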

@@ -910,7 +929,7 @@ def update_linked_components(
Update modules and subworkflows linked to the component being updated.
"""
for s_update in subworkflows_to_update:
if s_update in updated:
if s_update["name"] in updated:
continue
original_component_type, original_update_all = self._change_component_type("subworkflows")
self.update(
@@ -922,7 +941,7 @@ def update_linked_components(
self._reset_component_type(original_component_type, original_update_all)

for m_update in modules_to_update:
if m_update in updated:
if m_update["name"] in updated:
continue
original_component_type, original_update_all = self._change_component_type("modules")
try:
@@ -945,28 +964,42 @@ def update_linked_components(
def manage_changes_in_linked_components(self, component, modules_to_update, subworkflows_to_update):
"""Check for linked components added or removed in the new subworkflow version"""
if self.component_type == "subworkflows":
subworkflow_directory = Path(self.directory, self.component_type, self.modules_repo.repo_path, component)
org_path = self.current_remote.repo_path

subworkflow_directory = Path(self.directory, self.component_type, org_path, component)
included_modules, included_subworkflows = get_components_to_install(subworkflow_directory)
# If a module/subworkflow has been removed from the subworkflow
for module in modules_to_update:
if module not in included_modules:
log.info(f"Removing module '{module}' which is not included in '{component}' anymore.")
module_name = module["name"]
included_modules_names = [m["name"] for m in included_modules]
if module_name not in included_modules_names:
log.info(f"Removing module '{module_name}' which is not included in '{component}' anymore.")
remove_module_object = ComponentRemove("modules", self.directory)
remove_module_object.remove(module, removed_by=component)
remove_module_object.remove(module_name, removed_by=component)
for subworkflow in subworkflows_to_update:
if subworkflow not in included_subworkflows:
log.info(f"Removing subworkflow '{subworkflow}' which is not included in '{component}' anymore.")
subworkflow_name = subworkflow["name"]
included_subworkflow_names = [m["name"] for m in included_subworkflows]
if subworkflow_name not in included_subworkflow_names:
log.info(
f"Removing subworkflow '{subworkflow_name}' which is not included in '{component}' anymore."
)
remove_subworkflow_object = ComponentRemove("subworkflows", self.directory)
remove_subworkflow_object.remove(subworkflow, removed_by=component)
remove_subworkflow_object.remove(subworkflow_name, removed_by=component)
# If a new module/subworkflow is included in the subworkflow and wasn't included before
for module in included_modules:
if module not in modules_to_update:
log.info(f"Installing newly included module '{module}' for '{component}'")
module_name = module["name"]
module["git_remote"] = module.get("git_remote", self.current_remote.remote_url)
module["branch"] = module.get("branch", self.branch)
if module_name not in modules_to_update:
log.info(f"Installing newly included module '{module_name}' for '{component}'")
install_module_object = ComponentInstall(self.directory, "modules", installed_by=component)
install_module_object.install(module, silent=True)
for subworkflow in included_subworkflows:
if subworkflow not in subworkflows_to_update:
log.info(f"Installing newly included subworkflow '{subworkflow}' for '{component}'")
subworkflow_name = subworkflow["name"]
subworkflow["git_remote"] = subworkflow.get("git_remote", self.current_remote.remote_url)
subworkflow["branch"] = subworkflow.get("branch", self.branch)
if subworkflow_name not in subworkflows_to_update:
log.info(f"Installing newly included subworkflow '{subworkflow_name}' for '{component}'")
install_subworkflow_object = ComponentInstall(
self.directory, "subworkflows", installed_by=component
)
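Now that both lists hold dicts, the added/removed check in this hunk has to compare `name` values; a plain string tested against a list of dicts never matches, so a check like `module_name not in modules_to_update` would always be true if that list holds dicts. One possible way to express the comparison is sketched below; `diff_by_name` and the sample data are hypothetical and not part of the PR.

```python
from typing import Dict, List, Optional, Tuple

ComponentInfo = Dict[str, Optional[str]]


def diff_by_name(
    installed: List[ComponentInfo], included: List[ComponentInfo]
) -> Tuple[List[str], List[ComponentInfo]]:
    """Split components into names to remove and dicts to install."""
    installed_names = {c["name"] for c in installed}
    included_names = {c["name"] for c in included}
    # Installed earlier but no longer imported by the subworkflow.
    to_remove = sorted(installed_names - included_names)
    # Imported by the new subworkflow version but not installed yet.
    to_install = [c for c in included if c["name"] not in installed_names]
    return to_remove, to_install


installed = [
    {"name": "fastqc", "git_remote": "https://github.com/nf-core/modules.git"},
    {"name": "multiqc"},
]
included = [{"name": "fastqc"}, {"name": "samtools/sort"}]
to_remove, to_install = diff_by_name(installed, included)
```

Keeping the full dict for new components preserves any `git_remote` and `branch` keys for the later install call.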
@@ -985,3 +1018,5 @@ def _reset_component_type(self, original_component_type, original_update_all):
self.component_type = original_component_type
self.modules_json.pipeline_components = None
self.update_all = original_update_all
if self.current_remote is None:
self.current_remote = self.modules_repo