Add profile for Charliecloud container engine to template #824

Closed
phue opened this issue Jan 11, 2021 · 11 comments · Fixed by #826

@phue
Member

phue commented Jan 11, 2021

Charliecloud is an alternative container engine that runs completely in user space.

Nextflow has supported it since v20.12.0-edge, so it would be a useful addition to nf-core as well: some research institutions do not allow Singularity because of strict security policies.

My proposal would be to add conf/charliecloud.config with the following content to the template:

charliecloud {
  enabled = true
}

manifest {
  nextflowVersion = '>=20.12.0-edge'
}

env {
  PATH = "/opt/conda/bin:/opt/conda/envs/{{ cookiecutter.name_noslash }}-{{ cookiecutter.version }}/bin:$PATH"
}

Note that the env part is necessary because Charliecloud does not honor ENV instructions from Docker images, so the path to the conda environment inside the container has to be passed explicitly.
This should work for most pipelines. Unfortunately, for pipelines that have multiple environments, the PATH would need to be extended manually in the respective pipeline repository.

Once all nf-core pipelines use biocontainers this can be simplified to:

env {
  PATH = "/opt/conda/bin"
}
@apeltzer apeltzer added the template nf-core pipeline/component template label Jan 11, 2021
@apeltzer
Member

Sounds all good to me. Multiple environments are normally not an issue I guess. Most people will not be affected by this, though it might make sense to also provide CI tests for running with Charliecloud then at some point. Something we should also do with podman, which has been supported for a while now too...

By multiple environments, do you mean several in the same Docker image, or a different one per process?

@phue
Member Author

phue commented Jan 11, 2021

> By multiple environments, do you mean several in the same Docker image, or a different one per process?

Both, actually: we need to pass every location that contains dependencies to the container.
AFAIK, pipelines with multiple environment.yml files also have multiple Dockerfiles (one per environment), nf-core Sarek for example.

For biocontainers, it is much easier because their environment is always installed to /opt/conda/bin without any names or version tags.

@phue phue self-assigned this Jan 11, 2021
phue added a commit to phue/tools that referenced this issue Jan 11, 2021
@maxulysse
Member

We do have multiple Dockerfiles for specific processes in Sarek, but they contain annotation databases, which we can't get from conda, so the path should actually still be the same.
But that does look interesting.

@apeltzer
Member

Thanks for the clarification @phue :-)

@phue
Member Author

phue commented Jan 11, 2021

> We do have multiple Dockerfiles for specific processes in Sarek, but they contain annotation databases, which we can't get from conda, so the path should actually still be the same.
> But that does look interesting.

But aren't there specific conda environments in the containers for snpeff and vep?
It seems that they have /opt/conda/envs/nf-core-sarek-snpeff-2.6.1/bin and /opt/conda/envs/nf-core-sarek-vep-2.6.1/bin as opposed to /opt/conda/envs/nf-core-sarek-2.6.1/bin

Therefore, running Sarek with Charliecloud would require this config:

charliecloud {
  enabled = true
}

manifest {
  nextflowVersion = '>=20.12.0-edge'
}

env {
  PATH = "/opt/conda/envs/nf-core-sarek-2.6.1/bin:/opt/conda/envs/nf-core-sarek-snpeff-2.6.1/bin:/opt/conda/envs/nf-core-sarek-vep-2.6.1/bin:$PATH"
}

@phue phue linked a pull request Jan 11, 2021 that will close this issue
@phue
Member Author

phue commented Jan 12, 2021

> it might make sense to also provide CI tests for running with Charliecloud then at some point. Something we should also do with podman, which has been supported for a while now too...

I've been toying around with this. For podman it should be a piece of cake, as the GitHub Actions runner already has it installed.

So I guess we just need to add a matrix for the different config profiles? Similar to this:

jobs:
  CI:
    env:
      PROFILE: ${{ matrix.engine }}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        engine: ["docker", "podman", "charliecloud"]

For podman, the current ci.yml will work as is.

Charliecloud will need some setup steps, for example:

      - name: Set up Python
        if: matrix.engine == 'charliecloud'
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'

      - name: Install Charliecloud
        if: matrix.engine == 'charliecloud'
        run: |
          python -m pip install lark-parser requests
          wget -qO- https://github.com/hpc/charliecloud/releases/download/v0.21/charliecloud-0.21.tar.gz | tar -xvz
          cd charliecloud-0.21
          ./configure
          make
          sudo make install

It should be similar for Singularity, but I have not tested that yet. Conda should be a bit easier.

@phue
Member Author

phue commented Jan 12, 2021

Whoops, I just saw @drpatelh already created an issue for that 😅
#815

@ewels
Member

ewels commented Jan 14, 2021

All sounds good - only comment would be that all other container engines have profiles in the main pipeline template, instead of in nf-core/configs. I wonder if we should wait for this to be in a stable Nextflow release and then do the same?

Whilst we're at it, might also be worth implementing Shifter? Then I think we have explicit support for all container engines supported by Nextflow..

@phue
Member Author

phue commented Jan 14, 2021

> All sounds good - only comment would be that all other container engines have profiles in the main pipeline template, instead of in nf-core/configs. I wonder if we should wait for this to be in a stable Nextflow release and then do the same?

The idea is to add the profile to each pipeline repo in an upcoming template sync, see #826
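
For illustration, a minimal sketch of what a template-level profile could look like in the pipeline's nextflow.config. The docker and podman entries are there only for comparison and are assumptions about the template layout, not copied from #826; the env block from the proposal above would still be needed alongside this until ENV is honored.

```groovy
// Sketch only: one profile per container engine, selected with `-profile <name>`.
// Profile names and their placement in nextflow.config are assumptions.
profiles {
    docker {
        docker.enabled = true
    }
    podman {
        podman.enabled = true
    }
    charliecloud {
        // Requires Nextflow >= 20.12.0-edge, as noted in the proposal
        charliecloud.enabled = true
    }
}
```

With something like this in place, a run would pick the engine via `-profile charliecloud` rather than through a separate config file.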

> I wonder if we should wait for this to be in a stable Nextflow release

Agreed, I think it will land in 21.01.0

> Whilst we're at it, might also be worth implementing Shifter?

This is probably a one-liner; we just need to find somebody with access to a Shifter system to test it.
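
Roughly, and assuming Nextflow's shifter config scope can be switched on the same way as the other engine scopes, the profile might look like the sketch below. It is untested, for exactly the reason given above.

```groovy
// Untested sketch: a Shifter profile mirroring the other engine profiles.
// The profile name is an assumption; `shifter.enabled` is the Nextflow config setting.
profiles {
    shifter {
        shifter.enabled = true
    }
}
```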

phue added a commit to phue/tools that referenced this issue Jan 14, 2021
@phue phue closed this as completed Feb 9, 2021
@reidpr

reidpr commented Mar 2, 2021

Thanks for supporting Charliecloud! Project lead here. Please let us know how we can help make your job easier; we've merged many changes over time that help packagers.

FWIW, we do now support ENV, if that helps.

@phue
Member Author

phue commented Mar 2, 2021

Thanks for reaching out @reidpr !
The ENV support is great, makes things much simpler for us.
Nextflow already has the changes merged (nextflow-io/nextflow@8c7f059); after the next Nextflow release we can simplify the configuration here 👍
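
As a rough sketch of that simplification, assuming the container's ENV is honored end to end once that release is out, the env block with the hard-coded conda PATH could presumably be dropped, leaving only the engine switch. Which Nextflow and Charliecloud versions are required is not stated in this thread, so no version constraint is shown.

```groovy
// Sketch only, assuming ENV from the image is applied inside the container.
// The minimum Nextflow version would need raising to whichever release
// ships the change referenced above; that version is not named here.
charliecloud {
    enabled = true
}
```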
