Skip to content

Commit

Permalink
Sentieon license server - testing (#5856)
Browse files Browse the repository at this point in the history
* test it out

* ci: Add license_message script

nf-core/sarek#1380
https://github.com/DonFreed/docker-actions-test?tab=readme-ov-file

* test: Add tests for sentieon bwamem

Because the pytest-workflow tests are taking so long

* Add scratch from meeting

* Clean up sentieon secrets

* ci: Fix Nextflow secrets setup

* fix(sentieon): Remove encryption key because it won't get used

* fix #

* ci: Fix when to setup secrets

* feat: Add sentieon auth mech secret

* chore(sentieon): Remove bwamem pytest-workflow tests

* fix: Spike in with environment variables

* fix: Add SENTIEON_AUTH_DATA in env file

The reason for this is because when we require it in the module it
fails. That doesn't make sense for normal Sentieon users. They don't
care, this is more of a GitHub actions thing.

* docs(sentieon): Make this a README

* test: Add local testing setup

* test: Remove publishDir

* style: Remove sentieon/bwamem

* fix: Change comments to prints for debugging and better reporting

* fix: Add a quote?

* chore: Remove unnecessary variable

* debug: Print variables

* Fix typo

Co-authored-by: Anders Sune Pedersen <37172585+asp8200@users.noreply.github.com>

* chore: Clean up code

* docs: Write up all the learnings from Sentieon

* style: Run prettier

* style: Add tags for nf-core lint

* docs: Clean up notes

Co-authored-by: maxulysse <maxulysse@users.noreply.github.com>

* chore: Add @DonFreed as maintainer

---------

Co-authored-by: Edmund Miller <git@edmundmiller.dev>
Co-authored-by: Edmund Miller <20095261+edmundmiller@users.noreply.github.com>
Co-authored-by: Anders Sune Pedersen <37172585+asp8200@users.noreply.github.com>
Co-authored-by: maxulysse <maxulysse@users.noreply.github.com>
  • Loading branch information
5 people authored Jul 8, 2024
1 parent e4fc46a commit ef272ea
Show file tree
Hide file tree
Showing 14 changed files with 451 additions and 125 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
Validating CODEOWNERS rules …
Original file line number Diff line number Diff line change
Expand Up @@ -854,6 +854,7 @@ modules/nf-core/seacr/callpeak/** @chris-cheshire
modules/nf-core/segemehl/align/** @BarryDigby
modules/nf-core/segemehl/index/** @BarryDigby
modules/nf-core/semibin/singleeasybin/** @BigDataBiology
modules/nf-core/sentieon/** @DonFreed
modules/nf-core/sentieon/applyvarcal/** @assp8200
modules/nf-core/sentieon/bwaindex/** @drpatelh @maxulysse
modules/nf-core/sentieon/bwamem/** @asp8200
Expand Down
111 changes: 111 additions & 0 deletions .github/scripts/license_message.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
#!/usr/bin/env python3

"""
Functions for generating and sending license messages
"""

# Modified from - https://stackoverflow.com/a/59835994

import argparse
import base64
import calendar
import re
import secrets
import sys

from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from datetime import datetime as dt

MESSAGE_TIMEOUT = 60 * 60 * 24 # Messages are valid for 1 day
NONCE_BYTES = 12


class DecryptionTimeout(Exception):
# Decrypting a message that is too old
pass


def generate_key():
key = secrets.token_bytes(32)
return key


def handle_generate_key(args):
key = generate_key()
key_b64 = base64.b64encode(key)
print(key_b64.decode("utf-8"), file=args.outfile)


def encrypt_message(key, message):
nonce = secrets.token_bytes(NONCE_BYTES)
timestamp = calendar.timegm(dt.now().utctimetuple())
data = timestamp.to_bytes(10, byteorder="big") + b"__" + message
ciphertext = nonce + AESGCM(key).encrypt(nonce, data, b"")
return ciphertext


def handle_encrypt_message(args):
key = base64.b64decode(args.key.encode("utf-8"))
message = args.message.encode("utf-8")
ciphertext = encrypt_message(key, message)
ciphertext_b64 = base64.b64encode(ciphertext)
print(ciphertext_b64.decode("utf-8"), file=args.outfile)


def decrypt_message(key, ciphertext, timeout=MESSAGE_TIMEOUT):
nonce, ciphertext = ciphertext[:NONCE_BYTES], ciphertext[NONCE_BYTES:]
message = AESGCM(key).decrypt(nonce, ciphertext, b"")

msg_timestamp, message = re.split(b"__", message, maxsplit=1)
msg_timestamp = int.from_bytes(msg_timestamp, byteorder="big")
timestamp = calendar.timegm(dt.now().utctimetuple())
if (timestamp - msg_timestamp) > timeout:
raise DecryptionTimeout("The message has an expired timeout")
return message.decode("utf-8")


def handle_decrypt_message(args):
key = base64.b64decode(args.key.encode("utf-8"))
ciphertext = base64.b64decode(args.message.encode("utf-8"))
message = decrypt_message(key, ciphertext, timeout=args.timeout)
print(str(message), file=args.outfile)


def parse_args(argv=None):
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--outfile",
default=sys.stdout,
type=argparse.FileType("w"),
help="The output file",
)

subparsers = parser.add_subparsers(help="Available sub-commands")

gen_parser = subparsers.add_parser(
"generate_key", help="Generate a random key string"
)
gen_parser.set_defaults(func=handle_generate_key)

encrypt_parser = subparsers.add_parser("encrypt", help="Encrypt a message")
encrypt_parser.add_argument("--key", required=True, help="The encryption key")
encrypt_parser.add_argument("--message", required=True, help="Message to encrypt")
encrypt_parser.set_defaults(func=handle_encrypt_message)

decrypt_parser = subparsers.add_parser("decrypt", help="Decyrpt a message")
decrypt_parser.add_argument("--key", required=True, help="The encryption key")
decrypt_parser.add_argument("--message", required=True, help="Message to decrypt")
decrypt_parser.add_argument(
"--timeout",
default=MESSAGE_TIMEOUT,
type=int,
help="A message timeout. Decryption will fail for older messages",
)
decrypt_parser.set_defaults(func=handle_decrypt_message)

return parser.parse_args(argv)


if __name__ == "__main__":
args = parse_args()
args.func(args)
29 changes: 13 additions & 16 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -466,14 +466,13 @@ jobs:
# Set up secrets
- name: Set up nextflow secrets
if: env.SENTIEON_LICENSE_BASE64 != null
env:
SENTIEON_LICENSE: ${{ secrets.LICSRVR_IP }}
SENTIEON_AUTH_MECH: "GitHub Actions - token"
ENCRYPTION_KEY: ${{ secrets.SENTIEON_ENCRYPTION_KEY }}
LICENSE_MESSAGE: ${{ secrets.SENTIEON_LICENSE_MESSAGE }}
run: |
nextflow secrets set SENTIEON_LICENSE_BASE64 ${{ secrets.SENTIEON_LICENSE_BASE64 }}
nextflow secrets set SENTIEON_AUTH_MECH_BASE64 ${{ secrets.SENTIEON_AUTH_MECH_BASE64 }}
SENTIEON_ENCRYPTION_KEY=$(echo -n "${{ secrets.ENCRYPTION_KEY_BASE64 }}" | base64 -d)
SENTIEON_LICENSE_MESSAGE=$(echo -n "${{ secrets.LICENSE_MESSAGE_BASE64 }}" | base64 -d)
SENTIEON_AUTH_DATA=$(python3 tests/modules/nf-core/sentieon/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
SENTIEON_AUTH_DATA_BASE64=$(echo -n "$SENTIEON_AUTH_DATA" | base64 -w 0)
nextflow secrets set SENTIEON_AUTH_DATA_BASE64 $SENTIEON_AUTH_DATA_BASE64
nextflow secrets set SENTIEON_AUTH_DATA $(python3 tests/modules/nf-core/sentieon/license_message.py encrypt --key "${{ secrets.SENTIEON_ENCRYPTION_KEY }}" --message "$SENTIEON_LICENSE_MESSAGE")
# Test the module
- name: Run pytest-workflow
Expand Down Expand Up @@ -649,7 +648,6 @@ jobs:
path: subworkflows/nf-core/fastq_align_bwa
env:
NXF_ANSI_LOG: false
SENTIEON_LICENSE_BASE64: ${{ secrets.SENTIEON_LICENSE_BASE64 }}
NFTEST_VER: "0.8.4"

steps:
Expand Down Expand Up @@ -707,21 +705,20 @@ jobs:
# Set up secrets
- name: Set up nextflow secrets
if: env.SENTIEON_LICENSE_BASE64 != null
# FIXME if: secrets.SENTIEON_ENCRYPTION_KEY != null
env:
SENTIEON_ENCRYPTION_KEY: ${{ secrets.SENTIEON_ENCRYPTION_KEY }}
SENTIEON_LICENSE_MESSAGE: ${{ secrets.SENTIEON_LICENSE_MESSAGE }}
run: |
nextflow secrets set SENTIEON_LICENSE_BASE64 ${{ secrets.SENTIEON_LICENSE_BASE64 }}
nextflow secrets set SENTIEON_AUTH_MECH_BASE64 ${{ secrets.SENTIEON_AUTH_MECH_BASE64 }}
SENTIEON_ENCRYPTION_KEY=$(echo -n "${{ secrets.ENCRYPTION_KEY_BASE64 }}" | base64 -d)
SENTIEON_LICENSE_MESSAGE=$(echo -n "${{ secrets.LICENSE_MESSAGE_BASE64 }}" | base64 -d)
SENTIEON_AUTH_DATA=$(python3 tests/modules/nf-core/sentieon/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
SENTIEON_AUTH_DATA_BASE64=$(echo -n "$SENTIEON_AUTH_DATA" | base64 -w 0)
nextflow secrets set SENTIEON_AUTH_DATA_BASE64 $SENTIEON_AUTH_DATA_BASE64
nextflow secrets set SENTIEON_AUTH_DATA $(python3 tests/modules/nf-core/sentieon/license_message.py encrypt --key "${{ secrets.SENTIEON_ENCRYPTION_KEY }}" --message "$SENTIEON_LICENSE_MESSAGE")
# Test the module
- name: Run nf-test
env:
NFT_DIFF: "pdiff"
NFT_DIFF_ARGS: "--line-numbers --width 120 --expand-tabs=2"
SENTIEON_LICSRVR_IP: ${{ secrets.SENTIEON_LICSRVR_IP }}
SENTIEON_AUTH_MECH: "GitHub Actions - token"
run: |
# use "docker_self_hosted" if it runs on self-hosted runner and matrix.profile=docker
if [ "${{ matrix.profile }}" == "docker" ]; then
Expand Down
78 changes: 78 additions & 0 deletions modules/nf-core/sentieon/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
# Sentieon Modules

## Usecases we're trying to cover

In rank order:

1. Paying Sentieon users running a license server inside their AWS account(or on-prem).
2. GitHub actions using a _test_ license server that can receive traffic from the internet. This has an auth layer to keep anyone from connecting to it.
3. Pipeline developers working with Sentieon modules locally with their own license server.
4. Pipeline developers working with Sentieon modules locally with access to the `SENTIEON_LICSRVR_IP` and `SENTIEON_ENCRYPTION_KEY`.
5. People using a local file(Sentieon doesn't use these test licenses often anymore.)

## How to use the sentieon modules

Except the Sentieon bwaindex module, these nextflow Sentieon modules require a license. Sentieon supply license in the form of a string-value (a url ex: `12.12.12.12:8990`) or a file. The file should be base64-encoded and stored in a nextflow secret named `SENTIEON_LICENSE_BASE64`. If a license string (url) is supplied, then the nextflow secret should be set like this:

### Most Users

<!-- NOTE Might be SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" -->

```nextflow title="nextflow.config"
env {
SENTIEON_LICSRVR_IP = "$SENTIEON_LICSRVR_IP"
// or if you really want to use `nextflow secrets`
SENTIEON_LICSRVR_IP = secrets.SENTIEON_LICSRVR_IP
}
```

### License File Users

If a license file is supplied, then the nextflow secret should be set like this:

```bash
nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat <sentieon_license_file.lic> | base64 -w 0)
```

And in the config set

```nextflow title="nextflow.config"
env {
SENTIEON_LICENSE_BASE64 = secrets.SENTIEON_LICENSE_BASE64
}
```

## GitHub Actions

### Spiking in all the Sentieon variables

Previously we tried to get everything into one Nextflow secret. This made the setup extremely difficult to follow and reproduce.

The issue is that Nextflow secrets that are declared in a process cause it to fail if they're not used. This made it so we had to branch our logic off the size of the secret. Again, just making a confusing setup, even more confusing.

We tried to follow as closely as possible [DonFreed/docker-actions-test](https://github.com/DonFreed/docker-actions-test) and keep the process simple. At the end of the day Sentieon is also validating the license servers and can just make a new license for us if someone tries to misuse ours.

The main experience should be on end users of the Sentieon modules ease of use.

### Threat Models

After working with [@DonFreed](https://github.com/DonFreed) on a [rework for GitHub actions](https://github.com/nf-core/modules/pull/5856), we determined that using Nextflow secrets wasn't really necessary outside of the `SENTIEON_AUTH_DATA`.

Because the SENTIEON_ENCRYPTION_KEY and SENTIEON_LICENSE_MESSAGE are both stored in GitHub Secrets(and 1Password) contributors will never have access to them. They are just used to interact with the `license_message` script that generates some auth data to then come into contact with the server.

The `SENTIEON_AUTH_DATA` is only valid for 1 day, and the license that we have is valid for only a few CPUs.

The Server IP doesn't matter either because they would also need the `SENTIEON_ENCRYPTION_KEY` to authenticate with the license.

## Local Testing

```bash
export SENTIEON_AUTH_MECH="GitHub Actions - token"
export SENTIEON_LICSRVR_IP=$(op read "op://Dev/Sentieon License Server/SENTIEON_LICSRVR_IP")
SENTIEON_ENCRYPTION_KEY=$(op read "op://Dev/Sentieon License Server/GitHub Secrets/SENTIEON_ENCRYPTION_KEY")
SENTIEON_LICENSE_MESSAGE=$(op read "op://Dev/Sentieon License Server/GitHub Secrets/SENTIEON_LICENSE_MESSAGE")
nextflow secrets set SENTIEON_AUTH_DATA $(python3 tests/modules/nf-core/sentieon/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
```

> [!NOTE]
> If this fails run `op signin` to flip to nf-core account
22 changes: 13 additions & 9 deletions modules/nf-core/sentieon/bwamem/main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,9 @@ process SENTIEON_BWAMEM {
label 'process_high'
label 'sentieon'

secret 'SENTIEON_LICENSE_BASE64'
// NOTE this is not required for the process to run, but it is required for the process to be run in GitHub actions or nf-core MegaTests
// The rest of the secrets aren't really "secrets" because they're not sensitive information for users
// secret SENTIEON_AUTH_DATA

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
Expand Down Expand Up @@ -39,20 +41,22 @@ process SENTIEON_BWAMEM {
def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: ''

"""
if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url.
export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d)
else # Localhost license file
if [ "\${SENTIEON_LICSRVR_IP}" ]; then
# NOTE: This is how pipeline users will use Sentieon in real world
echo "Using a Sentieon License Server"
export SENTIEON_LICENSE="\${SENTIEON_LICSRVR_IP}"
else
# NOTE: This is how pipeline users will test out Sentieon
echo "Localhost license file"
# The license file is stored as a nextflow variable like, for instance, this:
# nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat <sentieon_license_file.lic> | base64 -w 0)
export SENTIEON_LICENSE=\$(mktemp)
echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE
fi
if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then
# If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license.
export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d)
export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d)
echo "Decoded and exported Sentieon test-license system environment variables"
if [ "\${SENTIEON_AUTH_MECH}" ] && [ "\${SENTIEON_AUTH_DATA}" ]; then
# NOTE: This should only happen in GitHub Actions or nf-core/megatests
echo "If sentieon_auth_mech and sentieon_auth_data are non-empty strings, then Sentieon is mostly likely being run with some test-license."
fi
$fix_ld_library_path
Expand Down
1 change: 1 addition & 0 deletions modules/nf-core/sentieon/bwamem/meta.yml
Original file line number Diff line number Diff line change
Expand Up @@ -73,3 +73,4 @@ authors:
- "@asp8200"
maintainers:
- "@asp8200"
- "@DonFreed"
Loading

0 comments on commit ef272ea

Please sign in to comment.