Workflow scheduling
This page describes how workflow scheduling works in the REANA platform.
In general, workflows are scheduled as follows:
- The user requests `reana-server` to start a workflow;
- `reana-server` calculates the workflow priority and complexity based on the configured scheduling strategy;
- `reana-server` publishes a message to the `workflow-submission` queue with the workflow details, priority, and complexity;
- The workflow scheduler picks up the message from the queue and checks whether the workflow can be scheduled;
- If the workflow can be scheduled, the workflow scheduler sends a request to `reana-workflow-controller` to start the workflow;
- If the workflow cannot be scheduled, the workflow scheduler can either:
  - publish a message to the `workflow-submission` queue to try again;
  - fail the workflow.
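The decision made by the workflow scheduler in the steps above can be sketched in a few lines of Python. This is only an illustration, not REANA's implementation: the queue is an in-memory `deque` that ignores priorities, and `schedule_next`, `can_be_scheduled`, `start_workflow`, and `MAX_RETRIES` are hypothetical names introduced here.

```python
from collections import deque

MAX_RETRIES = 2  # hypothetical limit, not a documented REANA setting


def schedule_next(queue, can_be_scheduled, start_workflow):
    """Consume one submission message and start, requeue, or fail the workflow."""
    message = queue.popleft()
    if can_be_scheduled(message):
        # Stands in for the request sent to reana-workflow-controller.
        start_workflow(message["workflow_id_or_name"])
        return "started"
    retries = message.get("retry_count", 0)
    if retries < MAX_RETRIES:
        # Publish the message again so that scheduling is retried later.
        queue.append({**message, "retry_count": retries + 1})
        return "requeued"
    return "failed"


# Simulated workflow-submission queue holding one message from reana-server.
queue = deque([{"workflow_id_or_name": "wf-a", "priority": 10, "retry_count": 0}])
started = []

# With no capacity available, the workflow is requeued twice and then failed.
outcomes = [schedule_next(queue, lambda message: False, started.append) for _ in range(3)]
print(outcomes)  # ['requeued', 'requeued', 'failed']
```

The `retry_count` field mirrors the field of the same name in the message schema; how many retries are allowed before a workflow is failed is an assumption of this sketch.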
In the following sections, we will go deeper into the details of each step.
Tip: The Architecture page provides an overview diagram of the REANA platform that can be helpful when reading this page.
The `workflow-submission` queue is used to submit workflows to the scheduler.
- publishes to the queue: reana-server
- consumes from the queue: workflow scheduler
Message schema:
{
  "$id": "reana/workflow-submission-message.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "workflow-submission message",
  "description": "Describes workflow submission message for scheduler",
  "type": "object",
  "properties": {
    "user": {
      "description": "The unique UUID identifier for a user",
      "type": "string"
    },
    "workflow_id_or_name": {
      "description": "The unique UUID identifier or name for a workflow",
      "type": "string"
    },
    "priority": {
      "description": "Priority number of the workflow",
      "type": "integer"
    },
    "min_job_memory": {
      "description": "Minimum amount of memory requested by a single job of the workflow",
      "type": "integer"
    },
    "parameters": {
      "type": "object"
    },
    "retry_count": {
      "description": "Number of times the workflow submission was retried",
      "type": "integer"
    }
  },
  "required": ["user", "workflow_id_or_name", "priority", "min_job_memory"]
}
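To make the schema more concrete, here is a hedged example of a message that satisfies it, together with a simplified check of required fields and types. The check uses only the standard library and is not a full JSON Schema validator (for instance, it rejects `4.0` as an integer, which JSON Schema would accept). The function name `validate_submission` and all field values are made up for illustration.

```python
import json

REQUIRED = ("user", "workflow_id_or_name", "priority", "min_job_memory")
FIELD_TYPES = {
    "user": str,
    "workflow_id_or_name": str,
    "priority": int,
    "min_job_memory": int,
    "parameters": dict,
    "retry_count": int,
}


def validate_submission(message):
    """Return a list of problems found in a workflow-submission message."""
    problems = [
        f"missing required field: {name}" for name in REQUIRED if name not in message
    ]
    for name, expected in FIELD_TYPES.items():
        if name not in message:
            continue
        value = message[name]
        # bool is a subclass of int in Python but is not a JSON "integer".
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    return problems


# Illustrative values only; they are not taken from a real REANA deployment.
message = {
    "user": "00000000-0000-0000-0000-000000000000",
    "workflow_id_or_name": "roofit",
    "priority": 98,
    "min_job_memory": 4294967296,
    "parameters": {},
    "retry_count": 0,
}

# Round-trip through JSON, as the message would travel over the queue.
payload = json.dumps(message)
print(validate_submission(json.loads(payload)))  # an empty list means no problems
```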
This is a priority queue: messages with a higher integer in the `priority` field should be consumed first.
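The consumption order can be illustrated with an in-memory stand-in built on `heapq`. This is a sketch of the ordering rule only, not of the message broker itself. The class name `PrioritySubmissionQueue` is hypothetical, and serving equal priorities in first-in first-out order is an assumption of this example.

```python
import heapq
import itertools


class PrioritySubmissionQueue:
    """In-memory stand-in for a priority queue of submission messages."""

    def __init__(self):
        self._heap = []
        # Tie-breaker: keeps first-in first-out order among equal priorities.
        self._order = itertools.count()

    def publish(self, message):
        # heapq pops the smallest item first, so the priority is negated.
        heapq.heappush(self._heap, (-message["priority"], next(self._order), message))

    def consume(self):
        return heapq.heappop(self._heap)[2]


queue = PrioritySubmissionQueue()
for name, priority in [("wf-low", 10), ("wf-high", 90), ("wf-mid", 50), ("wf-high-2", 90)]:
    queue.publish({"workflow_id_or_name": name, "priority": priority})

consumed = [queue.consume()["workflow_id_or_name"] for _ in range(4)]
print(consumed)  # ['wf-high', 'wf-high-2', 'wf-mid', 'wf-low']
```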
Currently, REANA supports two scheduling strategies:
- `fifo`, a first-in first-out strategy that starts workflows as they come;
- `balanced`, a weighted strategy that takes into account existing multi-user workloads and the complexity of incoming workflows.
Workflow complexity is an internal REANA concept that helps decide which workflow to schedule when the `balanced` strategy is used.
It expresses how many jobs the workflow would like to start and how much memory each individual job would consume.
Symbolically, a workflow complexity value looks like `[(4, 4G), (3, 2G)]`, meaning that when the given workflow starts, it would like to launch 4 jobs of 4 GB RAM each and 3 jobs of 2 GB RAM each.
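As a small worked example, the symbolic value above can be written as a list of `(number of jobs, memory per job in bytes)` tuples and summarised. The helper names are hypothetical, and reading "4G" as 4 × 2^30 bytes is an assumption. That the smallest per-job memory relates to the `min_job_memory` field of the submission message is likewise an assumption.

```python
GIB = 1024 ** 3  # assumption: "4G" is read as 4 * 2**30 bytes

# The symbolic value [(4, 4G), (3, 2G)] as (number of jobs, memory per job in bytes).
complexity = [(4, 4 * GIB), (3, 2 * GIB)]


def total_jobs(complexity):
    """Number of jobs the workflow would like to start."""
    return sum(jobs for jobs, _ in complexity)


def total_memory(complexity):
    """Memory in bytes needed if all jobs run at the same time."""
    return sum(jobs * memory for jobs, memory in complexity)


def smallest_job_memory(complexity):
    """Memory in bytes of the least demanding job."""
    return min(memory for _, memory in complexity)


print(total_jobs(complexity))                  # 7
print(total_memory(complexity) // GIB)         # 22
print(smallest_job_memory(complexity) // GIB)  # 2
```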
The workflow complexity numbers for a given workflow can be obtained by parsing the workflow DAG specification and determining how many jobs will be started in parallel upon launch and how much memory (`kubernetes_memory_limit`) each job asks for.
Although the workflow complexity logic belongs to `reana-server`, it can be tested without a running cluster by importing the appropriate functions in a Python shell:
$ mkvirtualenv foo
$ pip install ../reana-client ../reana-server ipython
$ cd ../reana-demo-root6-roofit
$ ipython
and then in the Python REPL:
In [1]: from reana_client.utils import load_reana_spec
In [2]: from reana_server.complexity import estimate_complexity
In [3]: reana_yaml = load_reana_spec('./reana.yaml')
==> Verifying REANA specification file... ./reana.yaml
-> SUCCESS: Valid REANA specification file.
==> Verifying REANA specification parameters...
-> SUCCESS: REANA specification parameters appear valid.
==> Verifying workflow parameters and commands...
-> SUCCESS: Workflow parameters and commands appear valid.
==> Verifying dangerous workflow operations...
-> SUCCESS: Workflow operations appear valid.
In [4]: from reana_server import complexity as reana_server_complexity
In [5]: reana_server_complexity.REANA_COMPLEXITY_JOBS_MEMORY_LIMIT = '4Gi'
In [6]: estimate_complexity('serial', reana_yaml)
Out[6]: [(1, 4294967296.0)]
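The second element of the tuple in the output is the per-job memory in bytes: `'4Gi'` corresponds to 4 × 1024³ = 4294967296. The conversion can be reproduced with a small helper. Note that `memory_to_bytes` is a hypothetical function written for this page, not the one used by `reana-server`, and it only handles the binary suffixes `Ki`, `Mi`, `Gi`, and `Ti`.

```python
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3, "Ti": 1024 ** 4}


def memory_to_bytes(value):
    """Convert a memory string such as '4Gi' to a number of bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * factor
    # No known suffix: interpret the value as a plain number of bytes.
    return float(value)


print(memory_to_bytes("4Gi"))  # 4294967296.0
```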