The project evaluates the integration and performance of the Kueue batch scheduling system within the REANA platform, an open-source solution for reproducible and declarative data analysis in containerized computing clouds. Because REANA supports analysis workflows for researchers in High-Energy Physics (HEP), the project focuses on testing the viability of Kueue for runtime user jobs, laying the groundwork for future developments that may introduce fair-share capabilities within the REANA ecosystem.
REANA is a reusable and reproducible research data analysis platform. It helps researchers to structure their input data, analysis code, containerised environments and computational workflows so that the analysis can be instantiated and run on remote compute clouds.
REANA was born to target the use case of particle physics analyses, but is applicable to any scientific discipline. The system paves the way towards reusing and reinterpreting preserved data analyses even several years after the original publication.
- structure research data analysis in reusable manner
- instantiate computational workflows on remote clouds
- rerun analyses with modified input data, parameters or code
- support for several compute clouds (Kubernetes/OpenStack)
- support for several workflow specifications (CWL, Serial, Yadage, Snakemake)
- support for several shared storage systems (Ceph)
- support for several container technologies (Docker)
You can install REANA locally, deploy it at scale on premises (in about 10 minutes) or use https://reana.cern.ch. Once the system is ready, you can follow the guide to run your first example. For more in-depth information, visit the official REANA documentation.
- Discuss on Forum
- Follow us on Twitter
- Collaborate on GitHub
Kueue is a set of APIs and a controller for job queueing. It is a job-level manager that decides when a job should be admitted to start (that is, when its pods can be created) and when it should stop (that is, when its active pods should be deleted).
Read the overview to learn more.
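To make the admit/stop behaviour above concrete, here is a minimal sketch of how a runtime job could be handed to Kueue. It builds a standard `batch/v1` Job manifest that is created suspended and labelled with `kueue.x-k8s.io/queue-name`, so that Kueue, rather than the submitter, decides when pods may start. The helper name `build_suspended_job`, the queue name `user-queue`, the job name, the image and the resource requests are all illustrative assumptions; this is not how REANA itself creates jobs.

```python
import json

# Label that Kueue reads to find the LocalQueue a Job is submitted to.
QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def build_suspended_job(name, queue, image, command, cpu="1", memory="200Mi"):
    """Return a batch/v1 Job manifest (as a dict) intended to be managed by Kueue.

    The Job is created with suspend=True, so no pods exist until Kueue
    admits the workload and unsuspends the Job.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "labels": {QUEUE_LABEL: queue}},
        "spec": {
            "suspend": True,  # Kueue flips this to False on admission
            "parallelism": 1,
            "completions": 1,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "main",
                            "image": image,
                            "command": list(command),
                            # Requests are what Kueue counts against quota.
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory}
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    job = build_suspended_job("sample-job", "user-queue", "busybox", ["sleep", "30"])
    # JSON is also valid input for `kubectl apply -f -`.
    print(json.dumps(job, indent=2))
```

The printed manifest can be piped to `kubectl apply -f -` on a cluster where Kueue is installed and a LocalQueue with the chosen name exists in the target namespace.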
- Job management: Support job queueing based on priorities with different strategies: StrictFIFO and BestEffortFIFO.
- Resource management: Support resource fair sharing and preemption with a variety of policies between different tenants.
- Dynamic resource reclaim: A mechanism to release quota as the pods of a Job complete.
- Resource flavor fungibility: Quota borrowing or preemption in ClusterQueue and Cohort.
- Integrations: Built-in support for popular jobs, e.g. BatchJob, Kubeflow training jobs, RayJob, RayCluster, JobSet, plain Pod.
- System insight: Built-in Prometheus metrics to help monitor the state of the system, as well as Conditions.
- AdmissionChecks: A mechanism for internal or external components to influence whether a workload can be admitted.
- Advanced autoscaling support: Integration with cluster-autoscaler's provisioningRequest via admissionChecks.
- Sequential admission: A simple implementation of all-or-nothing scheduling.
- Partial admission: Allows jobs to run with a smaller parallelism, based on available quota, if the application supports it.
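The difference between the two queueing strategies listed above can be illustrated with a small, self-contained simulation. This is a deliberately simplified model and not Kueue's implementation: it considers a single resource (CPU), a single admission pass, and ignores preemption, cohorts, borrowing and resource flavors. The `Workload` class and the `admit` function are hypothetical names introduced only for this sketch. Under both strategies, workloads are ordered by priority and then by creation time; under StrictFIFO a workload that does not fit blocks everything behind it, while under BestEffortFIFO later workloads that fit may still be admitted.

```python
from dataclasses import dataclass

STRATEGIES = ("StrictFIFO", "BestEffortFIFO")


@dataclass(frozen=True)
class Workload:
    name: str
    priority: int  # higher value is served first
    created: int   # creation order; lower value is older
    cpu: int       # CPU units requested


def queue_order(workloads):
    """Order workloads by priority (descending), then by age (oldest first)."""
    return sorted(workloads, key=lambda w: (-w.priority, w.created))


def admit(workloads, free_cpu, strategy):
    """Return the names of the workloads admitted in one pass over the queue."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    admitted = []
    for w in queue_order(workloads):
        if w.cpu <= free_cpu:
            admitted.append(w.name)
            free_cpu -= w.cpu
        elif strategy == "StrictFIFO":
            # The head of the queue does not fit, so it blocks later workloads.
            break
        # BestEffortFIFO: skip the workload that does not fit and keep going.
    return admitted


if __name__ == "__main__":
    pending = [
        Workload("small-1", priority=0, created=1, cpu=2),
        Workload("large", priority=0, created=2, cpu=8),
        Workload("small-2", priority=0, created=3, cpu=2),
    ]
    for strategy in STRATEGIES:
        print(strategy, admit(pending, free_cpu=5, strategy=strategy))
```

With 5 free CPU units, the strict strategy admits only `small-1` because `large` blocks the queue, whereas the best-effort strategy also admits `small-2`. For short user jobs competing with large ones, this trade-off between ordering guarantees and utilisation is the kind of behaviour worth measuring in an evaluation.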
Requires Kubernetes 1.22 or newer.
To install the latest release of Kueue in your cluster, run the following command:
kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.6.2/manifests.yaml
Learn how to engage with the Kubernetes community on the community page and the contributor's guide.
You can reach the maintainers of this project at: