This repository was archived by the owner on Feb 28, 2024. It is now read-only.

eth-cscs/slurm-replay

Slurm-Replay

Slurm-Replay replays traces of jobs scheduled on an HPC system using Slurm. By using the same Slurm configuration and the unmodified Slurm code base of a production HPC system, one can replay the jobs that were submitted to it. Slurm-Replay makes it possible to investigate different Slurm configurations or policies and to see their impact on a production workload.

Documentation

For more information, check out the

Citation

There is an accompanying paper in the proceedings of SC 2018. We would appreciate a citation.

@inproceedings{
  author={M. Martinasso and M. Gila and M. Bianco and S. R. Alam and C. McMurtrie and T. C. Schulthess},
  title={{RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management}},
  year={2018},
  month={Nov.},
  booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC18)},
  location={Dallas, Texas},
  publisher={IEEE Press},
  pages={},
  isbn={},
}

License

Slurm-Replay is published under the BSD 3-clause license, see here.

Contribute

You are very welcome to contribute to Slurm-Replay.

If you want to contribute code, there are a few things to consider:

  • a good start is to fork the repository
  • use GitHub pull requests to merge your contribution
  • consider documenting your code according

TODO list