GRomR1/influxdb-slurm-monitoring

 
 

Introduction

The acct_gather_profile/influxdb plugin is built on the same base as the HDF5 profiling plugin. It lets Slurm coordinate the collection of data on the jobs it runs on a cluster, at a level of detail that is not practical to store in Slurm's own database. The data comes from periodic samples of performance metrics gathered by Slurm, the operating system, or component software. The plugin records the data from each source as a time series in a custom InfluxDB server.

The plugin collects exactly the same information as the HDF5 plugin:

| Measurement    | Description                                  |
|----------------|----------------------------------------------|
| CPUFrequency   | CPU frequency at time of sample              |
| CPUTime        | Seconds of CPU time used during the sample   |
| CPUUtilization | CPU utilization during the interval          |
| Pages          | Pages used in sample                         |
| ReadMB         | Number of megabytes read from local disk     |
| RSS            | Value of RSS at time of sample               |
| VMSize         | Value of VM size at time of sample           |
| WriteMB        | Number of megabytes written to local disk    |
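
To make the idea of "one time series per measurement" concrete, here is a minimal sketch of how a single sample could be turned into an InfluxDB line-protocol record. It is an illustration only, not the plugin's actual code: the `struct sample` layout, the `format_sample` function, and the tag names `job`, `step`, and `host` are all assumptions, and the timestamp is assumed to be in seconds (which would require a matching precision setting on the write request).

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sample layout; the real plugin's fields may differ. */
struct sample {
    const char *measurement; /* e.g. "CPUTime" or "RSS" */
    uint32_t job_id;         /* assumed tag: job */
    uint32_t step_id;        /* assumed tag: step */
    const char *host;        /* assumed tag: host */
    double value;            /* sampled value */
    uint64_t timestamp;      /* seconds since the epoch (assumption) */
};

/*
 * Write one line-protocol record into out:
 *   <measurement>,job=<id>,step=<id>,host=<name> value=<v> <timestamp>\n
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the record does not fit in out_len bytes.
 */
int format_sample(char *out, size_t out_len, const struct sample *s)
{
    int n = snprintf(out, out_len,
                     "%s,job=%" PRIu32 ",step=%" PRIu32 ",host=%s"
                     " value=%.2f %" PRIu64 "\n",
                     s->measurement, s->job_id, s->step_id, s->host,
                     s->value, s->timestamp);

    if (n < 0 || (size_t)n >= out_len)
        return -1; /* encoding error or truncated output */
    return n;
}
```

Under these assumptions, a `CPUTime` sample of 1.5 seconds for job 42, step 0 on `node01` would be encoded as `CPUTime,job=42,step=0,host=node01 value=1.50 1500000000`.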

A small buffer (16 KB) is used to avoid sending data for every sample collected. When a task ends, the plugin sends any data still held in the buffer.
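
The buffering behaviour described above can be sketched roughly as follows. This is not the plugin's source: `sample_buffer`, `buffer_add`, `buffer_flush`, and the `send_fn` callback are hypothetical names chosen for illustration, and `record_send` is a stand-in that only counts calls where the real plugin would send the data to InfluxDB.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE (16 * 1024) /* 16 KB, the buffer size mentioned above */

/* Hypothetical send callback: returns 0 on success, non-zero on failure. */
typedef int (*send_fn)(const char *data, size_t len, void *ctx);

struct sample_buffer {
    char data[BUF_SIZE];
    size_t used;  /* bytes currently buffered */
    send_fn send; /* how buffered data leaves the node */
    void *ctx;    /* opaque argument passed to send */
};

void buffer_init(struct sample_buffer *b, send_fn send, void *ctx)
{
    b->used = 0;
    b->send = send;
    b->ctx = ctx;
}

/* Send whatever is buffered. Call this when the task ends.
 * In this sketch the buffer is emptied even if the send fails. */
int buffer_flush(struct sample_buffer *b)
{
    int rc = 0;

    if (b->used > 0) {
        rc = b->send(b->data, b->used, b->ctx);
        b->used = 0;
    }
    return rc;
}

/* Append one record; flush first if it would not fit. */
int buffer_add(struct sample_buffer *b, const char *line, size_t len)
{
    if (len > BUF_SIZE)
        return -1; /* record can never fit */
    if (b->used + len > BUF_SIZE) {
        int rc = buffer_flush(b);
        if (rc != 0)
            return rc;
    }
    memcpy(b->data + b->used, line, len);
    b->used += len;
    return 0;
}

/* Stand-in sender for experiments: counts calls and bytes only. */
struct send_stats {
    size_t calls;
    size_t bytes;
};

int record_send(const char *data, size_t len, void *ctx)
{
    struct send_stats *st = ctx;

    (void)data; /* a real sender would transmit these bytes */
    st->calls++;
    st->bytes += len;
    return 0;
}
```

With this design, samples trigger a network request only when the buffer fills up, plus one final `buffer_flush` call at task end so that no buffered samples are lost.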

Data is sent to the central server using libcurl (the libcurl-devel package), so you should pass this option to configure:

--with-libcurl

It is a good idea to have a web layer over your InfluxDB server, such as Grafana, in order to visualize the data.

Here you can find some Screenshots.

Please refer to INSTALL.md for installation instructions.
