Consider a simulation study in which you generate a stochastic process and apply an estimation procedure to it. You want to vary a specific parameter of the data-generating process and save the estimated parameters of the underlying process for every simulation under every setting.
Such a simulation study can be parallelized efficiently using array jobs on a Slurm cluster.
- This demo assumes that the user wants to parallelize the execution of a simulation study using R.
- This demo assumes that the user has access to a Slurm cluster.
- All commands are assumed to be run on a Linux command line.
Locate your $HOME directory with:
echo $HOME
Create the following file tree in your $HOME directory:
└── demo_array_job_slurm
    ├── data_temp
    ├── report
    └── outfile
with:
mkdir demo_array_job_slurm
cd demo_array_job_slurm
mkdir data_temp
mkdir report
mkdir outfile
cd ..
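The same tree can also be created in a single step. The sketch below wraps `mkdir -p` in a small helper; the function name `make_demo_tree` is made up for this illustration and is not part of the original demo.

```shell
# make_demo_tree is a hypothetical helper, not part of the original demo.
# It creates demo_array_job_slurm and its three subfolders under the
# directory given as first argument.
make_demo_tree() {
    base_dir="$1"
    # -p creates missing parent folders and does not fail if a folder exists
    mkdir -p "$base_dir/demo_array_job_slurm/data_temp" \
             "$base_dir/demo_array_job_slurm/report" \
             "$base_dir/demo_array_job_slurm/outfile"
}
```

Calling `make_demo_tree "$HOME"` should give the same result as the commands above, and running it a second time leaves the existing folders untouched.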
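We consider a simulation study that generates normally distributed samples and estimates their mean and their spread (the script below computes the sample mean and the sample standard deviation), where we vary the sample size n.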
Create and save this file as demo_array_job_slurm/my_simu.R
# clean workspace
rm(list = ls())
# get environment variable
n = as.numeric(Sys.getenv("n"))
# set param
mean = 10
sd = 2
# get array job id environment variable
id_slurm <- as.numeric(Sys.getenv("SLURM_ARRAY_TASK_ID"))
# set seed
set.seed(123 + id_slurm)
# generate data
data = rnorm(n = n, mean=mean, sd = sd)
xbar = mean(data)
sd_hat = sd(data)
# create a one-row data frame with the settings and the estimates
df_to_save = data.frame(id_slurm = id_slurm, n = n, mu = mean, sd = sd,
                        xbar = xbar, sd_hat = sd_hat)
# save file for each simu
file_name = paste0("demo_array_job_slurm/data_temp/", "results_my_simu_",id_slurm ,"_",n, ".rda")
save(df_to_save, file = file_name)
# clean workspace after the simulation
rm(list = ls())
Create and save the batch file that will launch your R script as demo_array_job_slurm/
#!/bin/bash
#SBATCH --partition=shared-cpu,shared-bigmem,public-cpu,public-bigmem,public-longrun-cpu
#SBATCH --time=00-00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --ntasks=1
#SBATCH --mail-user=your_email
#SBATCH --job-name=demo_array_job_slurm
#SBATCH --mail-type=NONE
#SBATCH --output=/dev/null
#SBATCH --error=/dev/null
module load GCC/9.3.0 OpenMPI/4.0.3 R/4.0.0
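# INFILE, OUTFILE and OUTLOG must be set before this point
# redirect all output of this script to the log file
exec > "$OUTLOG" 2>&1
srun R CMD BATCH --no-save --no-restore "$INFILE" "$OUTFILE"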
The options --no-save and --no-restore are used to prevent errors such as:
In load(name, envir = .GlobalEnv) :
cannot open compressed file '.RData', probable reason 'No such file or directory'
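The batch file above refers to $INFILE, $OUTFILE and $OUTLOG, which have to be set before the `exec` line. A minimal sketch of such definitions follows. The report and log file names are assumptions chosen for illustration (only the folders and my_simu.R come from this demo), and the fallback values exist only so that the snippet can also be tried outside Slurm.

```shell
# Fallbacks so the snippet can be tried outside of a Slurm job (assumption);
# inside an array job, n is exported by the launcher and Slurm sets
# SLURM_ARRAY_TASK_ID.
n="${n:-100}"
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# R script to run (path from this demo)
INFILE="demo_array_job_slurm/my_simu.R"
# Hypothetical file names: one R report and one log per setting and task
OUTFILE="demo_array_job_slurm/report/report_n_${n}_id_${task_id}.Rout"
OUTLOG="demo_array_job_slurm/outfile/outfile_n_${n}_id_${task_id}.out"

echo "$INFILE -> $OUTFILE (log: $OUTLOG)"
```

Including both n and the array task ID in the names keeps the 50 tasks of each setting from overwriting each other's reports.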
We then create the file demo_array_job_slurm/ to launch all three settings with different values of n
#!/bin/bash
# submit one array job of 50 tasks per sample size
for n in 100 200 500
do
export n=$n
sbatch --array=1-50 demo_array_job_slurm/
done
Then, make this file executable with:
chmod u+x demo_array_job_slurm/
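The launcher relies on the fact that sbatch, by default, passes the environment of the submitting shell on to the job, so the exported n is what Sys.getenv("n") reads in my_simu.R. The sketch below imitates this mechanism without Slurm: a child shell stands in for the submitted job, which is an assumption made purely for illustration.

```shell
for n in 100 200 500
do
    # export makes n visible to processes started from this shell
    export n
    # stand-in for the submitted job: a child process reading n
    sh -c 'echo "child process sees n=$n"'
done
```

Without the export, the child process would see an empty n, and as.numeric(Sys.getenv("n")) in the R script would yield NA.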
The recombination script combines the results of all simulations into a single data frame.
Create and save this file as demo_array_job_slurm/recombine.R
# define path
folder = "demo_array_job_slurm/"
path = paste0(folder, "data_temp")
# list files
all_files = list.files(path = path)
# load first file
load(paste0(path, "/", all_files[1]))
ncol_file = ncol(df_to_save)
# create df to save
df_all_results = data.frame(matrix(NA, ncol=ncol_file))
colnames(df_all_results) = colnames(df_to_save)
# for all files load and bind
for(file_index in seq_along(all_files)){
  file_i = all_files[file_index]
  file_name = paste0(path,"/",file_i)
  # load df_to_save from the current file and append it
  load(file_name)
  df_all_results = rbind(df_all_results, df_to_save)
}
colnames(df_all_results) = colnames(df_to_save)
# drop the initial row of NA
df_all_results = df_all_results[-1,]
# save matrix of results
time = Sys.time()
time_2 = gsub(" ", "_", time)
time_3 = gsub(":", "-", time_2)
# build the output path (the .rda extension is an assumption, matching the per-task files)
file_name_to_save = paste0(folder, paste("df_results_demo_array_job_slurm_", time_3, sep="_"), ".rda")
save(df_all_results, file=file_name_to_save)
Create and save this file as demo_array_job_slurm/
#!/bin/bash
#SBATCH --job-name=recombine
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00
#SBATCH --partition=shared-cpu,shared-bigmem,public-cpu,public-bigmem
#SBATCH --mail-user=your_email
#SBATCH --mail-type=NONE
#SBATCH --output demo_array_job_slurm/outfile/outfile_recombine.out
module load GCC/9.3.0 OpenMPI/4.0.3 R/4.0.0
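The batch file above loads R, but the line that actually runs recombine.R is not shown. A minimal sketch of such a line, modeled on the one used for my_simu.R, could look as follows. The report file name is an assumption, and the check for srun is only there so that the snippet does something sensible on a machine without Slurm.

```shell
# R script from this demo and a hypothetical name for its R report
INFILE="demo_array_job_slurm/recombine.R"
OUTFILE="demo_array_job_slurm/report/recombine.Rout"

if command -v srun >/dev/null 2>&1; then
    # same call as for my_simu.R: do not save or restore a workspace
    srun R CMD BATCH --no-save --no-restore "$INFILE" "$OUTFILE"
else
    echo "srun not found; would run: R CMD BATCH --no-save --no-restore $INFILE $OUTFILE"
fi
```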
Make sure to have the following file tree before launching the simulation:
├── data_temp
├── my_simu.R
├── outfile
├── recombine.R
└── report
Make sure you are in your home directory ($HOME) and launch the array jobs by running the launch script created above.
Slurm will then return something like:
Submitted batch job 37936807
Submitted batch job 37936808
Submitted batch job 37936809
You can check whether the array tasks are queued or running with:
squeue -u username
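In addition to squeue, one way to check progress is to count the result files written so far: with three values of n and 50 array tasks each, 150 files are expected in data_temp once everything has finished. The helper below is a sketch; the function name `count_results` is hypothetical, and the file-name pattern is taken from my_simu.R.

```shell
# count_results is a hypothetical helper, not part of the original demo.
# It prints the number of result files found in the folder given as argument.
count_results() {
    find "$1" -type f -name 'results_my_simu_*.rda' | wc -l | tr -d ' '
}

expected=$((3 * 50))   # 3 sample sizes x 50 array tasks
found=$(count_results "demo_array_job_slurm/data_temp" 2>/dev/null)
echo "found ${found:-0} of $expected result files"
```

If fewer files than expected remain after the queue is empty, the R reports and logs of the missing tasks are the first place to look.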
Once all simulations have finished, submit the batch file that runs the recombination R script with:
sbatch demo_array_job_slurm/
You should now have a file whose name starts with df_results_demo_array_job_slurm_ followed by a timestamp in the folder demo_array_job_slurm
Well done! 🤓 😎