Containerized pipeline run #334

sigmafelix · 2024-05-29T22:02:44Z

After a long journey of configuring different software versions on HPC (cf. #333), I kept running into numerous, inconsistent errors across nodes and sessions. I am now moving to a fully containerized approach: we use an Apptainer image with recent stable versions of GDAL and its dependencies, then mount the project root to a path inside the container so that the container can see the data files. The container-engine branch holds the ongoing work for that transition. In this approach, we submit a SLURM job that runs an R script calling tar_make() or tar_make_future() with a sufficient number of threads and amount of memory (e.g., 80 threads and 640 GB of memory), then parallelize the workload with crew or future.callr inside the container.

The Apptainer image is based on the geospatial:latest Dockerfile available in the rocker-versioned2 repository (Ubuntu 22.04, GDAL 3.4.1).
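A minimal sketch of the R side of such a job, assuming the targets, future, and future.callr packages are installed in the image. The two-target pipeline and the use of SLURM_CPUS_PER_TASK to size the worker pool are illustrative assumptions, not the project's actual _targets.R or job settings.

```r
# Sketch only: a throwaway two-target pipeline stands in for the real _targets.R.
library(targets)

# Assumption: SLURM exports SLURM_CPUS_PER_TASK inside the job; default to 2 elsewhere.
n_workers <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "2"))

pids <- tar_dir({
  # tar_dir() runs this code in a temporary directory, so the project is untouched.
  tar_script({
    library(targets)
    # The future plan is set in _targets.R, which tar_make_future() sources.
    future::plan(future.callr::callr)
    list(
      tar_target(pid_a, Sys.getpid()),
      tar_target(pid_b, Sys.getpid())
    )
  }, ask = FALSE)
  tar_make_future(workers = n_workers)
  # Return the process IDs recorded by the two targets.
  c(tar_read(pid_a), tar_read(pid_b))
})
print(pids)
```

In an actual run the tar_dir() and tar_script() wrappers would go away, and the script would be executed inside the container from the mounted project root, where the real _targets.R lives.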

crew-based: nested parallelism failed, especially with a future::multicore plan inside a mirai worker. Copilot's answer, quoted in the next four paragraphs, suggested that nested parallelism is not supported in mirai:

Typically, each worker in a parallel computing setup like the one provided by the mirai package in R is expected to use a single core. This is because each worker is usually a separate process, and each process is typically run on a single core.
However, it's important to note that this doesn't mean that the entire computation is limited to a single core. The idea behind parallel computing is to distribute the computation across multiple workers, each running on its own core, to speed up the computation.
If you're using the future package for parallel computing, you can specify the plan to use multiple cores with plan(multicore), plan(multiprocess), or plan(cluster, workers = N), where N is the number of cores.
Nested parallelism, where each worker itself tries to use multiple cores, can be more complex to manage and is not supported by all parallel computing frameworks. If you're trying to use nested parallelism with mirai and future, you might encounter issues if mirai is not designed to handle nested parallelism or if it's not compatible with the parallel backend you're using with future.
future.callr with future::plan(future.callr::callr) works okay, and I confirmed that it launched multiple workers simultaneously.
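One way to check that future.callr really runs workers side by side is to have each future report its process ID while sleeping. This is a rough sketch, assuming the future and future.callr packages are available; the worker count of 2 and the 2-second sleep are arbitrary choices for illustration.

```r
library(future)
library(future.callr)

# Each future is evaluated in its own background R session started via callr.
plan(future.callr::callr, workers = 2)

start <- Sys.time()
# Both futures are launched before either finishes, because creating a future
# does not block while a worker slot is free.
fs <- lapply(1:2, function(i) {
  future({
    Sys.sleep(2)
    Sys.getpid()
  })
})
worker_pids <- unlist(value(fs))
elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))

# Reset to sequential evaluation so later code is unaffected.
plan(sequential)

print(worker_pids)
print(elapsed)
```

Two distinct process IDs, neither equal to the main session's Sys.getpid(), together with an elapsed time well below the sum of the sleeps, indicate that the workers ran concurrently.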

A very strange behavior showed up in vector operations with this approach: the intersection between the unique sites and the Ecoregion polygons returned a different number of results (1096 in the triton run, 1051 in the Apptainer run). I attempted to repair the Ecoregion polygons with terra::makeValid() or terra::buffer(x, width=0), to no avail.
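A small diagnostic sketch for this kind of discrepancy, assuming terra is installed in both environments. The geometries below are synthetic stand-ins (one valid square and one self-intersecting polygon), not the actual sites or Ecoregion data; the idea is to record library versions, check validity before and after repair, and count polygon matches per site so that the two runs can be compared site by site.

```r
library(terra)

# Library versions are worth comparing first between the two environments.
print(terra::gdal(lib = "all"))

# Synthetic polygons: a valid square and a self-intersecting "bowtie".
polys <- terra::vect(c(
  "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))",
  "POLYGON ((20 0, 30 10, 30 0, 20 10, 20 0))"
))
polys$region <- c("A", "B")

# Synthetic sites: one in each polygon and one outside both.
sites <- terra::vect(cbind(x = c(5, 22, 50), y = c(5, 5, 50)), type = "points")
sites$site_id <- 1:3

valid_before <- terra::is.valid(polys)
fixed <- terra::makeValid(polys)
valid_after <- terra::is.valid(fixed)

# Number of polygons each site intersects; 0 or more than 1 changes the row count.
n_hits <- rowSums(terra::relate(sites, fixed, "intersects"))
matched <- terra::intersect(sites, fixed)

print(valid_before)
print(n_hits)
print(nrow(matched))
```

Running the same per-site count in both environments and comparing the site IDs with differing counts would narrow the problem down to specific geometries rather than a total.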

I am still investigating these issues and trying to figure out the exact cause; this work is taking much more effort than I expected, and it is getting more complex as time goes on.

sigmafelix · 2024-06-03T13:31:45Z

The pipeline runs okay with the custom-built GDAL and R packages on GEO. Further investigation of the unusual behavior is on hold.

kyle-messier · 2024-06-03T13:38:01Z

Thanks @sigmafelix

sigmafelix · 2024-07-23T17:34:58Z

The renv experiment still needs to sort out GitHub packages that go undetected, such as beethoven and amadeus. The hash, repository URL, and other properties are not populated in the renv.lock file when renv is initialized. I will investigate this issue thoroughly.
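To list which records in a lockfile are incomplete, something along these lines may help. It is a sketch that assumes the jsonlite package is available; the lockfile fragment is made up for illustration, and pkgA and pkgB are hypothetical package names rather than real entries.

```r
library(jsonlite)

# Made-up lockfile fragment; pkgA and pkgB are hypothetical packages.
lock_json <- '{
  "Packages": {
    "pkgA": {"Package": "pkgA", "Version": "1.0.0", "Source": "Repository",
             "Repository": "CRAN", "Hash": "abc123"},
    "pkgB": {"Package": "pkgB", "Version": "0.1.0"}
  }
}'
lock_path <- tempfile(fileext = ".lock")
writeLines(lock_json, lock_path)

lock <- jsonlite::fromJSON(lock_path, simplifyVector = FALSE)

# Flag records with no hash, or with neither a repository nor a remote type.
incomplete <- Filter(function(rec) {
  is.null(rec$Hash) || (is.null(rec$Repository) && is.null(rec$RemoteType))
}, lock$Packages)

print(names(incomplete))
```

Pointing lock_path at the project's own renv.lock instead of the temporary file would list the packages whose records need attention.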

sigmafelix · 2024-08-19T03:01:16Z

After 0.4.0 is merged into main, I will update container-engine to match all updates that are not container-related.

sigmafelix closed this as completed Jun 13, 2024

sigmafelix reopened this Jul 23, 2024

sigmafelix self-assigned this Jul 23, 2024
