diff --git a/.env.TEMPLATE b/.env.TEMPLATE index 62f36ff..d93b2be 100644 --- a/.env.TEMPLATE +++ b/.env.TEMPLATE @@ -11,7 +11,10 @@ # specified environment DEFAULT_ENV=dev -# if 0, doesn't open a browser to the frontend web app on a normal stack launch +# registry in which to store the images (uses dockerhub if unspecified) +REGISTRY_PREFIX="us-central1-docker.pkg.dev/cuhealthai-foundations/jravilab-public" + +# if 0, doesn't open a browser to the frontend webapp on a normal stack launch DO_OPEN_BROWSER=1 # database (postgres) @@ -19,3 +22,18 @@ POSTGRES_USER=molevolvr POSTGRES_PASSWORD= POSTGRES_DB=molevolvr POSTGRES_HOST=db-${DEFAULT_ENV} + +# slurm accounting database (mariadb) +MARIADB_ROOT_PASSWORD= +MARIADB_USER=slurmdbd +MARIADB_PASSWORD= +MARIADB_DATABASE=slurm_acct_db +MARIADB_HOST=accounting-${DEFAULT_ENV} +MARIADB_PORT=3306 + +# slurm-specific vars +CLUSTER_NAME=molevolvr-${DEFAULT_ENV} +SLURM_MASTER=master-${DEFAULT_ENV} # who's running slurmctld +SLURM_DBD_HOST=master-${DEFAULT_ENV} # who's running slurmdbd +SLURM_WORKER=worker-${DEFAULT_ENV} # who's running slurmd +SLURM_CPUS=10 # how many cpus to allocate on the worker node diff --git a/backend/README.md b/backend/README.md index 4cc7858..d08f0cc 100644 --- a/backend/README.md +++ b/backend/README.md @@ -1,29 +1,119 @@ # MolEvolvR Backend -The backend is implemented as a RESTful API over the following entities: - -- `User`: Represents a user of the system. At the moment logins aren't -required, so all regular users are the special "Anonymous" user. Admins -have individual accounts. -- `Analysis`: Represents an analysis submitted by a user. Each analysis has a unique ID -and is associated with a user. analyses contain the following sub-entities: - - `Submission`: Represents the submission of a Analysis, e.g. the data - itself as well the submission's parameters (both selected by the +The backend is implemented as a RESTful API. It currently provides endpoints for +just the `analysis` entity, but will be expanded to include other entities as +well. + +## Usage + +Run the `launch_api.sh` script to start API server in a hot-reloading development mode. +The server will run on port 9050, unless the env var `API_PORT` is set to another +value. Once it's running, you can access it at http://localhost:9050. + +If the env var `USE_SLURM` is equal to 1, the script will create a basic SLURM +configuration and then launch `munge`, a client used to authenticate to the +SLURM cluster. The template that configures the backend's connection to SLURM +can be found at `./cluster_config/slurm.conf.template`. + +The script then applies any outstanding database migrations via +[atlas](https://github.com/ariga/atlas). Finally the API server is started by +executing the `entrypoint.R` script via +[drip](https://github.com/siegerts/drip), which restarts the server whenever +there are changes to the code. + +*(Side note: the entrypoint contains a bit of custom logic to +defer actually launching the server until the port it listens on is free, since +drip doesn't cleanly shut down the old instance of the server.)* + +## Implementation + +The backend is implemented in [Plumber](https://www.rplumber.io/index.html), a +package for R that allows for the creation of RESTful APIs. The API is defined +in the `api/plumber.R` file, which defines the router and some shared metadata +routes. The rest of the routes are brought in from the `endpoints/` directory. 
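+
+For readers unfamiliar with Plumber's annotation-based routing, here is a
+minimal sketch of the pattern the `endpoints/` modules follow: a module is a
+plain R file whose functions are annotated with `#*` comments, and the router
+mounts each module under a path prefix. The module name `endpoints/example.R`
+and the `/example` prefix are hypothetical, and the sketch uses `library()`
+rather than the `box::use()` style this repo follows; see `endpoints/analyses.R`
+and `api/plumber.R` later in this diff for the real implementation.
+
+```r
+# endpoints/example.R (hypothetical): plain functions annotated with #* comments
+# become routes when the file is parsed by Plumber
+
+#* Echo back a message
+#* @param msg:str The message to echo back
+#* @get /echo
+echo <- function(msg = "") {
+  list(message = paste0("echo: ", msg))
+}
+```
+
+The aggregator then mounts each endpoint file under its own prefix, roughly:
+
+```r
+# simplified stand-in for api/plumber.R
+library(plumber)
+
+pr() %>%
+  pr_mount("/example", pr("./endpoints/example.R")) %>%
+  pr_run(port = as.integer(Sys.getenv("API_PORT", "9050")))
+```
+
+Requesting `GET /example/echo?msg=hi` against the running sketch returns the
+echoed message as JSON.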
+
+Currently implemented endpoints:
+- `POST /analyses`: Create a new analysis
+- `GET /analyses`: Get all analyses
+- `GET /analyses/:id`: Get a specific analysis by its ID
+- `GET /analyses/:id/status`: Get just the status field for an analysis by its ID
+
+*(TBC: more comprehensive docs; see the [Swagger docs](http://localhost:9050/__docs__/) for now)*
+
+## Database Schema
+
+The backend uses a PostgreSQL database to store analyses. The database's schema
+is managed by [atlas](https://github.com/ariga/atlas); you can find the current
+schema definition at `./schema/schema.pg.hcl`. After changing the schema, you
+can create a "migration", i.e. a set of SQL statements that will bring the
+database up to date with the new schema, by running `./schema/makemigration.sh
+`; if all is well with the schema, the new migration will be put in
+`./schema/migrations/`.
+
+Any pending migrations are applied automatically when the backend starts up, but
+you can manually apply new migrations by running `./schema/apply.sh`.
+
+## Testing
+
+You can run the tests for the backend by running the `run_tests.sh` script. The
+script will recursively search for all files with the pattern `test_*.R` in the
+`tests/` directory and run them. Tests are written using the
+[testthat](https://testthat.r-lib.org/) package.
+
+Note that the tests currently depend on the stack's services being available, so
+you should run the tests from within the backend container after having started
+the stack normally. An easy way to do that is to execute `./run_stack.sh shell`
+in the repo root, which will give you an interactive shell in the backend
+container. Eventually, we'll have them run in their own environment, which the
+`run_tests.sh` script will likely orchestrate.
+
+## Implementation Details
+
+### Domain Entities
+
+*NOTE: the backend is, as of now, a work in progress, so expect this to change.*
+
+The backend includes, or will include, the following entities:
+
+- `User`: Represents a user of the system. At the moment logins aren't required,
+so all regular users are the special "Anonymous" user. Admins have individual
+accounts.
+- `Analysis`: Represents an analysis submitted by a user. Each analysis has a
+unique ID and is associated with a user. Analyses contain the following
+sub-entities:
+  - `AnalysisSubmission`: Represents the submission of an Analysis, e.g. the
+    data itself as well as the submission's parameters (both selected by the
     user and supplied by the system).
+  - `AnalysisStatus`: Represents the status of an Analysis. Each Analysis has a
+    status associated with it, which is updated as the Analysis proceeds through
+    its processing stages.
+  - `AnalysisResult`: Represents the result of an Analysis.
-- `Cluster`: Represents the status of the overall cluster, including
-how many analyses have been completed, how many are in the queue,
-and other statistics related to the processing of analyses.
+- `Queue`: Represents the status of processing analyses, including how many
+analyses have been completed, how many are in the queue, and other statistics.
+- `System`: Represents the system as a whole, including the version of the
+backend, the version of the frontend, and other metadata about the system.
+Includes runtime statistics about the execution environment as well, such as RAM
+and CPU usage.
Includes cluster information, too, such as node uptime and +health. -## Implementation +### Job Processing + +*NOTE: we use the term "job" here to indicate any asynchronous task that the +backend needs to perform outside of the request-response cycle. It's not related +to the app domain's terminology of a "job" (i.e. an analysis).* -The backend is implemented in Plumber, a package for R that allows for the -creation of RESTful APIs. The API is defined in the `api/router.R` file, which -contains the endpoints for the API. Supporting files are found in -`api/resources/`. +The backend makes use of +[future.batchtools](https://future.batchtools.futureverse.org/), an extension +that adds [futures](https://future.futureverse.org/) support to +[batchtools](https://mllg.github.io/batchtools/index.html), a package for +processing asynchronous jobs. The package provides support for many +job-processing systems, including +[SLURM](https://slurm.schedmd.com/documentation.html); more details on +alternative systems can be found in the [`batchtools` package +documentation](https://mllg.github.io/batchtools/articles/batchtools.html). -The API is then run using the `launch_api.R` file, which starts the Plumber -server. +In our case, we use SLURM; `batchtools` basically wraps SLURM's `sbatch` command +and handles producing a job script for an R callable, submitting the script to +the cluster for execution, and collecting the results to be returned to R. The +template for the job submission script can be found at +`./cluster_config/slurm.tmpl`. diff --git a/backend/api/cluster.R b/backend/api/cluster.R new file mode 100644 index 0000000..cfec29d --- /dev/null +++ b/backend/api/cluster.R @@ -0,0 +1,33 @@ +# contains shared state for interacting with the job dispatch system + +box::use( + batchtools[makeRegistry], + future.batchtools[...], + future[plan, future, value] +) + +.on_load <- function(ns) { + options(future.cache.path = "/opt/shared-jobs/.future", future.delete = TRUE) + + # create a registry + dir.create("/opt/shared-jobs/jobs-scratch", recursive = TRUE, showWarnings = FALSE) + # reg <- makeRegistry(file.dir = NA, work.dir = "/opt/shared-jobs/jobs-scratch") + # call plan() + plan( + batchtools_slurm, + template = "/app/cluster_config/slurm.tmpl", + resources = list(nodes = 1, cpus = 1, walltime=2700, ncpus=1, memory=1000) + ) +} + +#' Takes in a block of code and runs it asynchronously, returning the future +#' @param callable a function that will be run asynchronously in a slurm job +#' @param work.dir the directory to run the code in, which should be visible to worker nodes +#' @return a future object representing the asynchronous job +dispatch <- function(callable, work.dir="/opt/shared-jobs/jobs-scratch") { + # ensure we run jobs in a place where slurm nodes can access them, too + setwd(work.dir) + future(callable()) +} + +box::export(dispatch) diff --git a/backend/api/dispatch/submit.R b/backend/api/dispatch/submit.R new file mode 100644 index 0000000..8e8833f --- /dev/null +++ b/backend/api/dispatch/submit.R @@ -0,0 +1,56 @@ +box::use( + analyses = api/models/analyses, + api/cluster[dispatch] +) + +#' Dispatch an analysis, i.e. 
create a record for it in the database and submit +#' it to the cluster for processing +#' @param name the name of the analysis +#' @param type the type of the analysis +#' @return the id of the new analysis +dispatchAnalysis <- function(name, type) { + # create the analysis record + analysis_id <- analyses$db_submit_analysis(name, type) + + # print to the error log that we're dispatching this + cat("Dispatching analysis", analysis_id, "\n") + + # dispatch the analysis async (to slurm, or wherever) + promise <- dispatch(function() { + + tryCatch({ + # do the analysis + analyses$db_update_analysis_status(analysis_id, "analyzing") + + # FIXME: implement calls to the molevolvr package to perform the + # analysis. we may fire off additional 'dispatch()' calls if we + # need to parallelize things. + + # --- begin testing section which should be removed --- + + # for now, just do a "task" + Sys.sleep(1) # pretend we're doing something + + # if type is "break", raise an error to test the handler + if (type == "break") { + stop("test error") + } + + # --- end testing section --- + + # finalize when we're done + analyses$db_update_analysis_status(analysis_id, "complete") + }, error = function(e) { + # on error, log the error and update the status + analyses$db_update_analysis_status(analysis_id, "error", reason=e$message) + cat("Error in analysis ", analysis_id, ": ", e$message, "\n") + flush() + }) + + cat("Analysis", analysis_id, " completed\n") + }) + + return(analysis_id) +} + +box::export(dispatchAnalysis) diff --git a/backend/api/endpoints/analyses.R b/backend/api/endpoints/analyses.R index f9062fc..b9818de 100644 --- a/backend/api/endpoints/analyses.R +++ b/backend/api/endpoints/analyses.R @@ -1,11 +1,13 @@ # endpoints for submitting and checking information about analyses. # included by the router aggregator in ./plumber.R; all these endpoints are -# prefixed with /analysis/ by the aggregator. +# prefixed with /analyses/ by the aggregator. box::use( analyses = api/models/analyses, + api/dispatch/submit[dispatchAnalysis], + api/helpers/responses[api_404_if_empty], tibble[tibble], - dplyr[select, any_of, mutate], + dplyr[select, any_of, mutate, pull], dbplyr[`%>%`] ) @@ -18,6 +20,10 @@ box::use( analysis_list <- function() { result <- analyses$db_get_analyses() + # NOTE: this is 'postprocessing' is required when jsonlite's force param is + # FALSE, because it can't figure out how to serialize the types otherwise. + # while we just set force=TRUE now, i don't know all the implications of that + # choice, so i'll leave this code here in case we need it. # postprocess types in the result # result <- result %>% # mutate( @@ -32,30 +38,48 @@ analysis_list <- function() { #* @tag Analyses #* @serializer jsonExt #* @get //status -analysis_status <- function(id) { - result <- analyses$db_get_analysis_by_id(id) - result$status +#* @response 404 error_message="Analysis with id '...' not found" +analysis_status <- function(id, res) { + api_404_if_empty( + analyses$db_get_analysis_by_id(id) %>% pull(status), res, + error_message=paste0("Analysis with id '", id, "' not found") + ) } #* Query the database for an analysis's complete information. 
#* @tag Analyses -#* @serializer jsonExt +#* @serializer jsonExt list(auto_unbox=TRUE) #* @get / -analysis_by_id <- function(id){ - result <- analyses$db_get_analysis_by_id(id) - # result is a tibble with one row, so just - # return that row rather than the entire tibble - result +#* @response 200 schema=analysis +#* @response 404 error_message="Analysis with id '...' not found" +analysis_by_id <- function(id, res) { + # below we return the analysis object; we have to unbox it again + # because auto_unbox only unboxes length-1 lists and vectors, not + # dataframes + api_404_if_empty( + jsonlite::unbox(analyses$db_get_analysis_by_id(id)), + res, error_message=paste0("Analysis with id '", id, "' not found") + ) } #* Submit a new MolEvolvR analysis, returning the analysis ID #* @tag Analyses #* @serializer jsonExt #* @post / +#* @param name:str A friendly name for the analysis chosen by the user +#* @param type:str Type of the analysis (e.g., "FASTA") analysis_submit <- function(name, type) { - # submit the analysis - result <- analyses$db_submit_analysis(name, type) - # the result is a scalar in a vector, so just return the scalar - # result[[1]] + # submits the analysis, which handles: + # - inserting the analysis into the database + # - dispatching the analysis to the cluster + # - returning the analysis ID + analysis_id <- dispatchAnalysis(name, type) + + # NOTE: unboxing (again?) gets it return a single string rather than a list + # with a string in it. while it works, it's a hack, and i should figure out + # how to make the serializer do this for me. + return( + jsonlite::unbox(analyses$db_get_analysis_by_id(analysis_id)) + ) } diff --git a/backend/api/support/custom_serializers.R b/backend/api/helpers/custom_serializers.R similarity index 98% rename from backend/api/support/custom_serializers.R rename to backend/api/helpers/custom_serializers.R index 1b7b443..c612a4d 100644 --- a/backend/api/support/custom_serializers.R +++ b/backend/api/helpers/custom_serializers.R @@ -4,7 +4,7 @@ box::use( plumber[register_serializer, serializer_content_type], - api/support/string_helpers[inline_str_list] + api/helpers/string_helpers[inline_str_list] ) #' Register custom serializers, e.g. 
for JSON with specific defaults diff --git a/backend/api/helpers/responses.R b/backend/api/helpers/responses.R new file mode 100644 index 0000000..fbf5642 --- /dev/null +++ b/backend/api/helpers/responses.R @@ -0,0 +1,13 @@ +#' Helpers for returning error responses from the API + +api_404_if_empty <- function(result, res, error_message="Not found") { + if (isTRUE(nrow(result) == 0 || is.null(result) || length(result) == 0)) { + cat("Returning 404\n") + res$status <- 404 + return(error_message) + } + + return(result) +} + +box::export(api_404_if_empty) diff --git a/backend/api/support/string_helpers.R b/backend/api/helpers/string_helpers.R similarity index 100% rename from backend/api/support/string_helpers.R rename to backend/api/helpers/string_helpers.R diff --git a/backend/api/models/analyses.R b/backend/api/models/analyses.R index 6c11fa6..4f99080 100644 --- a/backend/api/models/analyses.R +++ b/backend/api/models/analyses.R @@ -4,6 +4,13 @@ box::use( api/db[getCon, insert_get_id] ) +statuses <- list( + submitted="submitted", + analyzing="analyzing", + complete="complete", + error="error" +) + #' submit a new analysis, which starts in the "submitted" state #' @param name the name of the analysis #' @param type the type of the analysis @@ -24,6 +31,42 @@ db_submit_analysis <- function(name, type, con=NULL) { return(insert_get_id("analyses", new_entry, con=con)) } +#' update an analysis' status field +#' @param id the id of the analysis to update +#' @param status the new status of the analysis +#' @export +#' @return the number of affected rows (typically 1, unless the analysis doesn't exist) +db_update_analysis_status <- function(id, status, reason=NULL, con=NULL) { + if (is.null(con)) { + con <- getCon() + on.exit(DBI::dbDisconnect(con)) + } + + # check that status is in statuses + if (!(status %in% names(statuses))) { + stop("status must be one of: ", paste(names(statuses), collapse=", ")) + } + + # check that it's not 'submitted', since we can't revert to that status + if (status == statuses$submitted) { + stop("status cannot be set to 'submitted' after creation") + } + + # update the record + if (status == statuses$analyzing) { + DBI::dbSendQuery(con, "UPDATE analyses SET status = $1, started = now() WHERE id = $2", params = list(status, id)) + } + else if (status == statuses$complete) { + DBI::dbSendQuery(con, "UPDATE analyses SET status = $1, completed = now() WHERE id = $2", params = list(status, id)) + } + else if (status == statuses$error) { + DBI::dbSendQuery(con, "UPDATE analyses SET status = $1, reason = $2, completed = now() WHERE id = $3", params = list(status, reason, id)) + } + else { + DBI::dbSendQuery(con, "UPDATE analyses SET status = $1 WHERE id = $2", params = list(status, id)) + } +} + #' query the 'analyses' table using dbplyr for all analyses #' @return a data frame containing all analyses #' @export diff --git a/backend/api/plumber.R b/backend/api/plumber.R index d55cf09..45ed3c0 100644 --- a/backend/api/plumber.R +++ b/backend/api/plumber.R @@ -2,7 +2,7 @@ box::use( plumber[...], - api/support/custom_serializers[setup_custom_serializers] + api/helpers/custom_serializers[setup_custom_serializers] ) # bring in custom serializers @@ -33,12 +33,12 @@ index <- function() { ) } -# Define a custom error handler that includes a traceback +# define a custom error handler that includes a traceback custom_error_handler <- function(req, res, err) { - # Capture the traceback + # capture the traceback traceback <- paste(capture.output(traceback()), collapse = "\n") - # Set 
the response status code and body + # set the response status code and body res$status <- 500 list( error = err$message, @@ -49,7 +49,7 @@ custom_error_handler <- function(req, res, err) { #' @plumber function(pr) { pr %>% - pr_set_debug(TRUE) %>% + pr_set_debug(Sys.getenv("PLUMBER_DEBUG", unset="0") == "1") %>% pr_set_error(custom_error_handler) %>% pr_mount("/analyses", pr("./endpoints/analyses.R")) %>% pr_mount("/stats", pr("./endpoints/stats.R")) diff --git a/backend/api/tests/testthat/dispatch/test-slurm-submit.R b/backend/api/tests/testthat/dispatch/test-slurm-submit.R new file mode 100644 index 0000000..e4be8aa --- /dev/null +++ b/backend/api/tests/testthat/dispatch/test-slurm-submit.R @@ -0,0 +1,68 @@ +test_that("slurm jobs run on the cluster", { + # skip the test if the env var USE_SLURM != 1 + skip_if(!identical(Sys.getenv("USE_SLURM"), "1"), message="SLURM disabled, skipping SLURM tests") + + options(box.path = "/app") + + box::use( + api/cluster[dispatch], + future[value] + ) + + # test that the dispatch function works + job <- dispatch(function() { + "done" + }) + + expect_equal(value(job), "done") + + # test that we can run a second job, too + job2 <- dispatch(function() { + "done, again" + }) + + expect_equal(value(job2), "done, again") +}) + +test_that("multiple slurm jobs run concurrently on the cluster", { + # skip the test if the env var USE_SLURM != 1 + skip_if(!identical(Sys.getenv("USE_SLURM"), "1"), message="SLURM disabled, skipping SLURM tests") + + options(box.path = "/app") + + box::use( + api/cluster[dispatch], + future[value] + ) + + # fire off job 1 and 2 + job <- dispatch(function() { "done" }) + job2 <- dispatch(function() { "done, again" }) + + # collect the results of each job + expect_equal(value(job), "done") + expect_equal(value(job2), "done, again") +}) + +test_that("nested jobs complete as expected", { + # skip the test if the env var USE_SLURM != 1 + skip_if(!identical(Sys.getenv("USE_SLURM"), "1"), message="SLURM disabled, skipping SLURM tests") + + options(box.path = "/app") + + box::use( + api/cluster[dispatch], + future[value] + ) + + # create a job which contains a second job + job <- dispatch(function() { + inside_job <- dispatch(function() { + "it was an inside job" + }) + value(inside_job) + }) + + # collect the results of each job + expect_equal(value(job), "it was an inside job") +}) diff --git a/backend/api/tests/testthat/models/test-analysis.R b/backend/api/tests/testthat/models/test-analysis.R index 4ddc3bc..e543314 100644 --- a/backend/api/tests/testthat/models/test-analysis.R +++ b/backend/api/tests/testthat/models/test-analysis.R @@ -29,7 +29,7 @@ test_that("submitted analyses can be retrieved and matches submission", { box::use( api/db[getCon], - api/models/analyses[db_submit_analysis, db_get_analyses] + api/models/analyses[db_submit_analysis, db_get_analyses, db_get_analysis_by_id] ) con <- getCon() @@ -51,6 +51,11 @@ test_that("submitted analyses can be retrieved and matches submission", { expect_equal( last_analysis$type, "there" ) expect_equal( toString(last_analysis$status), "submitted" ) + # wait for a bit, then quwry and check the status again + Sys.sleep(10) + result <- db_get_analysis_by_id(code, con=con) + expect_equal( toString(result$status), "submitted" ) + # and revert so we don't affect the database DBI::dbRollback(con) DBI::dbDisconnect(con) diff --git a/backend/cluster_config/slurm.conf.template b/backend/cluster_config/slurm.conf.template new file mode 100644 index 0000000..cfad120 --- /dev/null +++ 
b/backend/cluster_config/slurm.conf.template @@ -0,0 +1,3 @@ +# this is a simplified slurm.conf for clients of the cluster +ClusterName=${CLUSTER_NAME} +SlurmctldHost=${SLURM_MASTER} diff --git a/backend/cluster_config/slurm.tmpl b/backend/cluster_config/slurm.tmpl new file mode 100644 index 0000000..9c49c26 --- /dev/null +++ b/backend/cluster_config/slurm.tmpl @@ -0,0 +1,51 @@ +#!/bin/bash + +# from https://raw.githubusercontent.com/mllg/batchtools/master/inst/templates/slurm-simple.tmpl + +## Job Resource Interface Definition +## +## ntasks [integer(1)]: Number of required tasks, +## Set larger than 1 if you want to further parallelize +## with MPI within your job. +## ncpus [integer(1)]: Number of required cpus per task, +## Set larger than 1 if you want to further parallelize +## with multicore/parallel within each task. +## walltime [integer(1)]: Walltime for this job, in seconds. +## Must be at least 60 seconds for Slurm to work properly. +## memory [integer(1)]: Memory in megabytes for each cpu. +## Must be at least 100 (when I tried lower values my +## jobs did not start at all). +## +## Default resources can be set in your .batchtools.conf.R by defining the variable +## 'default.resources' as a named list. + +<% +# relative paths are not handled well by Slurm +log.file = fs::path_expand(log.file) +-%> + + +#SBATCH --job-name=<%= job.name %> +#SBATCH --output=<%= log.file %> +#SBATCH --error=<%= log.file %> +## #SBATCH --time=<%= ceiling(resources$walltime / 60) %> +#SBATCH --ntasks=1 +#SBATCH --cpus-per-task=<%= resources$ncpus %> +## #SBATCH --mem-per-cpu=<%= resources$memory %> +<%= if (!is.null(resources$partition)) sprintf(paste0("#SBATCH --partition='", resources$partition, "'")) %> +<%= if (array.jobs) sprintf("#SBATCH --array=1-%i", nrow(jobs)) else "" %> + +## Initialize work environment like +## source /etc/profile +## module add ... + +## Export value of DEBUGME environemnt var to slave +export DEBUGME=<%= Sys.getenv("DEBUGME") %> + +<%= sprintf("export OMP_NUM_THREADS=%i", resources$omp.threads) -%> +<%= sprintf("export OPENBLAS_NUM_THREADS=%i", resources$blas.threads) -%> +<%= sprintf("export MKL_NUM_THREADS=%i", resources$blas.threads) -%> + +## Run R: +## we merge R output with stdout from SLURM, which gets then logged via --output option +Rscript -e 'batchtools::doJobCollection("<%= uri %>")' diff --git a/backend/docker/Dockerfile b/backend/docker/Dockerfile index c64ec02..8387ed5 100644 --- a/backend/docker/Dockerfile +++ b/backend/docker/Dockerfile @@ -3,7 +3,13 @@ # this Dockerfile should be used with the ./backend/ folder as the context # and ./backend/docker/Dockerfile as the dockerfile -FROM rocker/tidyverse:4.3 +ARG PRIOR_LAYER=backend-base + +# ----------------------------------------------------------------------------- +# --- basic backend +# ----------------------------------------------------------------------------- + +FROM rocker/tidyverse:4.3 AS backend-base # install ccache, a compiler cache RUN apt-get update && apt-get install -y ccache @@ -33,6 +39,90 @@ RUN Rscript /tmp/install.r # RUN --mount=type=cache,target=/usr/local/lib/R/site-library \ # Rscript /tmp/install.r +# ----------------------------------------------------------------------------- +# --- backend w/slurm support +# ----------------------------------------------------------------------------- + +FROM backend-base AS backend-slurm + +# install a more capable envsubst drop-in replacement, i.e. that supports +# default values. 
we use this to generate the actual slurm.conf and other config +# files from templates in the entrypoint script +RUN curl -L -o envsubst \ + "https://github.com/a8m/envsubst/releases/download/v1.2.0/envsubst-$( uname -s )-$( uname -m )" && \ + chmod +x envsubst && \ + mv envsubst /usr/local/bin + +# install slurm client, munge, so we can submit jobs +# NOTE: we would ordinarily install it as a package since it's dramatically +# faster than building from source, but the slurm included in our version of +# ubuntu is *very* old, and the backport sources weren't working for me. leaving +# this here in case we do find a way to install it via apt. +# RUN apt-get install -y slurm-client + +# ------ + +# set up munge user, because the package will create it with the wrong UID if +# it doesn't already exist +ENV MUNGE_UID=981 +RUN groupadd -g $MUNGE_UID munge \ + && useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGE_UID -g munge -s /sbin/nologin munge + +# just do one apt update rather than one per package +RUN apt-get update + +# install the following deps for slurm functions: +# - munge + libmunge: an authentication broker that slurm uses to verify cluster +# entities +# - lua + liblua: so we can run lua scripts in slurm +# - http-parser, json-c, yaml, jwt: build requirements for slurmrestd (yaml, +# jwt add extra funcs) +# - mailutils: adds /bin/mail, so slurm can send notices via email +# - hwloc, libbpf-dev, libdbus-1-dev: enables cgroup support in slurm +# - libpam0g-dev: support for PAM (pluggable auth modules) +# - libreadline-dev: adds readline support to slurm commands +RUN DEBIAN_FRONTEND=noninteractive apt-get install -y \ + munge libmunge-dev \ + lua5.4 liblua5.4-dev \ + libhttp-parser-dev libjson-c-dev libyaml-dev libjwt-dev \ + mailutils \ + hwloc libhwloc-dev \ + libmariadb-dev \ + libbpf-dev libdbus-1-dev \ + libpam0g-dev \ + libreadline-dev + +# ---------------------- +# slurm 24.05.1 from source +# ---------------------- + +# let's try installing slurm from source +RUN apt-get install -y wget gcc make bzip2 \ + && cd /tmp \ + && wget https://download.schedmd.com/slurm/slurm-24.05.1.tar.bz2 \ + && tar -xvf slurm-24.05.1.tar.bz2 \ + && cd slurm-24.05.1 \ + && ./configure \ + --sysconfdir=/etc/slurm/ \ + --enable-slurmd=no --enable-controller=no \ + && make && make install + +# ------ + +# copy slurm config into the image +COPY ./cluster_config/slurm.conf.template /opt/config-templates/slurm.conf.template +RUN mkdir -p /etc/slurm/ + +# instructs the entrypoint to enable slurm +ENV USE_SLURM=1 + + +# ----------------------------------------------------------------------------- +# --- final backend layer: brings in app code, sets up entrypoint +# ----------------------------------------------------------------------------- + +FROM ${PRIOR_LAYER} AS backend-final + WORKDIR /app # copy the app into the image diff --git a/backend/docker/install.R b/backend/docker/install.R index 76c03d7..31e6f1c 100644 --- a/backend/docker/install.R +++ b/backend/docker/install.R @@ -1,11 +1,13 @@ # install packages depended on by the molevolvr API server install.packages( c( - "plumber", # REST API framework - "DBI", # Database interface - "RPostgres", # PostgreSQL-specific impl. for DBI - "dbplyr", # dplyr for databases - "box" # allows R files to be referenced as modules + "plumber", # REST API framework + "DBI", # Database interface + "RPostgres", # PostgreSQL-specific impl. 
for DBI + "dbplyr", # dplyr for databases + "box", # allows R files to be referenced as modules + "R6", # allows us to create python-like classes + "future.batchtools" # allows us to run async jobs on a variety of backends ), Ncpus = 6 ) diff --git a/backend/launch_api.sh b/backend/launch_api.sh index 2c50b17..4a0602e 100755 --- a/backend/launch_api.sh +++ b/backend/launch_api.sh @@ -1,5 +1,15 @@ #!/bin/bash +# write slurm.conf from template +if [ "${USE_SLURM}" = "1" ]; then + echo "* Slurm enabled, configuring..." + + # write slurm config from template + envsubst < /opt/config-templates/slurm.conf.template > /etc/slurm/slurm.conf + # ensure munge is running so we can auth to the cluster + service munge start +fi + # run schema migrations via ./schema/apply.sh ( echo "* Running schema migrations, if any are available..." diff --git a/backend/run_tests.sh b/backend/run_tests.sh index 6ff3b87..b03e553 100755 --- a/backend/run_tests.sh +++ b/backend/run_tests.sh @@ -5,8 +5,6 @@ # it's expected to run into error messages like 'rlang::abort("No test files # found")' when there aren't any tests to run in the target directory. -cd /app/api - # set some colors and styles RED='\033[0;31m' GREEN='\033[0;32m' @@ -23,10 +21,31 @@ BOLD='\033[1m' # and our code is organized into subdirectories, we'll recursively # descend into each subdirectory and run tests there -# find all directories in tests/testthat, including tests/testthat itself -for dir in $( find tests/testthat -type d ); do - # run tests in each directory +trap "echo 'Exited test harness via interrupt'; exit;" SIGINT SIGTERM + +function run_test_in_dir() { + dir=$1 echo -e "${GRAY}* Running tests in ${WHITE}${BOLD}$dir${GRAY}...${NC}" Rscript -e "testthat::test_dir('$dir')" echo "" +} + +# if we're given a specific directory to test, only test that directory +if [ $# -eq 1 ]; then + # run tests in the specified directory + run_test_in_dir $1 + # and exit right after + exit 0 +fi + +# find all directories in tests/testthat, including tests/testthat itself +for dir in $( find /app/api/tests/testthat -type d ); do + # if the folder contains no R files, skip it + if [ $(ls -1 $dir/*.R 2>/dev/null | wc -l) -eq 0 ]; then + # echo -e "${WHITE}${BOLD}$dir${RED} contains no R files, continuing...${NC}" + continue + fi + + # run tests in each directory + run_test_in_dir $dir done diff --git a/backend/schema/migrations/20240911014316_added_analyses_reason.sql b/backend/schema/migrations/20240911014316_added_analyses_reason.sql new file mode 100644 index 0000000..b3eeaed --- /dev/null +++ b/backend/schema/migrations/20240911014316_added_analyses_reason.sql @@ -0,0 +1,2 @@ +-- Modify "analyses" table +ALTER TABLE "analyses" ADD COLUMN "reason" text NULL; diff --git a/backend/schema/migrations/atlas.sum b/backend/schema/migrations/atlas.sum index f48afb4..f5a2c41 100644 --- a/backend/schema/migrations/atlas.sum +++ b/backend/schema/migrations/atlas.sum @@ -1,3 +1,4 @@ -h1:OXTcljWPUSyMY9EjrocgMrxW6jfax5+FRXuuPxqxqwE= +h1:Lq+cjAdIufUxSBBwu3Ijxre0r+nFAfwkBByTw600V7c= 20240715182613_install_extensions.sql h1:ZpO46s7ZEJEO2BMoRyz0zNMFb68mMmPbI3m2+RbLOPo= 20240718152036_initial.sql h1:ADusoCeu+pFGr1zpcbvZ0kKFrvvfTAitO3LWPiPbucE= +20240911014316_added_analyses_reason.sql h1:dPwIgtUDUGGYKbjK0M50TJCdrrChHBr/Xy/oeQroxCo= diff --git a/backend/schema/schema.pg.hcl b/backend/schema/schema.pg.hcl index d6e0d3b..e9acbcb 100644 --- a/backend/schema/schema.pg.hcl +++ b/backend/schema/schema.pg.hcl @@ -43,6 +43,10 @@ table "analyses" { null = false default = "submitted" } 
+ column "reason" { + null = true + type = text + } primary_key { columns = [ diff --git a/cluster/Dockerfile b/cluster/Dockerfile new file mode 100644 index 0000000..93b02dd --- /dev/null +++ b/cluster/Dockerfile @@ -0,0 +1,131 @@ +# FROM --platform=linux/amd64 ubuntu:24.04 +FROM --platform=linux/amd64 rocker/tidyverse:4.3 AS slurm-build + + +# ============================================================================ +# === slurm preamble: set up users, get dependencies +# ============================================================================ + +# --- user setup, used by munge and slurm packages +# set up user IDs consistently across runs +ENV MUNGE_UID=981 \ + SLURM_UID=982 \ + WORKER_UID=1005 +RUN groupadd -g $MUNGE_UID munge \ + && useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGE_UID -g munge -s /sbin/nologin munge \ + && groupadd -g $SLURM_UID slurm \ + && useradd -m -c "Slurm workload manager" -d /var/lib/slurm -u $SLURM_UID -g slurm -s /bin/bash slurm \ + && groupadd -g $WORKER_UID worker \ + && useradd -m -c "Workflow user" -d /home/worker -u $WORKER_UID -g worker -s /bin/bash worker + +# just do one apt update rather than one per package +RUN apt-get update + +# install the following deps for slurm functions: +# - munge + libmunge: an authentication broker that slurm uses to verify cluster +# entities +# - lua + liblua: so we can run lua scripts in slurm +# - http-parser, json-c, yaml, jwt: build requirements for slurmrestd (yaml, +# jwt add extra funcs) +# - mailutils: adds /bin/mail, so slurm can send notices via email +# - hwloc, libbpf-dev, libdbus-1-dev: enables cgroup support in slurm +# - libpam0g-dev: support for PAM (pluggable auth modules) +# - libreadline-dev: adds readline support to slurm commands +RUN DEBIAN_FRONTEND=noninteractive apt-get install -y \ + munge libmunge-dev \ + lua5.4 liblua5.4-dev \ + libhttp-parser-dev libjson-c-dev libyaml-dev libjwt-dev \ + mailutils \ + hwloc libhwloc-dev \ + libmariadb-dev \ + libbpf-dev libdbus-1-dev \ + libpam0g-dev \ + libreadline-dev + + +# ============================================================================ +# === build and install slurm +# ============================================================================ + +# ---------------------- +# option 1: slurm 24.05.1 from source +# ---------------------- + +# let's try installing slurm from source +RUN apt-get install -y wget gcc make bzip2 \ + && cd /tmp \ + && wget https://download.schedmd.com/slurm/slurm-24.05.1.tar.bz2 \ + && tar -xvf slurm-24.05.1.tar.bz2 \ + && cd slurm-24.05.1 \ + && ./configure \ + --with-lua \ + --sysconfdir=/etc/slurm/ \ + --with-systemdsystemunitdir=/usr/lib/systemd/system/ \ + && make && make install + +# ---------------------- +# option 2: slurm 21.08.5 from apt +# ---------------------- + +# install slurm from apt +# RUN apt-get install -y slurm-wlm slurmctld slurmd slurmdbd slurmrestd + + +# ============================================================================ +# === install app-level dependencies, e.g. 
R libraries, other tools +# ============================================================================ + +# install dependencies into the image +COPY ./install.R /tmp/install.r +RUN Rscript /tmp/install.r + +# add curl, a useful tool for downloading stuff +# add a helper tool, 'stress', for putting load on the cluster +# docs: https://linux.die.net/man/1/stress +# add a helper tool suite, cgroup-tools, to retrieve container +# resource allocation info within the container +RUN apt-get update && apt-get install -y curl stress cgroup-tools cron + +# install a more capable envsubst drop-in replacement, i.e. that supports +# default values. we use this to generate the actual slurm.conf and other config +# files from templates in the entrypoint script +RUN curl -L -o envsubst \ + "https://github.com/a8m/envsubst/releases/download/v1.2.0/envsubst-$( uname -s )-$( uname -m )" && \ + chmod +x envsubst && \ + mv envsubst /usr/local/bin + + +# ============================================================================ +# === bring in configuration, init scripts; setup entrypoint +# ============================================================================ + +# copy in templates for configuration files +# these will be interpolated with env vars by entrypoint.sh +COPY ./slurm-config/ /opt/templates/ + +# # move init scripts from /tmp/slurm-24.05.1/etc/init.d. to /etc/init.d/ +# # and make them executable +# RUN cp /tmp/slurm-24.05.1/etc/init.d.slurm /etc/init.d/slurm && \ +# cp /tmp/slurm-24.05.1/etc/init.d.slurmdbd /etc/init.d/slurmdbd && \ +# chmod +x /etc/init.d/slurm* + +# copy in system v init scripts, defaults +COPY ./system-v/init.d/ /etc/init.d/ +COPY ./system-v/default/ /etc/default/ + +# make folders slurm expects to see +RUN mkdir -p /var/spool/slurmd /var/spool/slurmctld + +# copy in the entrypoint script and supporting scripts, +# then define it as the entrypoint +COPY ./entrypoint/ /var/slurm-init/ +ENTRYPOINT ["/var/slurm-init/entrypoint.sh"] + +# and keep the container running forever +CMD /bin/bash -c "trap : TERM INT; sleep infinity & wait" + +# ---------------------- +# --- slurm-controller stage +# ---------------------- + +FROM slurm-build AS slurm-controller diff --git a/cluster/entrypoint/entrypoint.sh b/cluster/entrypoint/entrypoint.sh new file mode 100755 index 0000000..5785eb0 --- /dev/null +++ b/cluster/entrypoint/entrypoint.sh @@ -0,0 +1,41 @@ +#!/usr/bin/env bash + +# exit on any error +# set -exuo pipefail + +# bring in common functions +source /var/slurm-init/support/common.sh + +# ------------------------------------------------------------ +# --- generate configuration files from environment, other pre-steps +# ------------------------------------------------------------ + +# investigates the environment to determine what resources are available +# generates slurm.conf, slurmdbd.conf, and other config files +source /var/slurm-init/support/configuration.sh + +# ------------------------------------------------------------ +# --- launch support services used by all nodes +# ------------------------------------------------------------ + +# run services based on the CLUSTER_ROLE env var +source /var/slurm-init/support/services.sh + +# ------------------------------------------------------------ +# --- final process +# ------------------------------------------------------------ + +# if we're the controller, gracefully shut down on exit +# (this is still a WIP; i have to do more research on what +# signals the container gets when it's being shut down) +cleanup() { + 
is_in_role "controller" && ( + echo "Cleaning up..." + scontrol shutdown + sleep 10 + ) +} +is_in_role "controller" && \ +trap 'cleanup' SIGTERM EXIT + +source /var/slurm-init/support/logtail.sh diff --git a/cluster/entrypoint/support/common.sh b/cluster/entrypoint/support/common.sh new file mode 100755 index 0000000..4155b47 --- /dev/null +++ b/cluster/entrypoint/support/common.sh @@ -0,0 +1,12 @@ +#!/usr/bin/env bash + +# contains common functions for the entrypoint scripts + +function is_in_role() { + # returns true if the given role is in the CLUSTER_ROLE env var + # note that the env var consists of comma-delimited strings + local role=$1 + + # [[ "${CLUSTER_ROLE,,}" == *"${role,,}"* ]] + [[ ${CLUSTER_ROLE} =~ (^|,)"$role"(,|$) ]] +} diff --git a/cluster/entrypoint/support/configuration.sh b/cluster/entrypoint/support/configuration.sh new file mode 100755 index 0000000..ed3e918 --- /dev/null +++ b/cluster/entrypoint/support/configuration.sh @@ -0,0 +1,45 @@ +#!/usr/bin/env bash + +# bring in common functions +source /var/slurm-init/support/common.sh + +# the name of our local cluster +export CLUSTER_NAME=${CLUSTER_NAME:-localcluster} + +# determine the cores mapped to slurm here in order to generate +# a correct CPUSpecList value for slurm.conf. +# (note: because the master and worker configs should match, +# we need to allocate the same cores to the master and worker nodes) +ALLOWED_CPUS=$( + grep -i ^cpus_allowed_list /proc/self/status | \ + cut -d':' -f2- | xargs +) +export ALLOWED_CPUS + +# split cpuset into first and last; e.g., '36-96' into 36 and 96 +IFS=- read -r FIRST_CPU LAST_CPU <<< "${ALLOWED_CPUS}" +let TOTAL_CPUS="$LAST_CPU - $FIRST_CPU" +export TOTAL_CPUS + +export FINAL_CPUSPECLIST=$( + seq 0 ${TOTAL_CPUS} | awk '{print $1 * 2}' | paste -sd "," - +) + +let TOTAL_CPU_CNT="$TOTAL_CPUS + 1" +export TOTAL_CPU_CNT + +ls -l /opt/templates/ + +# generate slurm.conf, cgroup.conf from the template files +mkdir -p /etc/slurm/ +envsubst -i /opt/templates/slurm.conf.template -o /etc/slurm/slurm.conf +envsubst -i /opt/templates/cgroup.conf.template -o /etc/slurm/cgroup.conf +envsubst -i /opt/templates/slurmdbd.conf.template -o /etc/slurm/slurmdbd.conf + +# create a few more folders that services expect to exist +mkdir -p /var/spool/slurmctld/ +mkdir -p /sys/fs/cgroup/system.slice + +# slurmdbd insists on its config file being read-writeable by its owner, so +# make that the case now +chmod 0600 /etc/slurm/slurmdbd.conf diff --git a/cluster/entrypoint/support/logtail.sh b/cluster/entrypoint/support/logtail.sh new file mode 100755 index 0000000..42d597f --- /dev/null +++ b/cluster/entrypoint/support/logtail.sh @@ -0,0 +1,31 @@ +#!/usr/bin/env bash + +# this script sets up log tailing; which logs are tailed depends on the node + +echo "" +echo "===================================================================" +echo "=== Slurm setup complete! monitoring logs forever..." +echo "===================================================================" +echo "" + +# the code below gathers logs from the various slurm processes and cats them +# forever to stdout along with whatever command was passed to the container as +# its CMD. + +# first, create a named pipe that we'll use to combine all the +# stdout output we want to show in the docker logs +COMBINED_OUT_FIFO="/var/combined-stdout" +mkfifo ${COMBINED_OUT_FIFO} + +# capture the log with a long-running tail process +# so it gets interleaved into the output from the CMD. 
+# the log we output will depend on the +is_in_role "controller" && ( tail -f /var/log/slurmctld.log > ${COMBINED_OUT_FIFO} ) & +is_in_role "worker" && ( tail -f /var/log/slurmd.log > ${COMBINED_OUT_FIFO} ) & + +# run the container CMD (just sleeping forever by default), writing it +# to the combined out +( exec "$@" > ${COMBINED_OUT_FIFO} ) & + +# tail the combined stdout log forever +cat ${COMBINED_OUT_FIFO} diff --git a/cluster/entrypoint/support/services.sh b/cluster/entrypoint/support/services.sh new file mode 100755 index 0000000..36494fd --- /dev/null +++ b/cluster/entrypoint/support/services.sh @@ -0,0 +1,70 @@ +#!/usr/bin/env bash + +# launches services specific to the type of node, as defined by the +# env var CLUSTER_ROLE; the CLUSTER_ROLE env var consists of a +# comma-separated list of roles, e.g., "controller,worker". +# possible values are: +# - controller: the controller (aka 'master') node, e.g. runs slurmctld and shouldn't run jobs +# - worker: a worker node, e.g. runs slurmd and can run jobs +# - dbd: runs the slurmdbd service + +# bring in common functions +source /var/slurm-init/support/common.sh + +# checks that a service is running, waiting up to 60 seconds for it to start +function wait_for_service() { + local SERVICE_NAME=$1 + local CHECK_CMD=${2:-":"} + + TIMEOUT=60 + INTERVAL=5 + ELAPSED=0 + + while [ ${ELAPSED} -lt ${TIMEOUT} ]; do + if service ${SERVICE_NAME} status; then + echo "${SERVICE_NAME} is running." + return 0 + else + echo "${SERVICE_NAME} is not running. Checking again in ${INTERVAL} seconds..." + + if [ "${CHECK_CMD}" != ":" ]; then + eval ${CHECK_CMD} + fi + + sleep ${INTERVAL} + ELAPSED=$((ELAPSED + INTERVAL)) + fi + done + + echo "Timeout reached. ${SERVICE_NAME} did not start within ${TIMEOUT} seconds." + return 1 +} + +# start common services +service dbus start +service munge start +service cron start + +# first, start slurmdbd and wait for it to be up +is_in_role "dbd" && service slurmdbd start && wait_for_service slurmdbd + +# start services based on our roles +is_in_role "controller" && service slurmctld start + +# check that the services are running +is_in_role "dbd" && ( wait_for_service slurmdbd || cat /var/log/slurmdbd.log ) +is_in_role "controller" && ( wait_for_service slurmctld "service slurmctld start" || cat /var/log/slurmctld.log ) + +# the worker's a little complicated: it can fail if slurmctld hasn't started +# yet, so we'll repeatedly attempt to start it until it does +is_in_role "worker" && ( + while ! sinfo; do + echo "* waiting for slurmctld to start before we start slurmd..." + sleep 5 + done + + # ok, now try to start it + service slurmd start + + wait_for_service slurmd "cat /var/log/slurmd.log" +) diff --git a/cluster/install.R b/cluster/install.R new file mode 100644 index 0000000..e85c198 --- /dev/null +++ b/cluster/install.R @@ -0,0 +1,15 @@ +# install packages needed to run molevolvr jobs +# (this will likely include the molevolvr package and its dependencies, +# as well as libraries we use for reporting to the app database and for +# running jobs on the cluster) +install.packages( + c( + "DBI", # Database interface + "RPostgres", # PostgreSQL-specific impl. 
for DBI + "dbplyr", # dplyr for databases + "box", # allows R files to be referenced as modules + "R6", # allows us to create python-like classes + "future.batchtools" # allows us to run async jobs on a variety of backends + ), + Ncpus = 6 +) diff --git a/cluster/slurm-config/cgroup.conf.template b/cluster/slurm-config/cgroup.conf.template new file mode 100644 index 0000000..3693c72 --- /dev/null +++ b/cluster/slurm-config/cgroup.conf.template @@ -0,0 +1,10 @@ +# CgroupAutomount=no +# CgroupReleaseAgentDir="/etc/slurm/cgroup" +# CgroupMountpoint=/sys/fs/cgroup +ConstrainCores=yes +ConstrainDevices=yes +ConstrainRAMSpace=yes + +# https://fossies.org/linux/slurm/src/plugins/cgroup/v2/cgroup_v2.c +IgnoreSystemd=yes +CgroupPlugin=autodetect \ No newline at end of file diff --git a/cluster/slurm-config/slurm.conf.template b/cluster/slurm-config/slurm.conf.template new file mode 100644 index 0000000..ba3945f --- /dev/null +++ b/cluster/slurm-config/slurm.conf.template @@ -0,0 +1,151 @@ +# slurm.conf file generated by configurator.html. +# Put this file on all nodes of your cluster. +# See the slurm.conf man page for more information. +# +ClusterName=${CLUSTER_NAME} +SlurmctldHost=${SLURM_MASTER} + +DisableRootJobs=NO +EnforcePartLimits=NO + +#Epilog= +#EpilogSlurmctld= +#FirstJobId=1 +#MaxJobId=67043328 +#GresTypes= +#GroupUpdateForce=0 +#GroupUpdateTime=600 +#JobFileAppend=0 +#JobRequeue=1 +#JobSubmitPlugins=lua +#KillOnBadExit=0 +#LaunchType=launch/slurm +#Licenses=foo*4,bar +MailProg=/bin/mail +#MaxJobCount=10000 +#MaxStepCount=40000 +#MaxTasksPerNode=512 +#MpiDefault= +#MpiParams=ports=#-# +#PluginDir= +#PlugStackConfig= +#PrivateData=jobs +ProctrackType=proctrack/linuxproc +#Prolog= +#PrologFlags= +#PrologSlurmctld= +#PropagatePrioProcess=0 +#PropagateResourceLimits= +#PropagateResourceLimitsExcept= +#RebootProgram= +ReturnToService=2 +SlurmctldPidFile=/var/run/slurmctld.pid +SlurmctldPort=6817 +SlurmdPidFile=/var/run/slurmd.pid +SlurmdPort=6818 +SlurmdSpoolDir=/var/spool/slurmd +SlurmUser=root +SlurmdUser=root +#SrunEpilog= +#SrunProlog= +StateSaveLocation=/var/spool/slurmctld +#SwitchType= +#TaskEpilog= +TaskPlugin=task/affinity # ,task/cgroup +TaskPluginParam=Verbose +#TaskProlog= +#TopologyPlugin=topology/tree +#TmpFS=/tmp +#TrackWCKey=no +#TreeWidth= +#UnkillableStepProgram= +#UsePAM=0 +# +# +# TIMERS +#BatchStartTimeout=10 +#CompleteWait=0 +#EpilogMsgTime=2000 +#GetEnvTimeout=2 +#HealthCheckInterval=0 +#HealthCheckProgram= +InactiveLimit=0 +KillWait=30 +#MessageTimeout=10 +#ResvOverRun=0 +MinJobAge=300 +#OverTimeLimit=0 +SlurmctldTimeout=900 +SlurmdTimeout=0 +#UnkillableStepTimeout=60 +#VSizeFactor=0 +Waittime=0 +UnkillableStepTimeout=300 # copied from old slurm conf +# +# +# SCHEDULING +#DefMemPerCPU=0 +#MaxMemPerCPU=0 +#SchedulerTimeSlice=30 +SchedulerType=sched/backfill +SelectType=select/cons_tres +SelectTypeParameters=CR_Core +# +# +# JOB PRIORITY +#PriorityFlags= +#PriorityType=priority/multifactor +#PriorityDecayHalfLife= +#PriorityCalcPeriod= +#PriorityFavorSmall= +#PriorityMaxAge= +#PriorityUsageResetPeriod= +#PriorityWeightAge= +#PriorityWeightFairshare= +#PriorityWeightJobSize= +#PriorityWeightPartition= +#PriorityWeightQOS= +# +# +# LOGGING AND ACCOUNTING +#AccountingStorageEnforce=0 +#AccountingStorageHost= +#AccountingStoragePass= +#AccountingStoragePort= +AccountingStorageType=accounting_storage/slurmdbd +#AccountingStorageUser= +#AccountingStoreFlags= +JobCompType=jobcomp/mysql +JobCompHost=${MARIADB_HOST} +JobCompPort=${MARIADB_PORT} +JobCompLoc=${MARIADB_DATABASE} 
+JobCompUser=${MARIADB_USER} +JobCompPass=${MARIADB_PASSWORD} +#JobCompParams= +#JobContainerType= +JobAcctGatherFrequency=30 +JobAcctGatherType=jobacct_gather/linux +SlurmctldDebug=info +SlurmctldLogFile=/var/log/slurmctld.log +SlurmdDebug=info +SlurmdLogFile=/var/log/slurmd.log +#SlurmSchedLogFile= +#SlurmSchedLogLevel= +#DebugFlags= +# +# +# POWER SAVE SUPPORT FOR IDLE NODES (optional) +#SuspendProgram= +#ResumeProgram= +#SuspendTimeout= +#ResumeTimeout= +#ResumeRate= +#SuspendExcNodes= +#SuspendExcParts= +#SuspendRate= +#SuspendTime= +# +# +# COMPUTE NODES +NodeName=${SLURM_WORKER} CPUs=${SLURM_CPUS} State=IDLE +PartitionName=LocalQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP diff --git a/cluster/slurm-config/slurmdbd.conf.template b/cluster/slurm-config/slurmdbd.conf.template new file mode 100644 index 0000000..77b08ba --- /dev/null +++ b/cluster/slurm-config/slurmdbd.conf.template @@ -0,0 +1,40 @@ +# +# Sample slurmdbd.conf with tweaks +# see https://slurm.schedmd.com/slurmdbd.conf.html for details +# +ArchiveEvents=yes +ArchiveJobs=yes +ArchiveResvs=yes +ArchiveSteps=no +ArchiveSuspend=no +ArchiveTXN=no +ArchiveUsage=no + +## FA: ArchiveScript was commented out in the reference, not sure why +#ArchiveScript=/usr/sbin/slurm.dbd.archive +## FA: i commented this out because we want to use the default intra-cluster munge auth +AuthType=auth/munge +AuthInfo=/var/run/munge/munge.socket.2 + +# this is actually info for slurmdbd, not for the mariadb instance +DbdHost=${SLURM_DBD_HOST} +DebugLevel=info + +PurgeEventAfter=1month +PurgeJobAfter=12month +PurgeResvAfter=1month +PurgeStepAfter=1month +PurgeSuspendAfter=1month +PurgeTXNAfter=12month +PurgeUsageAfter=24month + +LogFile=/var/log/slurmdbd.log +PidFile=/var/run/slurmdbd.pid +SlurmUser=root + +StorageType=accounting_storage/mysql +StorageHost=${MARIADB_HOST} +StoragePort=${MARIADB_PORT} +StorageLoc=${MARIADB_DATABASE} +StorageUser=${MARIADB_USER} +StoragePass=${MARIADB_PASSWORD} diff --git a/cluster/system-v/default/slurmctld b/cluster/system-v/default/slurmctld new file mode 100644 index 0000000..2cc3524 --- /dev/null +++ b/cluster/system-v/default/slurmctld @@ -0,0 +1,2 @@ +# Additional options that are passed to the slurmctld daemon +#SLURMCTLD_OPTIONS="" diff --git a/cluster/system-v/default/slurmd b/cluster/system-v/default/slurmd new file mode 100644 index 0000000..93369be --- /dev/null +++ b/cluster/system-v/default/slurmd @@ -0,0 +1,2 @@ +# Additional options that are passed to the slurmd daemon +#SLURMD_OPTIONS="" diff --git a/cluster/system-v/default/slurmdbd b/cluster/system-v/default/slurmdbd new file mode 100644 index 0000000..8f1fabb --- /dev/null +++ b/cluster/system-v/default/slurmdbd @@ -0,0 +1,2 @@ +# Additional options that are passed to the slurmdbd daemon +#SLURMDBD_OPTIONS="" diff --git a/cluster/system-v/init.d/slurmctld b/cluster/system-v/init.d/slurmctld new file mode 100755 index 0000000..d420b6f --- /dev/null +++ b/cluster/system-v/init.d/slurmctld @@ -0,0 +1,234 @@ +#!/bin/sh +# +# chkconfig: 345 90 10 +# description: SLURM is a simple resource management system which \ +# manages exclusive access o a set of compute \ +# resources and distributes work to those resources. 
+# +# processname: /usr/sbin/slurmctld +# pidfile: /run/slurmctld.pid +# +# config: /etc/default/slurmctld +# +### BEGIN INIT INFO +# Provides: slurmctld +# Required-Start: $remote_fs $syslog $network munge +# Required-Stop: $remote_fs $syslog $network munge +# Should-Start: $named +# Should-Stop: $named +# Default-Start: 2 3 4 5 +# Default-Stop: 0 1 6 +# Short-Description: slurm daemon management +# Description: Start slurm to provide resource management +### END INIT INFO + +BINDIR=/usr/bin +CONFDIR=/etc/slurm +LIBDIR=/usr/local/lib +SBINDIR=/usr/local/sbin + +# Source slurm specific configuration +if [ -f /etc/default/slurmctld ] ; then + . /etc/default/slurmctld +else + SLURMCTLD_OPTIONS="" +fi + +# Checking for slurm.conf presence +if [ ! -f $CONFDIR/slurm.conf ] ; then + if [ -n "$(echo $1 | grep start)" ] ; then + echo Not starting slurmctld + fi + echo slurm.conf was not found in $CONFDIR + echo Please follow the instructions in \ + /usr/share/doc/slurmctld/README.Debian + exit 0 +fi + + +DAEMONLIST="slurmctld" +test -f $SBINDIR/slurmctld || exit 0 + +#Checking for lsb init function +if [ -f /lib/lsb/init-functions ] ; then + . /lib/lsb/init-functions +else + echo Can\'t find lsb init functions + exit 1 +fi + +# setup library paths for slurm and munge support +export LD_LIBRARY_PATH=$LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} + +get_daemon_description() +{ + case $1 in + slurmd) + echo slurm compute node daemon + ;; + slurmctld) + echo slurm central management daemon + ;; + *) + echo slurm daemon + ;; + esac +} + +start() { + desc="$(get_daemon_description $1)" + log_daemon_msg "Starting $desc" "$1" + unset HOME MAIL USER USERNAME + #FIXME $STARTPROC $SBINDIR/$1 $2 + STARTERRORMSG="$(start-stop-daemon --start --oknodo \ + --exec "$SBINDIR/$1" -- $2 2>&1)" + STATUS=$? + log_end_msg $STATUS + if [ "$STARTERRORMSG" != "" ] ; then + echo $STARTERRORMSG + fi + touch /var/lock/slurm +} + +stop() { + desc="$(get_daemon_description $1)" + log_daemon_msg "Stopping $desc" "$1" + STOPERRORMSG="$(start-stop-daemon --oknodo --stop -s TERM \ + --exec "$SBINDIR/$1" 2>&1)" + STATUS=$? + log_end_msg $STATUS + if [ "$STOPERRORMSG" != "" ] ; then + echo $STOPERRORMSG + fi + rm -f /var/lock/slurm +} + +getpidfile() { + dpidfile=`grep -i ${1}pid $CONFDIR/slurm.conf | grep -v '^ *#'` + if [ $? = 0 ]; then + dpidfile=${dpidfile##*=} + dpidfile=${dpidfile%#*} + else + dpidfile=/run/${1}.pid + fi + + echo $dpidfile +} + +# +# status() with slight modifications to take into account +# instantiations of job manager slurmd's, which should not be +# counted as "running" +# +slurmstatus() { + base=${1##*/} + + pidfile=$(getpidfile $base) + + pid=`pidof -o $$ -o $$PPID -o %PPID -x $1 || \ + pidof -o $$ -o $$PPID -o %PPID -x ${base}` + + if [ -f $pidfile ]; then + read rpid < $pidfile + if [ "$rpid" != "" -a "$pid" != "" ]; then + for i in $pid ; do + if [ "$i" = "$rpid" ]; then + echo "${base} (pid $pid) is running..." + return 0 + fi + done + elif [ "$rpid" != "" -a "$pid" = "" ]; then +# Due to change in user id, pid file may persist +# after slurmctld terminates + if [ "$base" != "slurmctld" ] ; then + echo "${base} dead but pid file exists" + fi + return 1 + fi + + fi + + if [ "$base" = "slurmctld" -a "$pid" != "" ] ; then + echo "${base} (pid $pid) is running..." 
+ return 0 + fi + + echo "${base} is stopped" + + return 3 +} + +# +# stop slurm daemons, +# wait for termination to complete (up to 10 seconds) before returning +# +slurmstop() { + for prog in $DAEMONLIST ; do + stop $prog + for i in 1 2 3 4 + do + sleep $i + slurmstatus $prog + if [ $? != 0 ]; then + break + fi + done + done +} + +# +# The pathname substitution in daemon command assumes prefix and +# exec_prefix are same. This is the default, unless the user requests +# otherwise. +# +# Any node can be a slurm controller and/or server. +# +case "$1" in + start) + start slurmctld "$SLURMCTLD_OPTIONS" + ;; + startclean) + SLURMCTLD_OPTIONS="-c $SLURMCTLD_OPTIONS" + start slurmctld "$SLURMCTLD_OPTIONS" + ;; + stop) + slurmstop + ;; + status) + for prog in $DAEMONLIST ; do + slurmstatus $prog + done + ;; + restart) + $0 stop + $0 start + ;; + force-reload) + $0 stop + $0 start + ;; + condrestart) + if [ -f /var/lock/subsys/slurm ]; then + for prog in $DAEMONLIST ; do + stop $prog + start $prog + done + fi + ;; + reconfig) + for prog in $DAEMONLIST ; do + PIDFILE=$(getpidfile $prog) + start-stop-daemon --stop --signal HUP --pidfile \ + "$PIDFILE" --quiet $prog + done + ;; + test) + for prog in $DAEMONLIST ; do + echo "$prog runs here" + done + ;; + *) + echo "Usage: $0 {start|startclean|stop|status|restart|reconfig|condrestart|test}" + exit 1 + ;; +esac diff --git a/cluster/system-v/init.d/slurmd b/cluster/system-v/init.d/slurmd new file mode 100755 index 0000000..c664cbb --- /dev/null +++ b/cluster/system-v/init.d/slurmd @@ -0,0 +1,234 @@ +#!/bin/sh +# +# chkconfig: 345 90 10 +# description: SLURM is a simple resource management system which \ +# manages exclusive access o a set of compute \ +# resources and distributes work to those resources. +# +# processname: /usr/sbin/slurmd +# pidfile: /run/slurmd.pid +# +# config: /etc/default/slurmd +# +### BEGIN INIT INFO +# Provides: slurmd +# Required-Start: $remote_fs $syslog $network munge +# Required-Stop: $remote_fs $syslog $network munge +# Should-Start: $named +# Should-Stop: $named +# Default-Start: 2 3 4 5 +# Default-Stop: 0 1 6 +# Short-Description: slurm daemon management +# Description: Start slurm to provide resource management +### END INIT INFO + +BINDIR=/usr/bin +CONFDIR=/etc/slurm +LIBDIR=/usr/local/lib +SBINDIR=/usr/local/sbin + +# Source slurm specific configuration +if [ -f /etc/default/slurmd ] ; then + . /etc/default/slurmd +else + SLURMD_OPTIONS="" +fi + +# Checking for slurm.conf presence +if [ ! -f $CONFDIR/slurm.conf ] ; then + if [ -n "$(echo $1 | grep start)" ] ; then + echo Not starting slurmd + fi + echo slurm.conf was not found in $CONFDIR + echo Please follow the instructions in \ + /usr/share/doc/slurmd/README.Debian + exit 0 +fi + + +DAEMONLIST="slurmd" +test -f $SBINDIR/slurmd || exit 0 + +#Checking for lsb init function +if [ -f /lib/lsb/init-functions ] ; then + . /lib/lsb/init-functions +else + echo Can\'t find lsb init functions + exit 1 +fi + +# setup library paths for slurm and munge support +export LD_LIBRARY_PATH=$LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} + +get_daemon_description() +{ + case $1 in + slurmd) + echo slurm compute node daemon + ;; + slurmctld) + echo slurm central management daemon + ;; + *) + echo slurm daemon + ;; + esac +} + +start() { + desc="$(get_daemon_description $1)" + log_daemon_msg "Starting $desc" "$1" + unset HOME MAIL USER USERNAME + #FIXME $STARTPROC $SBINDIR/$1 $2 + STARTERRORMSG="$(start-stop-daemon --start --oknodo \ + --exec "$SBINDIR/$1" -- $2 2>&1)" + STATUS=$? 
+ log_end_msg $STATUS + if [ "$STARTERRORMSG" != "" ] ; then + echo $STARTERRORMSG + fi + touch /var/lock/slurm +} + +stop() { + desc="$(get_daemon_description $1)" + log_daemon_msg "Stopping $desc" "$1" + STOPERRORMSG="$(start-stop-daemon --oknodo --stop -s TERM \ + --exec "$SBINDIR/$1" 2>&1)" + STATUS=$? + log_end_msg $STATUS + if [ "$STOPERRORMSG" != "" ] ; then + echo $STOPERRORMSG + fi + rm -f /var/lock/slurm +} + +getpidfile() { + dpidfile=`grep -i ${1}pid $CONFDIR/slurm.conf | grep -v '^ *#'` + if [ $? = 0 ]; then + dpidfile=${dpidfile##*=} + dpidfile=${dpidfile%#*} + else + dpidfile=/var/run/${1}.pid + fi + + echo $dpidfile +} + +# +# status() with slight modifications to take into account +# instantiations of job manager slurmd's, which should not be +# counted as "running" +# +slurmstatus() { + base=${1##*/} + + pidfile=$(getpidfile $base) + + pid=`pidof -o $$ -o $$PPID -o %PPID -x $1 || \ + pidof -o $$ -o $$PPID -o %PPID -x ${base}` + + if [ -f $pidfile ]; then + read rpid < $pidfile + if [ "$rpid" != "" -a "$pid" != "" ]; then + for i in $pid ; do + if [ "$i" = "$rpid" ]; then + echo "${base} (pid $pid) is running..." + return 0 + fi + done + elif [ "$rpid" != "" -a "$pid" = "" ]; then +# Due to change in user id, pid file may persist +# after slurmctld terminates + if [ "$base" != "slurmctld" ] ; then + echo "${base} dead but pid file exists" + fi + return 1 + fi + + fi + + if [ "$base" = "slurmctld" -a "$pid" != "" ] ; then + echo "${base} (pid $pid) is running..." + return 0 + fi + + echo "${base} is stopped" + + return 3 +} + +# +# stop slurm daemons, +# wait for termination to complete (up to 10 seconds) before returning +# +slurmstop() { + for prog in $DAEMONLIST ; do + stop $prog + for i in 1 2 3 4 + do + sleep $i + slurmstatus $prog + if [ $? != 0 ]; then + break + fi + done + done +} + +# +# The pathname substitution in daemon command assumes prefix and +# exec_prefix are same. This is the default, unless the user requests +# otherwise. +# +# Any node can be a slurm controller and/or server. +# +case "$1" in + start) + start slurmd "$SLURMD_OPTIONS" + ;; + startclean) + SLURMD_OPTIONS="-c $SLURMD_OPTIONS" + start slurmd "$SLURMD_OPTIONS" + ;; + stop) + slurmstop + ;; + status) + for prog in $DAEMONLIST ; do + slurmstatus $prog + done + ;; + restart) + $0 stop + $0 start + ;; + force-reload) + $0 stop + $0 start + ;; + condrestart) + if [ -f /var/lock/subsys/slurm ]; then + for prog in $DAEMONLIST ; do + stop $prog + start $prog + done + fi + ;; + reconfig) + for prog in $DAEMONLIST ; do + PIDFILE=$(getpidfile $prog) + start-stop-daemon --stop --signal HUP --pidfile \ + "$PIDFILE" --quiet $prog + done + ;; + test) + for prog in $DAEMONLIST ; do + echo "$prog runs here" + done + ;; + *) + echo "Usage: $0 {start|startclean|stop|status|restart|reconfig|condrestart|test}" + exit 1 + ;; +esac diff --git a/cluster/system-v/init.d/slurmdbd b/cluster/system-v/init.d/slurmdbd new file mode 100755 index 0000000..54606d1 --- /dev/null +++ b/cluster/system-v/init.d/slurmdbd @@ -0,0 +1,170 @@ +#!/bin/sh +# +# chkconfig: 345 90 10 +# description: SLURMDBD is a database server interface for \ +# SLURM (Simple Linux Utility for Resource Management). 
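+#
+# note: slurmdbd stores SLURM accounting and job completion data in a
+# MySQL/MariaDB database; in this stack that role is filled by the
+# "accounting" (mariadb) service defined in docker-compose.slurm.yml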
+# +# processname: /usr/sbin/slurmdbd +# pidfile: /run/slurmdbd.pid +# +# config: /etc/default/slurmdbd +# +### BEGIN INIT INFO +# Provides: slurmdbd +# Required-Start: $remote_fs $syslog $network munge +# Required-Stop: $remote_fs $syslog $network munge +# Should-Start: $named mysql +# Should-Stop: $named mysql +# Default-Start: 2 3 4 5 +# Default-Stop: 0 1 6 +# Short-Description: SLURM database daemon +# Description: Start slurm to provide database server for SLURM +### END INIT INFO + +SBINDIR=/usr/local/sbin +LIBDIR=/usr/local/lib +CONFFILE="/etc/slurm/slurmdbd.conf" +DESCRIPTION="slurm-wlm database server interface" +NAME="slurmdbd" + +# Source slurm specific configuration +if [ -f /etc/default/slurmdbd ] ; then + . /etc/default/slurmdbd +else + SLURMDBD_OPTIONS="" +fi + +#Checking for configuration file +if [ ! -f $CONFFILE ] ; then + if [ -n "$(echo $1 | grep start)" ] ; then + echo Not starting slurmdbd + fi + echo $CONFFILE not found + exit 0 +fi + +#Checking for lsb init function +if [ -f /lib/lsb/init-functions ] ; then + . /lib/lsb/init-functions +else + echo Can\'t find lsb init functions + exit 1 +fi + +getpidfile() { + dpidfile=`grep PidFile $CONFFILE | grep -v '^ *#'` + if [ $? = 0 ]; then + dpidfile=${dpidfile##*=} + dpidfile=${dpidfile%#*} + else + dpidfile=/run/slurmdbd.pid + fi + + echo $dpidfile +} + +# setup library paths for slurm and munge support +export LD_LIBRARY_PATH=$LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} + +start() { + + unset HOME MAIL USER USERNAME + log_daemon_msg "Starting $DESCRIPTION" + STARTERRORMSG="$(start-stop-daemon --start --oknodo \ + --exec "$SBINDIR/$NAME" -- $SLURMDBD_OPTIONS 2>&1)" + STATUS=$? + if [ "$STARTERRORMSG" != "" ] ; then + STARTERRORMSG=$(echo $STARTERRORMSG | sed "s/.$//") + log_progress_msg $STARTERRORMSG + else + log_progress_msg $NAME + fi + touch /var/lock/$NAME + log_end_msg $STATUS +} + +stop() { + log_daemon_msg "Stopping $DESCRIPTION" + STOPERRORMSG="$(start-stop-daemon --oknodo --stop -s TERM \ + --exec "$SBINDIR/$NAME" 2>&1)" + STATUS=$? + if [ "$STOPERRORMSG" != "" ] ; then + STOPERRORMSG=$(echo $STOPERRORMSG | sed "s/.$//") + log_progress_msg $STOPERRORMSG + else + log_progress_msg "$NAME" + fi + log_end_msg $STATUS + rm -f /var/lock/$NAME +} + +slurmstatus() { + base=${1##*/} + + pidfile=$(getpidfile) + + pid=`pidof -o $$ -o $$PPID -o %PPID -x slurmdbd` + + if [ -f $pidfile ]; then + read rpid < $pidfile + if [ "$rpid" != "" -a "$pid" != "" ]; then + for i in $pid ; do + if [ "$i" = "$rpid" ]; then + echo "slurmdbd (pid $pid) is running..." + return 0 + fi + done + elif [ "$rpid" != "" -a "$pid" = "" ]; then + echo "slurmdbd is stopped" + return 1 + fi + + fi + + echo "slurmdbd is stopped" + + return 3 +} + +# +# The pathname substitution in daemon command assumes prefix and +# exec_prefix are same. This is the default, unless the user requests +# otherwise. +# +# Any node can be a slurm controller and/or server. 
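+#
+# Rough sketch (not part of the upstream init script): to talk to the MariaDB
+# accounting service in this stack, slurmdbd.conf needs the standard storage
+# options, along the lines of:
+#
+#   StorageType=accounting_storage/mysql
+#   StorageHost=<mariadb host>
+#   StoragePort=<mariadb port>
+#   StorageUser=<mariadb user>
+#   StoragePass=<mariadb password>
+#   StorageLoc=<accounting database name>
+#
+# The placeholders above are illustrative; the concrete values are supplied by
+# the stack's environment/config templates, which aren't shown in this file.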
+# +case "$1" in + start) + start + ;; + stop) + stop + ;; + status) + slurmstatus + ;; + restart) + stop + sleep 1 + start + ;; + force-reload) + $0 stop + $0 start + ;; + condrestart) + if [ -f /var/lock/subsys/slurm ]; then + stop + start + fi + ;; + reconfig) + PIDFILE=$(getpidfile) + start-stop-daemon --stop --signal HUP --pidfile \ + "$PIDFILE" --quiet slurmdbd + ;; + *) + echo "Usage: $0 {start|stop|status|restart|condrestart|reconfig}" + exit 1 + ;; +esac diff --git a/cluster/system-v/init.d/slurmrestd b/cluster/system-v/init.d/slurmrestd new file mode 100755 index 0000000..bb2bbb8 --- /dev/null +++ b/cluster/system-v/init.d/slurmrestd @@ -0,0 +1,170 @@ +#!/bin/sh +# +# chkconfig: 345 90 10 +# description: slurmrestd is a RESTful interface for \ +# SLURM (Simple Linux Utility for Resource Management). +# +# processname: /usr/sbin/slurmrestd +# pidfile: /run/slurmrestd.pid +# +# config: /etc/default/slurmrestd +# +### BEGIN INIT INFO +# Provides: slurmrestd +# Required-Start: $remote_fs $syslog $network munge +# Required-Stop: $remote_fs $syslog $network munge +# Should-Start: $named mysql +# Should-Stop: $named mysql +# Default-Start: 2 3 4 5 +# Default-Stop: 0 1 6 +# Short-Description: SLURM database daemon +# Description: Start slurm to provide database server for SLURM +### END INIT INFO + +SBINDIR=/usr/local/sbin +LIBDIR=/usr/local/lib +CONFFILE="/etc/slurm/slurmrestd.conf" +DESCRIPTION="slurm-wlm database server interface" +NAME="slurmrestd" + +# Source slurm specific configuration +if [ -f /etc/default/slurmrestd ] ; then + . /etc/default/slurmrestd +else + slurmrestd_OPTIONS="" +fi + +#Checking for configuration file +if [ ! -f $CONFFILE ] ; then + if [ -n "$(echo $1 | grep start)" ] ; then + echo Not starting slurmrestd + fi + echo $CONFFILE not found + exit 0 +fi + +#Checking for lsb init function +if [ -f /lib/lsb/init-functions ] ; then + . /lib/lsb/init-functions +else + echo Can\'t find lsb init functions + exit 1 +fi + +getpidfile() { + dpidfile=`grep PidFile $CONFFILE | grep -v '^ *#'` + if [ $? = 0 ]; then + dpidfile=${dpidfile##*=} + dpidfile=${dpidfile%#*} + else + dpidfile=/run/slurmrestd.pid + fi + + echo $dpidfile +} + +# setup library paths for slurm and munge support +export LD_LIBRARY_PATH=$LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} + +start() { + + unset HOME MAIL USER USERNAME + log_daemon_msg "Starting $DESCRIPTION" + STARTERRORMSG="$(start-stop-daemon --start --oknodo \ + --exec "$SBINDIR/$NAME" -- $slurmrestd_OPTIONS 2>&1)" + STATUS=$? + if [ "$STARTERRORMSG" != "" ] ; then + STARTERRORMSG=$(echo $STARTERRORMSG | sed "s/.$//") + log_progress_msg $STARTERRORMSG + else + log_progress_msg $NAME + fi + touch /var/lock/$NAME + log_end_msg $STATUS +} + +stop() { + log_daemon_msg "Stopping $DESCRIPTION" + STOPERRORMSG="$(start-stop-daemon --oknodo --stop -s TERM \ + --exec "$SBINDIR/$NAME" 2>&1)" + STATUS=$? + if [ "$STOPERRORMSG" != "" ] ; then + STOPERRORMSG=$(echo $STOPERRORMSG | sed "s/.$//") + log_progress_msg $STOPERRORMSG + else + log_progress_msg "$NAME" + fi + log_end_msg $STATUS + rm -f /var/lock/$NAME +} + +slurmstatus() { + base=${1##*/} + + pidfile=$(getpidfile) + + pid=`pidof -o $$ -o $$PPID -o %PPID -x slurmrestd` + + if [ -f $pidfile ]; then + read rpid < $pidfile + if [ "$rpid" != "" -a "$pid" != "" ]; then + for i in $pid ; do + if [ "$i" = "$rpid" ]; then + echo "slurmrestd (pid $pid) is running..." 
+ return 0 + fi + done + elif [ "$rpid" != "" -a "$pid" = "" ]; then + echo "slurmrestd is stopped" + return 1 + fi + + fi + + echo "slurmrestd is stopped" + + return 3 +} + +# +# The pathname substitution in daemon command assumes prefix and +# exec_prefix are same. This is the default, unless the user requests +# otherwise. +# +# Any node can be a slurm controller and/or server. +# +case "$1" in + start) + start + ;; + stop) + stop + ;; + status) + slurmstatus + ;; + restart) + stop + sleep 1 + start + ;; + force-reload) + $0 stop + $0 start + ;; + condrestart) + if [ -f /var/lock/subsys/slurm ]; then + stop + start + fi + ;; + reconfig) + PIDFILE=$(getpidfile) + start-stop-daemon --stop --signal HUP --pidfile \ + "$PIDFILE" --quiet slurmrestd + ;; + *) + echo "Usage: $0 {start|stop|status|restart|condrestart|reconfig}" + exit 1 + ;; +esac diff --git a/cluster/test-jobs/.gitignore b/cluster/test-jobs/.gitignore new file mode 100644 index 0000000..d0934e5 --- /dev/null +++ b/cluster/test-jobs/.gitignore @@ -0,0 +1,2 @@ +# ignore results of running jobs +*.out \ No newline at end of file diff --git a/cluster/test-jobs/simplejob.sh b/cluster/test-jobs/simplejob.sh new file mode 100755 index 0000000..d2261a7 --- /dev/null +++ b/cluster/test-jobs/simplejob.sh @@ -0,0 +1,11 @@ +#!/usr/bin/env bash + +# this script is included as a simple example of a job that could be run in the cluster; +# including it in the repo just saves me from having to create it. + +# the folder is currently mapped into the slurm controller and worker nodes +# at the path /opt/test-jobs/ + +# TBC: perhaps use this as a test of whether the cluster can complete jobs? + +echo "Hello from $( hostname ) at $( date )" diff --git a/docker-compose.override.yml b/docker-compose.dev.yml similarity index 97% rename from docker-compose.override.yml rename to docker-compose.dev.yml index 9bfb0ee..4ab47e6 100644 --- a/docker-compose.override.yml +++ b/docker-compose.dev.yml @@ -23,7 +23,7 @@ services: environment: - 'VITE_API=http://localhost:9050' ports: - - "5713:5713" + - "5173:5173" db: ports: diff --git a/docker-compose.slurm.yml b/docker-compose.slurm.yml new file mode 100644 index 0000000..ed83921 --- /dev/null +++ b/docker-compose.slurm.yml @@ -0,0 +1,74 @@ +volumes: + accounting_data: + munge: + shared_tmp: + shared_jobs: + +services: + backend: + build: + args: + - PRIOR_LAYER=backend-slurm + volumes: + - munge:/etc/munge/ + - shared_tmp:/tmp/ + - shared_jobs:/opt/shared-jobs/ + + master: + hostname: ${SLURM_MASTER} + privileged: true + cgroup: private + image: ${REGISTRY_PREFIX}/molevolvr-slurm-node:${IMAGE_TAG} + platform: linux/amd64 + build: + context: ./cluster + cache_from: + - type=registry,ref=${REGISTRY_PREFIX}/molevolvr-slurm-node:${IMAGE_TAG} + cache_to: + - type=inline,ref=${REGISTRY_PREFIX}/molevolvr-slurm-node:${IMAGE_TAG},mode=max + environment: + - CLUSTER_ROLE=controller,dbd + env_file: + - .env + volumes: + - munge:/etc/munge/ + - shared_tmp:/tmp/ + - shared_jobs:/opt/shared-jobs/ + - ./cluster/test-jobs/:/opt/test-jobs/ + # - /sys/fs/cgroup:/sys/fs/cgroup + depends_on: + - accounting + + worker: + hostname: ${SLURM_WORKER} + privileged: true + cgroup: private + image: ${REGISTRY_PREFIX}/molevolvr-slurm-node:${IMAGE_TAG} + platform: linux/amd64 + build: + context: ./cluster + cache_from: + - type=registry,ref=${REGISTRY_PREFIX}/molevolvr-slurm-node:${IMAGE_TAG} + cache_to: + - type=inline,ref=${REGISTRY_PREFIX}/molevolvr-slurm-node:${IMAGE_TAG},mode=max + environment: + - CLUSTER_ROLE=worker + env_file: + 
- .env + volumes: + - munge:/etc/munge/ + - shared_tmp:/tmp/ + - shared_jobs:/opt/shared-jobs/ + - ./cluster/test-jobs/:/opt/test-jobs/ + # - /sys/fs/cgroup:/sys/fs/cgroup + + # used by slurmdbd to store accounting information + accounting: + hostname: ${MARIADB_HOST} + image: mariadb:11.2 + restart: unless-stopped + env_file: + - .env + volumes: + - "./services/mariadb/:/etc/mysql/mariadb.conf.d/:ro" + - accounting_data:/var/lib/mysql diff --git a/docker-compose.yml b/docker-compose.yml index 801ca71..7fa2255 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -4,11 +4,17 @@ volumes: services: backend: hostname: backend-${TARGET_ENV} - image: molevolvr-backend + image: ${REGISTRY_PREFIX}/molevolvr-backend:${IMAGE_TAG} platform: linux/amd64 build: context: ./backend dockerfile: ./docker/Dockerfile + args: + - PRIOR_LAYER=backend-base + cache_from: + - type=registry,ref=${REGISTRY_PREFIX}/molevolvr-backend:${IMAGE_TAG} + cache_to: + - type=inline,ref=${REGISTRY_PREFIX}/molevolvr-backend:${IMAGE_TAG},mode=max env_file: - .env environment: @@ -20,8 +26,13 @@ services: frontend: hostname: frontend-${TARGET_ENV} - image: molevolvr-frontend - build: ./frontend + image: ${REGISTRY_PREFIX}/molevolvr-frontend:${IMAGE_TAG} + build: + context: ./frontend + cache_from: + - type=registry,ref=${REGISTRY_PREFIX}/molevolvr-frontend:${IMAGE_TAG} + cache_to: + - type=inline,ref=${REGISTRY_PREFIX}/molevolvr-frontend:${IMAGE_TAG},mode=max depends_on: - backend diff --git a/frontend/Dockerfile b/frontend/Dockerfile index 7508ba7..50f021a 100644 --- a/frontend/Dockerfile +++ b/frontend/Dockerfile @@ -33,8 +33,8 @@ COPY --from=install /temp/dev/node_modules node_modules COPY . . # run the app in hot-reloading development mode # set up vite to accept connections on any interface, e.g. from outside the -# container, and to always run on port 5713) -CMD [ "vite", "--host", "--port", "5713" ] +# container, and to always run on port 5173) +CMD [ "vite", "--host", "--port", "5173" ] # ----------------------------------------------------------- diff --git a/run_stack.sh b/run_stack.sh index 1a414fb..14964dc 100755 --- a/run_stack.sh +++ b/run_stack.sh @@ -1,64 +1,26 @@ #!/usr/bin/env bash -# NOTES: -# ------- -# This script launches the molevolvr stack in the specified target environment. -# It's invoked as ./run_stack [target_env] [docker-compose args]; if -# [target_env] is not specified, it will attempt to infer it from the repo's -# directory name, and aborts with an error message if it doesn't find a match. -# the remainder of the arguments are passed along to docker compose. -# -# for example, to launch the stack in the "prod" environment with the "up -d" -# command, you would run: ./run_stack prod up -d -# -# the available environments differ in a variety of ways, including: -# - which services they run (prod runs 'nginx', for example, but the dev-y envs -# don't) -# - cores and memory constraints that are applied to the SLURM containers, in -# environments where the job scheduler is enabled -# - what external resources they mount as volumes into the container; for -# example, each environment mounts a different job results folder, but -# environments that process jobs use the same blast and iprscan folders, since -# they're gigantic -# -# these differences between environments are implemented by invoking different -# sets of docker-compose.yml files. 
with the exception of the "app" environment,
-# the "root" compose file, docker-compose.yml, is always used first, and then
-# depending on the environment other compose files are added in, which merge
-# with the root compose configuration. since the app environment only runs the
-# app, it has a separate compose file, docker-compose.apponly.yml, rather than
-# merging with the root and killing nearly all the services except the app
-# service.
-#
-# see the following for details on the semantics of merging compose files:
-# https://docs.docker.com/compose/multiple-compose-files/merge/
-#
-# the current environments are as follows (contact FSA for details):
-# - prod: the production environment, which runs the full stack, including the
-# web app, the job scheduler, and the accounting database. it's the most
-# resource-intensive environment, and is intended for use in production.
-# - dev/staging: these are effectively dev environments that specific users run
-# on the server for testing purposes.
-# - app: a development environment that runs only the frontend and backend, and
-# not the job scheduler or the accounting database. it's intended for use in
-# frontend development, where you don't need to submit jobs or query the
-# accounting database.
-
-
-# if 1, skips invoking ./build_images.sh before running the stack
+
+# ===========================================================================
+# === user-overridable variables
+# ===========================================================================
+
+# if 1, skips pulling images for the target environment before running the stack
+SKIP_PULL=${SKIP_PULL:-0}
+
+# if 1, skips building images for the target environment before running the stack
 SKIP_BUILD=${SKIP_BUILD:-0}
-# command to run after the stack has launched, e.g.
-# in cases where you want to tail some containers after launch -# (by default, it does nothing) -POST_LAUNCH_CMD=":" -# if 1, clears the screen before running the post-launch command -DO_CLEAR="0" # if 1, opens the browser window to the app after launching the stack DO_OPEN_BROWSER=${DO_OPEN_BROWSER:-1} # the URL to open when we invoke the browser -FRONTEND_URL=${FRONTEND_URL:-"http://localhost:5713"} +FRONTEND_URL=${FRONTEND_URL:-"http://localhost:5173"} + + +# =========================================================================== +# === helpers +# =========================================================================== # helper function to print a message and exit with a specific code # in one command @@ -80,10 +42,16 @@ function open_browser() { fi } + # =========================================================================== -# === entrypoint +# === target env resolution and other env/cmd setup for launching the stack # =========================================================================== +# if 1, clears the screen before running the post-launch command +DO_CLEAR="0" +# command to run after the stack has launched +POST_LAUNCH_CMD=":" + # source the .env file and export its contents # among other things, we'll use the DEFAULT_ENV var in it to set the target env set -a @@ -91,22 +59,17 @@ source .env set +a # check if the first argument is a valid target env, and if not attempt -# to infer it from the script's parents' directory name +# to infer it case $1 in - "prod"|"staging"|"dev"|"app") + "prod"|"dev") TARGET_ENV=$1 shift echo "* Selected target environment: ${TARGET_ENV}" ;; *) - # attempt to resolve the target env from the host environment - # (e.g., the hostname, possibly the repo root directory name, etc.) - - # get the name of the script's parent directory - PARENT_DIR=$(basename $(dirname $(realpath $0))) + # resolve the target env HOSTNAME=$(hostname) - - # check if the parent directory name contains a valid target env + if [[ "${HOSTNAME}" = "jravilab" ]]; then TARGET_ENV="prod" STRATEGY="via hostname ${HOSTNAME}" @@ -126,7 +89,7 @@ esac case ${TARGET_ENV} in "prod") DEFAULT_ARGS="up -d" - COMPOSE_CMD="docker compose -f docker-compose.yml -f docker-compose.prod.yml" + COMPOSE_CMD="docker compose -f docker-compose.yml -f docker-compose.slurm.yml -f docker-compose.prod.yml" DO_CLEAR="0" # never launch the browser in production DO_OPEN_BROWSER=0 @@ -135,7 +98,7 @@ case ${TARGET_ENV} in ;; "dev") DEFAULT_ARGS="up -d" - COMPOSE_CMD="docker compose -f docker-compose.yml -f docker-compose.override.yml" + COMPOSE_CMD="docker compose -f docker-compose.yml -f docker-compose.slurm.yml -f docker-compose.dev.yml" DO_CLEAR="1" # watch the logs after, since we detached after bringing up the stack POST_LAUNCH_CMD="${COMPOSE_CMD} logs -f" @@ -145,8 +108,23 @@ case ${TARGET_ENV} in exit 1 esac -# ensure that docker compose can see the target env, so it can, e.g., namespace hosts to their environment + +# vars that configure (and thus need to be visible to) docker-compose and other +# child processes export TARGET_ENV=${TARGET_ENV} +export REGISTRY_PREFIX=${REGISTRY_PREFIX} +export IMAGE_TAG=$( [[ ${TARGET_ENV} = "prod" ]] && echo "latest" || echo ${TARGET_ENV} ) + +# image build controls: +# makes compose use the regular docker cli build command +export COMPOSE_DOCKER_CLI_BUILD=1 +# makes docker use buildkit rather than the legacy build system +export DOCKER_BUILDKIT=1 + + +# =========================================================================== +# === final argument 
processing, stack launch +# =========================================================================== # if any arguments were specified after the target env, use those instead of the default if [ $# -gt 0 ]; then @@ -155,7 +133,7 @@ if [ $# -gt 0 ]; then fi # check if a "control" command is the current first argument; if so, skip the build -if [[ "$1" =~ ^(down|restart|logs|build|shell)$ ]]; then +if [[ "$1" =~ ^(down|restart|logs|build|pull|push|shell)$ ]]; then echo "* Skipping build, since we're running a control command: $1" SKIP_BUILD=1 # also skip the post-launch command so we don't get stuck, e.g., tailing @@ -169,19 +147,20 @@ if [[ "$1" =~ ^(down|restart|logs|build|shell)$ ]]; then fi fi -# if SKIP_BUILD is 0 and 'down' isn't the docker compose command, build images -# for the target env. +# before building, attempt to pull the images for the target env +# the images now include the build cache, which should make rebuilding +# the final layers quick. +if [ "${SKIP_PULL}" != "1" ]; then + echo "* Pulling images for ${TARGET_ENV} (tag: ${IMAGE_TAG})" + ${COMPOSE_CMD} pull || fatal "Failed to pull images for ${TARGET_ENV}" +fi + +# if SKIP_BUILD is 0 and we're not running a control command, build images for +# the target env. # each built image is tagged with its target env, so they don't collide with # each other; in the case of prod, the tag is "latest". -if [ "${SKIP_BUILD}" -eq 0 ]; then - if [ "${TARGET_ENV}" == "prod" ] || [ "${TARGET_ENV}" == "app" ]; then - IMAGE_TAG="latest" - else - IMAGE_TAG="${TARGET_ENV}" - fi - +if [ "${SKIP_BUILD}" != "1" ]; then echo "* Building images for ${TARGET_ENV} (tag: ${IMAGE_TAG})" - # ./build_images.sh ${IMAGE_TAG} || fatal "Failed to build images for ${TARGET_ENV}" ${COMPOSE_CMD} build || fatal "Failed to build images for ${TARGET_ENV}" fi @@ -192,5 +171,5 @@ ${COMPOSE_CMD} ${DEFAULT_ARGS} && \ [[ ${DO_OPEN_BROWSER} = "1" ]] \ && open_browser "${FRONTEND_URL}" \ || exit 0 -) && +) && \ ${POST_LAUNCH_CMD} diff --git a/services/mariadb/README.md b/services/mariadb/README.md new file mode 100644 index 0000000..0abcf9a --- /dev/null +++ b/services/mariadb/README.md @@ -0,0 +1,12 @@ +# MariaDB Configuration + +This folder contains configuration for the MariaDB instance that runs +alongside Slurm; Slurm populates the instance with job accounting +and completion data. + +Note that we would just use the existing PostgreSQL instance that's +used for the rest of the app, but Slurm requires MySQL/MariaDB and +sets up its own schema in the database. + +See https://slurm.schedmd.com/accounting.html for details +about Slurm's accounting system and use of the database. diff --git a/services/mariadb/my.cnf b/services/mariadb/my.cnf new file mode 100644 index 0000000..8a43e79 --- /dev/null +++ b/services/mariadb/my.cnf @@ -0,0 +1,5 @@ +[mariadb] +wait_timeout=28800 +innodb_buffer_pool_size=1024M +innodb_log_file_size=64M +innodb_lock_wait_timeout=900
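+
+# notes (comments only, not settings): the InnoDB values above track the
+# tuning suggested in Slurm's accounting documentation (linked from
+# services/mariadb/README.md): a larger buffer pool, a 64M redo log, and a
+# 900s lock wait timeout for slurmdbd's long-running transactions;
+# wait_timeout is presumably set so idle slurmdbd connections aren't dropped
+# by the server.
+#
+# once the stack is up, accounting can be sanity-checked from the master node
+# with standard Slurm tools, e.g. `sacctmgr show cluster` and `sacct`.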