load binary packages ahead of use #1275

Closed
aronatkins opened this issue May 16, 2023 · 2 comments · Fixed by #1407
Labels
feature a feature request or enhancement install 🧺

Comments

@aronatkins
Contributor

Installing binary packages can often hide problems with an environment until runtime.

Take, for example, the Dockerfile:

FROM rstudio/r-base:4.3-jammy

RUN apt-get update && apt-get -y install pkg-config

ARG REPOSITORY=https://packagemanager.rstudio.com/cran/__linux__/jammy/latest
RUN R -s -e "install.packages('rjags', repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "library(rjags)"

Building this image:

docker build \
    --progress=plain \
    --no-cache -f Dockerfile.renv .

The install step succeeds, but the build then fails because library(rjags) cannot load the package: the JAGS shared library (libjags.so.4) is not installed in the image.

#7 [4/4] RUN R -s -e "library(rjags)"
#7 0.385 Loading required package: coda
#7 0.427 Error: package or namespace load failed for ‘rjags’:
#7 0.427  .onLoad failed in loadNamespace() for 'rjags', details:
#7 0.427   call: dyn.load(file, DLLpath = DLLpath, ...)
#7 0.427   error: unable to load shared object '/opt/R/4.3.0/lib/R/library/rjags/libs/rjags.so':
#7 0.427   libjags.so.4: cannot open shared object file: No such file or directory
#7 0.427 Execution halted
#7 ERROR: process "/bin/sh -c R -s -e \"library(rjags)\"" did not complete successfully: exit code: 1

This same situation can be seen with renv:

FROM rstudio/r-base:4.3-jammy

RUN apt-get update && apt-get -y install pkg-config

ARG REPOSITORY=https://packagemanager.rstudio.com/cran/__linux__/jammy/latest

RUN mkdir /content && echo "library(rjags)" > /content/script.R

WORKDIR /content

RUN R -s -e "install.packages('renv', repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "renv::init(bare = TRUE, repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "renv::install('rjags', repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "renv::snapshot()"
RUN R -s -f "script.R"

The error:

 > [9/9] RUN R -s -f "script.R":
#0 0.537 Loading required package: coda
#0 0.580 Error: package or namespace load failed for ‘rjags’:
#0 0.580  .onLoad failed in loadNamespace() for 'rjags', details:
#0 0.580   call: dyn.load(file, DLLpath = DLLpath, ...)
#0 0.580   error: unable to load shared object '/root/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/rjags/4-14/dc00e7bcbec8cd953930ec12fc9c8819/rjags/libs/rjags.so':
#0 0.580   libjags.so.4: cannot open shared object file: No such file or directory
#0 0.580 Execution halted

Part of this is rooted in how R handles package installation: R CMD INSTALL treats source and binary packages differently, and only the source path test-loads the package.

R CMD INSTALL source/binary switch:
https://github.com/wch/r-source/blob/3a15e29d1b5a94e06b99cb9315c562fd8f71bc5b/src/library/tools/R/install.R#L419-L422

R CMD INSTALL package load test (inside do_install_source):
https://github.com/wch/r-source/blob/3a15e29d1b5a94e06b99cb9315c562fd8f71bc5b/src/library/tools/R/install.R#L1814-L1852
https://github.com/wch/r-source/blob/3a15e29d1b5a94e06b99cb9315c562fd8f71bc5b/src/library/tools/R/install.R#L1854-L1861

R CMD INSTALL do_install_binary does not attempt a load:
https://github.com/wch/r-source/blob/3a15e29d1b5a94e06b99cb9315c562fd8f71bc5b/src/library/tools/R/install.R#L480-L502

When the load test was first added, it applied only to the source path (and called library() in-process rather than in an external process):
wch/r-source@7bc912f
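
To make the source/binary distinction concrete, here is a minimal sketch of how a package directory could be classified. It assumes that the presence of a Built field in DESCRIPTION is what marks a package as already built (binary); that is my reading, not something confirmed by the links above. The helper name has_built_field is hypothetical.

```r
# Hypothetical helper: TRUE if the package directory carries a "Built" field.
# Assumption: a "Built" field in DESCRIPTION marks an already-built (binary)
# package, which would skip the source install path and its load test.
has_built_field <- function(pkg_dir) {
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
  "Built" %in% colnames(desc)
}

# Usage: inspect a package that is already installed.
has_built_field(system.file(package = "stats"))
```

Any package that takes the binary path under this kind of check is one that never gets a load test at install time.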

Could renv attempt to validate binary packages by loading them before finalizing the installation? That would let us identify problems at restore-time rather than downstream when running the content.
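
As a rough illustration of what such a validation step might look like, the sketch below tries to load a package in a fresh R process and reports whether that process exited cleanly. The name can_load and the choice of loadNamespace() are assumptions for illustration only; this is not renv's implementation, and it is a lighter check than the load test R runs for source installs.

```r
# Hypothetical helper (not part of renv or R): TRUE if `pkg` loads in a
# fresh R process that searches the library paths in `lib`.
can_load <- function(pkg, lib = .libPaths()) {
  # The child reads the package name and library paths from its arguments,
  # so nothing has to be escaped into the R code itself.
  code <- "a <- commandArgs(TRUE); loadNamespace(a[1], lib.loc = a[-1])"
  rscript <- file.path(R.home("bin"), "Rscript")
  args <- c("--vanilla", "-e", shQuote(code), shQuote(pkg), shQuote(lib))
  # Discard output; only the exit status matters for this check.
  status <- system2(rscript, args, stdout = FALSE, stderr = FALSE)
  isTRUE(status == 0L)
}
```

Running the check in a separate process matters here: a failed dyn.load() or .onLoad() cannot leave the installing session in a half-loaded state, and the exit status gives a simple pass/fail signal.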

@hadley added the "feature" (a feature request or enhancement) and "install 🧺" labels on May 18, 2023
@aronatkins
Contributor Author

I have a proof-of-concept that refuses to install a package into the primary library if it cannot be loaded. I wrote it by reading some of the package-loading code that R runs when installing from source.

test_load_package.R
R_runR <- function(cmd = NULL, Ropts = "", env = "",
                   stdout = TRUE, stderr = TRUE, stdin = NULL) {
  # from tools/check.R; non-windows.
  suppressWarnings(system2(file.path(R.home("bin"), "R"),
                           c(Ropts),
                           stdout, stderr, stdin, input = cmd, env = env))
}

setRlibs <- function() {
  rlibs <- .libPaths()
  rlibs <- paste(rlibs, collapse = .Platform$path.sep)
  rlibs <- shQuote(rlibs)
  c(paste0("R_LIBS=", rlibs),
    "R_ENVIRON_USER=''",
    "R_LIBS_USER='NULL'",
    "R_LIBS_SITE='NULL'")
}

# This produces a (by default single) quoted string for use in a
# command sent to another R process.
# from tools::install.packages
quote_path <- function(path, quote = "'") {
  path <- gsub("\\", "\\\\", path, fixed = TRUE)
  path <- gsub(quote, paste0("\\", quote), path, fixed = TRUE)
  paste0(quote, path, quote)
}

load_package <- function(pkg_name, tmplib) {
  cat(paste0("trying to load: ", pkg_name, " from ", tmplib, "\n"))

  # TODO: What versions of R contain this helper? Do we need our own?
  cmd <- paste0("tools:::.test_load_package('", pkg_name, "', ", quote_path(tmplib), ")")

  opts <- "--no-save --no-restore --no-echo"
  env <- setRlibs()

  out <- R_runR(cmd, Ropts = opts, env = env)
  if (length(out)) {
    cat(paste(c(out, ""), collapse = "\n"))
  }
  if (length(attr(out, "status"))) {
    stop("loading failed", call. = FALSE, domain = NA)
  }
  cat("load successful.\n")
}

safe_install_impl <- function(pkg_name, lib) {
  tmplib <- tempfile()
  dir.create(tmplib)
  on.exit(unlink(tmplib, recursive = TRUE), add = TRUE)

  cat(paste0("lib: ", lib, "\n"))
  cat(paste0("tmplib: ", tmplib, "\n"))

  old_paths <- .libPaths()
  on.exit(.libPaths(old_paths), add = TRUE)
  .libPaths(c(tmplib, lib, .libPaths()))

  install.packages(pkg_name)

  load_package(pkg_name, tmplib)

  tmp_pkg_path <- file.path(tmplib, pkg_name)
  pkg_path <- file.path(lib, pkg_name)

  # FIXME: rename only works when within the same file-system.
  # FIXME: handle when the target already exists.
  cat(paste0("renaming: ", tmp_pkg_path, " => ", pkg_path, "\n"))
  file.rename(tmp_pkg_path, pkg_path)
}

safe_install <- function(pkg_name, lib) {
  tryCatch(
    safe_install_impl(pkg_name, lib),
    error = function(e) {
      e$message <- paste0(
        "Failed to install package ", pkg_name,
        " into ", lib, ": ", e$message)
      stop(e)
    })
}

options(repos = c(CRAN = "https://packagemanager.rstudio.com/cran/__linux__/jammy/latest"))

# FIXME: Determine the target some other way.

lib <- .libPaths()[1L]

safe_install("coda", lib)
safe_install("rjags", lib)

A similar Dockerfile to before, but called Dockerfile.load-bad and using the install script:

FROM rstudio/r-base:4.3-jammy AS without-jags

RUN apt-get update && apt-get -y install pkg-config

FROM without-jags AS installation

COPY test_load_package.R /content/test_load_package.R

WORKDIR /content

RUN R -s -f test_load_package.R

When run, the "install" errs, as we want:

docker build \
    --progress=plain \
    --no-cache-filter installation \
    -f Dockerfile.load-bad .
 > [installation 3/4] RUN R -s -f test_load_package.R:
#9 4.942 trying to load: rjags from /tmp/Rtmp4Z0p5s/file711f80a4b
#9 5.210 Error: package or namespace load failed for ‘rjags’:
#9 5.210  .onLoad failed in loadNamespace() for 'rjags', details:
#9 5.210   call: dyn.load(file, DLLpath = DLLpath, ...)
#9 5.210   error: unable to load shared object '/tmp/Rtmp4Z0p5s/file711f80a4b/rjags/libs/rjags.so':
#9 5.210   libjags.so.4: cannot open shared object file: No such file or directory
#9 5.210 Error: loading failed
#9 5.210 Execution halted
#9 5.211 Error: Failed to install package rjags into /opt/R/4.3.0/lib/R/library: loading failed
#9 5.211 Execution halted
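
The two FIXMEs in test_load_package.R (rename only works within one file system, and the target may already exist) could be handled along these lines. This is a sketch under assumptions: move_dir is a hypothetical helper, and it removes the old target before moving, which is only reasonable because the load test has already passed by that point.

```r
# Hypothetical helper: move directory `from` to `to`, replacing any existing `to`.
move_dir <- function(from, to) {
  # Drop an older copy at the target so stale files cannot survive the move.
  if (file.exists(to)) unlink(to, recursive = TRUE)

  # Fast path: a rename succeeds when both paths share a file system.
  if (suppressWarnings(file.rename(from, to))) return(invisible(to))

  # Fallback: copy the contents into a fresh target, then remove the source.
  dir.create(to, recursive = TRUE)
  files <- list.files(from, all.files = TRUE, full.names = TRUE, no.. = TRUE)
  if (!all(file.copy(files, to, recursive = TRUE)))
    stop("failed to copy ", from, " to ", to, call. = FALSE)
  unlink(from, recursive = TRUE)
  invisible(to)
}
```

In the script above, the final file.rename(tmp_pkg_path, pkg_path) call would become move_dir(tmp_pkg_path, pkg_path).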

@aronatkins
Contributor Author

Re-tested this issue after the resolution of #1611 and the rjags load problem is still detected:

2.586 # Downloading packages -------------------------------------------------------
2.589 - Downloading rjags from CRAN ...               OK [129.7 Kb in 0.86s]
3.658 - Downloading coda from CRAN ...                OK [312.8 Kb in 0.87s]
4.720 Successfully downloaded 2 packages in 3.1 seconds.
4.720
4.721 The following package(s) will be installed:
4.721 - coda  [0.19-4]
4.721 - rjags [4-14]
4.721 These packages will be installed into "/content/renv/library/R-4.3/x86_64-pc-linux-gnu".
4.721
4.721 # Installing packages --------------------------------------------------------
4.792 - Installing coda ...                           OK [installed binary and cached in 1.4s]
6.320 - Installing rjags ...                          FAILED
7.621 /opt/R/4.3.1/lib/R/bin/R --vanilla -s -f '/tmp/RtmpY8ix89/renv-install-7fa1d2a0'
7.621 ================================================================================
7.621
7.621 Error: .onLoad failed in loadNamespace() for 'rjags', details:
7.621   call: dyn.load(file, DLLpath = DLLpath, ...)
7.621   error: unable to load shared object '/content/renv/staging/1/rjags/libs/rjags.so':
7.621   libjags.so.4: cannot open shared object file: No such file or directory
7.621 Execution halted
7.621
7.621 Error: error testing if 'rjags' can be loaded [error code 1]
7.634 Traceback (most recent calls last):
7.634 13: renv::install("rjags", repos = c(CRAN = "https://packagemanager.rstudio.com/cran/__linux__/jammy/latest"))
7.634 12: renv_install_impl(records)
7.634 11: renv_install_staged(records)
7.634 10: renv_install_default(records)
7.634  9: handler(package, renv_install_package(record))
7.634  8: renv_install_package(record)
7.634  7: withCallingHandlers(renv_install_package_impl(record), error = function(e) writef("FAILED"))
7.634  6: renv_install_package_impl(record)
7.634  5: withCallingHandlers(if (isbin) renv_install_test(package), error = function(err) unlink(installpath,
7.634         recursive = TRUE))
7.634  4: renv_install_test(package)
7.634  3: renv_system_exec(command = R(), args = c("--vanilla", "-s", "-f",
7.634         renv_shell_path(script)), action = sprintf("testing if '%s' can be loaded",
7.634         package))
7.634  2: abort(sprintf("error %s [error code %i]", action, status), body = renv_system_exec_details(command,
7.634         args, output))
7.634  1: stop(fallback)
7.636 Execution halted
------
Dockerfile:14
--------------------
  12 |     RUN R -s -e "remotes::install_github('rstudio/renv')"
  13 |     RUN R -s -e "renv::init(bare = TRUE, repos=c(CRAN='${REPOSITORY}'))"
  14 | >>> RUN R -s -e "renv::install('rjags', repos=c(CRAN='${REPOSITORY}'))"
  15 |     RUN R -s -e "renv::snapshot()"
  16 |     RUN R -s -f "script.R"
--------------------
ERROR: failed to solve: process "/bin/sh -c R -s -e \"renv::install('rjags', repos=c(CRAN='${REPOSITORY}'))\"" did not complete successfully: exit code: 1

I used a slightly different Dockerfile to test the main branch of renv:

FROM rstudio/r-base:4.3-jammy

RUN apt-get update && apt-get -y install pkg-config

ARG REPOSITORY=https://packagemanager.rstudio.com/cran/__linux__/jammy/latest

RUN mkdir /content && echo "library(rjags)" > /content/script.R

WORKDIR /content

RUN R -s -e "install.packages('remotes', repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "remotes::install_github('rstudio/renv')"
RUN R -s -e "renv::init(bare = TRUE, repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "renv::install('rjags', repos=c(CRAN='${REPOSITORY}'))"
RUN R -s -e "renv::snapshot()"
RUN R -s -f "script.R"
