Commit

add filtering of base images from containerit
also update project website urls,
preliminary implementation related to o2r-project#105
nuest committed Aug 15, 2018
1 parent 4bd6304 commit e4e2aed
Showing 11 changed files with 152 additions and 58 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -73,7 +73,7 @@ ARG META_VERSION
# Metadata http://label-schema.org/rc1/
LABEL maintainer="o2r-project <https://o2r.info>" \
org.label-schema.vendor="o2r project" \
org.label-schema.url="http://o2r.info" \
org.label-schema.url="https://o2r.info" \
org.label-schema.name="o2r muncher" \
org.label-schema.description="ERC execution and CRUD" \
org.label-schema.version=$VERSION \
8 changes: 5 additions & 3 deletions README.md
@@ -2,7 +2,7 @@

[![Build Status](https://travis-ci.org/o2r-project/o2r-muncher.svg?branch=master)](https://travis-ci.org/o2r-project/o2r-muncher) [![](https://images.microbadger.com/badges/image/o2rproject/o2r-muncher.svg)](https://microbadger.com/images/o2rproject/o2r-muncher "Get your own image badge on microbadger.com") [![](https://images.microbadger.com/badges/version/o2rproject/o2r-muncher.svg)](https://microbadger.com/images/o2rproject/o2r-muncher "Get your own version badge on microbadger.com")

Node.js implementation of the endpoints `/api/v1/compendium` (reading and metadata update) and `/api/v1/jobs` of the [o2r-web-api](https://o2r.info/o2r-web-api/).
Node.js implementation of the endpoints `/api/v1/compendium` (reading and metadata update) and `/api/v1/jobs` of the [o2r API](https://o2r.info/api/).

Requirements:

@@ -48,7 +48,9 @@ You can override these environment variables (configured in `config/config.js`)
- `MUNCHER_CONTAINERIT_IMAGE`
Docker image name and tag for containerit tool, defaults to running Rocker's [geospatial](https://github.com/rocker-org/geospatial/) image with [containerit](https://github.com/o2r-project/containerit/) pre-installed, i.e. `o2rproject/containerit:geospatial`.
- `MUNCHER_CONTAINERIT_USER`
The user within the container, which must match the used image (see previous setting), defaults to `rstudio`, which is suitable for images in the `rocker/verse` stack of images. _Change this for usage with `docker-compose`!
The user within the container, which must match the used image (see previous setting), defaults to `rstudio`, which is suitable for images in the `rocker/verse` stack of images. _Change this_ when running muncher inside a container, or with `docker-compose`!
- `MUNCHER_CONTAINERIT_FILTER_BASE_IMAGE_PKGS`
Gives the `containerit` container access to the Docker socket so that it can extract the packages installed in a container and not install them redundantly, see also [related issue](https://github.com/o2r-project/o2r-muncher/issues/105). _Only works when running muncher inside a container_, or with `docker-compose`!
- `MUNCHER_FAIL_ON_NO_FILES`
Should an error be thrown when files for a compendium that exists in the database are _not found_? Defaults to `false` (useful for testing).
- `MUNCHER_ALLOW_INVALID_METADATA`
@@ -147,7 +149,7 @@ Alternatively, start the component(s) under development from your IDE(s).

You can authenticate locally with OAuth via ORCID using the required configuration parameters (see project [reference-implementation](https://github.com/o2r-project/reference-implementation)).

If you want to upload from the command line, make sure the account has the required [level](https://o2r.info/o2r-web-api/user/#user-levels) (it should [by default](https://github.com/o2r-project/o2r-bouncer#available-environment-variables)), get the session cookie `connect.sid` content out of the browser and use it in the `curl` request:
If you want to upload from the command line, make sure the account has the required [level](https://o2r.info/api/user/#user-levels) (it should [by default](https://github.com/o2r-project/o2r-bouncer#available-environment-variables)), get the session cookie `connect.sid` content out of the browser and use it in the `curl` request:

```bash
curl --cookie connect.sid=s:S1oH7... -F "compendium=@/<path to compendium.zip>;type=application/zip" -F "content_type=compendium"
10 changes: 8 additions & 2 deletions config/config.js
@@ -226,10 +226,16 @@ c.containerit.default_create_options = {
Env: ['O2R_MUNCHER=true', 'O2R_MUNCHER_VERSION=' + c.version],
Memory: 1073741824 * 2, // 2G
MemorySwap: 1073741824 * 4,
User: env.MUNCHER_CONTAINERIT_USER || 'rstudio' // IMPORTANT: this must fit the used image!
User: env.MUNCHER_CONTAINERIT_USER || 'rstudio' // this must fit the used image, so that files outside the container for local testing can be deleted
// and must be 'root' (or a user who can run Docker) for package filtering, see the extra setting below!
};
c.containerit.baseImage = env.MUNCHER_CONTAINERIT_BASE_IMAGE || 'rocker/geospatial:3.4.4';
c.containerit.filterBaseImagePkgs = (yn(env.MUNCHER_CONTAINERIT_FILTER_BASE_IMAGE_PKGS) || 'false').toString().toUpperCase();
c.containerit.filterBaseImagePkgs = {
r_parameter_value: (yn(env.MUNCHER_CONTAINERIT_FILTER_BASE_IMAGE_PKGS) || 'false').toString().toUpperCase(),
enabled: yn(env.MUNCHER_CONTAINERIT_FILTER_BASE_IMAGE_PKGS || 'false'),
user: 'root' // needs access to Docker
};
c.containerit.dInDBind = '/var/run/docker.sock:/var/run/docker.sock';
c.containerit.maintainer = 'o2r';
c.containerit.rm = yn(env.MUNCHER_CONTAINERIT_CONTAINER_RM || 'true');

2 changes: 1 addition & 1 deletion index.js
@@ -219,7 +219,7 @@ function initApp(callback) {
});

const indexResponse = {};
indexResponse.about = 'http://o2r.info';
indexResponse.about = 'https://o2r.info';
indexResponse.versions = {};
indexResponse.versions.current = '/api/v1';
indexResponse.versions.v1 = '/api/v1';
2 changes: 1 addition & 1 deletion lib/executor.js
@@ -628,7 +628,7 @@ function Executor(jobId, compendium) {

fs.open(manifestFile, 'r+', (error) => {
if (error) {
debug('[%s] Manifest file not found at %s', job_id, manifestFile);
debug('[%s] Manifest file not found at %s: %o', job_id, manifestFile, error);
stepUpdate('generate_manifest', 'failure', 'manifest file not found at expected location', (err) => {
if (err) reject(err);
else {
117 changes: 69 additions & 48 deletions lib/manifest.js
@@ -21,6 +21,7 @@ const path = require('path');
const Stream = require('stream');
const Docker = require('dockerode');
const Job = require('../lib/model/job');
const cleanMessage = require('../lib/error-message');

// setup Docker client with default options
var docker = new Docker();
@@ -69,6 +70,11 @@ module.exports.generateManifestForRMarkdown = function (job_id, payload_dir, mai
path_to_workdir_in_container = path.join(config.fs.job, job_id);
}

if(config.containerit.filterBaseImagePkgs.enabled) {
debug('[%s] base image filtering for manifest generation is enabled, must bind Docker socket.', job_id);
binds.push(config.containerit.dInDBind);
}

let path_to_mainfile_in_container = path.join(path_to_workdir_in_container, main_file);

let create_options = Object.assign(
@@ -81,6 +87,12 @@ module.exports.generateManifestForRMarkdown = function (job_id, payload_dir, mai
}
}
);

if(config.containerit.filterBaseImagePkgs.enabled) {
debug('[%s] Overriding user for containerit container to %s', job_id, config.containerit.filterBaseImagePkgs.user);
create_options.User = config.containerit.filterBaseImagePkgs.user;
}

debug('[%s] container create options: %o', job_id, create_options);

let start_options = {};
@@ -89,20 +101,20 @@ module.exports.generateManifestForRMarkdown = function (job_id, payload_dir, mai
// FIXME use some template mechanism instead of string concatenation

// render command can ignore the config.vs.volume distinction
r_render_command = 'CMD_Render(\''
r_render_command = 'containerit::CMD_Render(\''
+ path.join(config.bagtainer.mountLocationInContainer, main_file) + '\', '
+ 'output_dir = \'' + config.bagtainer.mountLocationInContainer + '\', '
+ 'output_file = \'' + display_file + '\')';

// dockerfile command must use the correct files
r_dockerfile_command = 'dockerfile('
r_dockerfile_command = 'containerit::dockerfile('
+ 'from = \'' + path_to_mainfile_in_container + '\', '
+ 'image = \'' + config.containerit.baseImage + '\', '
+ 'maintainer = \'' + config.containerit.maintainer + '\', '
+ 'copy = NA, '
+ 'container_workdir = \'' + path_to_workdir_in_container + '\', '
+ 'cmd = ' + r_render_command + ', '
+ 'filter_baseimage_pkgs = ' + config.containerit.filterBaseImagePkgs + ')';
+ 'filter_baseimage_pkgs = ' + config.containerit.filterBaseImagePkgs.r_parameter_value + ')';

// FIXME hack to remove .html .pdf files matching the main file
main_file_without_extension = path.basename(main_file).replace(path.extname(main_file), '');
@@ -147,6 +159,11 @@ module.exports.generateManifestForRScript = function (job_id, payload_dir, main_
path_to_workdir_in_container = path.join(config.fs.job, job_id);
}

if(config.containerit.filterBaseImagePkgs.enabled) {
debug('[%s] base image filtering for manifest generation is enabled, must bind Docker socket.', job_id);
binds.push(config.containerit.dInDBind);
}

let path_to_mainfile_in_container = path.join(path_to_workdir_in_container, main_file);

let create_options = Object.assign(
@@ -164,17 +181,17 @@ module.exports.generateManifestForRScript = function (job_id, payload_dir, main_
debug('[%s] container start options: %o', job_id, start_options);

// FIXME use some template mechanism instead of string concatenation
r_render_command = 'CMD_Rscript(basename(\''
r_render_command = 'containerit::CMD_Rscript(basename(\''
+ path_to_mainfile_in_container + '\'), vanilla = TRUE)';

r_dockerfile_command = 'dockerfile('
r_dockerfile_command = 'containerit::dockerfile('
+ 'from = \'' + path_to_mainfile_in_container + '\', '
+ 'image = \'' + config.containerit.baseImage + '\', '
+ 'maintainer = \'' + config.containerit.maintainer + '\', '
+ 'copy = NA, '
+ 'container_workdir = \'' + path_to_workdir_in_container + '\', '
+ 'cmd = ' + r_render_command + ', '
+ 'filter_baseimage_pkgs = ' + config.containerit.filterBaseImagePkgs + ')';
+ 'filter_baseimage_pkgs = ' + config.containerit.filterBaseImagePkgs.r_parameter_value + ')';

r_command = 'setwd(\'' + path_to_workdir_in_container + '\'); write(' + r_dockerfile_command + ')'
//+ 'file = \'' + path.join(config.bagtainer.mountLocationInContainer, config.bagtainer.manifestFile) + '\')';
@@ -191,51 +208,55 @@ runContainerit = function (job_id, create_options, start_options, cmd, update, c
debug('[%s] Starting Docker container now with options and command:\n\tcreate_options: %s\n\tstart_options: %s\n\tcmd: %s',
job_id, JSON.stringify(create_options), JSON.stringify(start_options), cmd.join(' '));

const containerLogStream = Stream.Writable();
containerLogStream._write = function (chunk, enc, next) {
msg = Buffer.from(chunk).toString().trim();
debug('[%s] [container] %s', job_id, msg);
update('generate_manifest', null, '[Rendering command: ' + cleanMessage(cmd.join(' ')) + ']', (err) => {
if (err) debug('[%s] Error updating job log from container log stream: %o', job_id, err);

update('generate_manifest', null, msg, (err) => {
if (err) debug('[%s] Error updating job log from container log stream: %o', job_id, err);
const containerLogStream = Stream.Writable();
containerLogStream._write = function (chunk, enc, next) {
msg = Buffer.from(chunk).toString().trim();
debug('[%s] [container] %s', job_id, msg);

next();
});
};

docker.run(config.containerit.image, cmd, containerLogStream, create_options, start_options, (err, data, container) => {
debug('[%s] container running: %o', job_id, container);
if (err) {
debug('[%s] error during manifest creation: %o', job_id, err);
callback(err, null);
} else {
if (data.StatusCode === 0) {
debug('[%s] Completed manifest creation: %o', job_id, data);

// check if manifest was created, then return
callback(null, {
data: data,
manifest: config.bagtainer.manifestFile
});
update('generate_manifest', null, msg, (err) => {
if (err) debug('[%s] Error updating job log from container log stream: %o', job_id, err);

next();
});
};

docker.run(config.containerit.image, cmd, containerLogStream, create_options, start_options, (err, data, container) => {
debug('[%s] container running: %o', job_id, container);
if (err) {
debug('[%s] error during manifest creation: %o', job_id, err);
callback(err, null);
} else {
debug('[%s] Error during manifest container run: %o', job_id, data);
container.logs({
follow: true,
stdout: true,
stderr: true,
timestamps: true
}, function (err, stream) {
if (err)
debug('[%s] Error getting container logs after non-zero status code', job_id);
else {
stream.on('data', function (data) {
debug('[%s] container logs ', job_id, Buffer.from(data).toString().trim());
});
}
});

callback(new Error('Received non-zero statuscode from container: ' + JSON.stringify(data)), null);
if (data.StatusCode === 0) {
debug('[%s] Completed manifest creation: %o', job_id, data);

// check if manifest was created, then return
callback(null, {
data: data,
manifest: config.bagtainer.manifestFile
});
} else {
debug('[%s] Error during manifest container run: %o', job_id, data);
container.logs({
follow: true,
stdout: true,
stderr: true,
timestamps: true
}, function (err, stream) {
if (err)
debug('[%s] Error getting container logs after non-zero status code', job_id);
else {
stream.on('data', function (data) {
debug('[%s] container logs ', job_id, Buffer.from(data).toString().trim());
});
}
});

callback(new Error('Received non-zero statuscode from container: ' + JSON.stringify(data)), null);
}
}
}
});
});
}
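
The changes to `lib/manifest.js` do two things when filtering is enabled: they add the Docker socket to the container's binds and they override the container user. A minimal sketch of that logic as a pure function, without talking to Docker, could look as follows. The function name `buildCreateOptions`, the sample paths, and placing the binds under `HostConfig.Binds` (the field used by the Docker Engine API) are assumptions for illustration; the commit's actual `Object.assign` call is only partly visible in the diff.

```javascript
// Illustrative helper, not part of muncher: assemble container create options
// from the containerit configuration and the binds needed for the job files.
function buildCreateOptions(containerit, binds) {
  const allBinds = binds.slice(); // copy, so the caller's array is not mutated
  const options = Object.assign({}, containerit.default_create_options);

  if (containerit.filterBaseImagePkgs.enabled) {
    // containerit must reach the Docker daemon to list the packages of the base image
    allBinds.push(containerit.dInDBind);
    // the default user (e.g. 'rstudio') usually may not use the socket
    options.User = containerit.filterBaseImagePkgs.user;
  }

  options.HostConfig = { Binds: allBinds };
  return options;
}

// Sample configuration mirroring the values in config/config.js; paths are made up.
const sampleContainerit = {
  default_create_options: { User: 'rstudio' },
  filterBaseImagePkgs: { enabled: true, user: 'root' },
  dInDBind: '/var/run/docker.sock:/var/run/docker.sock'
};

console.log(buildCreateOptions(sampleContainerit, ['/tmp/o2r/job/abc:/erc']));
```

Binding the host's Docker socket gives the container control over the host's Docker daemon, which is one reason to keep the feature behind an explicit setting that defaults to `false`.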
3 changes: 2 additions & 1 deletion package.json
@@ -2,6 +2,7 @@
"name": "muncher",
"version": "0.20.0",
"description": "Node.js implementation of parts of the [o2r web api](http://o2r.info/o2r-web-api).",
"description": "Node.js implementation of parts of the [o2r API](https://o2r.info/api).",
"main": "index.js",
"scripts": {
"start": "DEBUG=muncher,muncher:*,executor:* node index.js",
@@ -12,7 +13,7 @@
"test_ci_async_dump": "DEBUG=*,-modem,-mocha:* mocha ./test/ --tags \"not:storage_access not:image_tarball_upload\" --require test/async-dump",
"test_clean_upload_cache": "rm /tmp/o2r-*.zip"
},
"author": "o2r-project (http://o2r.info)",
"author": "o2r-project (https://o2r.info)",
"license": "Apache-2.0",
"repository": {
"type": "git",
2 changes: 1 addition & 1 deletion test/basic.js
@@ -61,7 +61,7 @@ describe('API', () => {
request(path, (err, res, body) => {
let response = JSON.parse(body);
assert.ifError(err);
assert.equal(response.about, "http://o2r.info");
assert.equal(response.about, "https://o2r.info");
assert.isOk(response.versions);
assert.isOk(response.versions.current);
current = response.versions.current;
44 changes: 44 additions & 0 deletions test/job-manifest.js
@@ -255,4 +255,48 @@ describe('Manifest creation during a job', () => {

});

describe.skip('Manifest generation with skipping base images (only run manually against containerised muncher, otherwise permission issues)', () => {
let job_id = '';
let compendium_id = '';

before(function (done) {
this.timeout(240000); // image tarball saving takes time
createCompendiumPostRequest('./test/workspace/rmd-geospatial', cookie_o2r, 'workspace', (req) => {
request(req, (err, res, body) => {
compendium_id = JSON.parse(body).id;
publishCandidate(compendium_id, cookie_o2r, () => {
startJob(compendium_id, id => {
job_id = id;
waitForJob(job_id, (finalStatus) => {
done();
});
});
});
});
});
});

it('should complete step "generate_manifest"', (done) => {
request(global.test_host + '/api/v1/job/' + job_id, (err, res, body) => {
assert.ifError(err);
let response = JSON.parse(body);
assert.propertyVal(response.steps.generate_manifest, 'status', 'success');
done();
});
});

it('should have the expected content in the manifest', function (done) {
request(global.test_host_transporter + '/api/v1/job/' + job_id + '/data/Dockerfile', (err, res, body) => {
assert.ifError(err);
assert.isNotObject(body, 'response is not JSON');
assert.include(body, 'RUN ["install2.r", "here", "lwgeom"]', 'lwgeom and here packages installed in Dockerfile');
assert.include(body, 'Packages skipped');
assert.match(body, /^# Packages skipped.*sf/m, 'skip sf package');
assert.match(body, /^# Packages skipped.*rmarkdown/m, 'skip rmarkdown package');
assert.match(body, /^# Packages skipped.*yaml/m, 'skip yaml package');
done();
});
});
});

});
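
The manifest assertions in the test above look for package names on a `# Packages skipped` comment line of a multi-line Dockerfile. A minimal sketch of that check is shown below. It uses a regular expression with the `m` flag so that `^` matches at the start of every line, not only at the start of the body. The helper name `skipsPackage` and the sample Dockerfile text are made up for illustration; the exact wording of the comment written by containerit is an assumption.

```javascript
// Escape characters with a special meaning in regular expressions,
// since R package names may contain dots.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Illustrative helper: is `pkg` listed on a "# Packages skipped" line?
function skipsPackage(manifest, pkg) {
  const pattern = new RegExp('^# Packages skipped.*\\b' + escapeRegExp(pkg) + '\\b', 'm');
  return pattern.test(manifest);
}

// Made-up manifest content, only the line prefixes follow the test above.
const sampleManifest = [
  'FROM rocker/geospatial:3.4.4',
  '# Packages skipped (hypothetical wording): rmarkdown, sf, yaml',
  'RUN ["install2.r", "here", "lwgeom"]'
].join('\n');

console.log(skipsPackage(sampleManifest, 'sf'));
console.log(skipsPackage(sampleManifest, 'lwgeom'));
```

Because `.` does not match line breaks, a package that only appears in the `RUN` instruction is not reported as skipped.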
1 change: 1 addition & 0 deletions test/workspace/rmd-geospatial/display.html
@@ -0,0 +1 @@
dummy
19 changes: 19 additions & 0 deletions test/workspace/rmd-geospatial/main.Rmd
@@ -0,0 +1,19 @@
---
title: "Test for R package skipping"
author:
- name: "Daniel Nüst"
affiliation: o2r team
date: 2018-08-08
output: html_document
abstract: "Test workspace"
---

```{r packages}
library("lubridate")
library("units")
library("sf")
library("lwgeom")
library("here")
```

No content.
