Initial version of jest-worker #4497

Merged
merged 8 commits into from
Oct 4, 2017
193 changes: 193 additions & 0 deletions packages/jest-worker/README.md
@@ -0,0 +1,193 @@
# jest-worker

Module for executing heavy tasks under forked processes in parallel, by providing a `Promise` based interface, minimum overhead, and bound workers.

The module works by providing an absolute path of the module to be loaded in all forked processes. Files relative to a node module are also accepted. All methods are exposed on the parent process as promises, so they can be `await`'ed. Child (worker) methods can either be synchronous or asynchronous.

The module also implements support for bound workers. Binding a worker means that, based on certain parameters, the same task will always be executed by the same worker. Bound workers rely on the string returned by the `computeWorkerKey` method. If the string was used before for a task, the call is queued to the worker that processed that task earlier; if not, it is executed by the first available worker and then bound to that worker, so the next call with the same key is processed by the same worker. If you have no preference on the worker executing the task, but you have defined a `computeWorkerKey` method because you want _some_ of the tasks to be bound, you can return `null` from it.
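
The routing rule described above can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not the code in this package: `createRouter` and `route` are hypothetical names, and a round-robin counter stands in for "first available worker".

```javascript
// Hypothetical sketch of sticky routing; not jest-worker's actual code.
function createRouter(numWorkers, computeWorkerKey) {
  const keyToWorker = new Map(); // worker key -> worker index
  let next = 0; // round-robin stand-in for "first available worker"

  return function route(method, ...args) {
    const key = computeWorkerKey(method, ...args);

    // A known key always goes back to the worker that handled it first.
    if (key != null && keyToWorker.has(key)) {
      return keyToWorker.get(key);
    }

    const workerIndex = next;
    next = (next + 1) % numWorkers;

    // Remember the assignment only when a key was returned.
    if (key != null) {
      keyToWorker.set(key, workerIndex);
    }

    return workerIndex;
  };
}

const route = createRouter(3, (method, filename) => filename);

const first = route('transform', '/tmp/foo.js');
const other = route('transform', '/tmp/bar.js');
const again = route('transform', '/tmp/foo.js'); // same worker as `first`
```

Keeping the key-to-worker map on the parent side is what lets a worker's in-memory cache be reused: the parent only has to remember which worker saw a key, not what that worker cached.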

The list of exposed methods can be explicitly provided via the `exposedMethods` option. If it is not provided, it will be obtained by requiring the child module into the main process, and analyzed via reflection. Check the "minimal example" section for a valid one.

## Install

```sh
$ yarn add jest-worker
```


## API

The only exposed method is a constructor (`Worker`) that is initialized by passing the worker path, plus an options object.


### `workerPath: string` (required)

Node module name or absolute path of the file to be loaded in the child processes. Use `require.resolve` to transform a relative path into an absolute one.


### `options: Object` (optional)

#### `exposedMethods: $ReadOnlyArray<string>` (optional)

List of method names that can be called on the child processes from the parent process. You cannot expose any method named like a public `Worker` method, or starting with `_`. If you use method auto-discovery, then these methods will not be exposed, even if they exist.
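
The filtering rules above could be sketched roughly as follows. `discoverExposedMethods` and `fakeChildModule` are hypothetical names used for illustration, and the reserved list simply mirrors the public `Worker` methods documented in this README; the real discovery logic may differ.

```javascript
// Hypothetical helper; the real discovery logic may differ.
function discoverExposedMethods(childModule, reservedNames) {
  return Object.keys(childModule).filter(
    name =>
      typeof childModule[name] === 'function' && // only callable exports
      !name.startsWith('_') && // private-looking names are skipped
      !reservedNames.includes(name), // would clash with Worker methods
  );
}

// Stand-in for the object obtained by requiring the child module.
const fakeChildModule = {
  hello: name => 'Hello, ' + name,
  _internal: () => {},
  end: () => {},
  version: '1.0.0',
};

const exposed = discoverExposedMethods(fakeChildModule, [
  'getStdout',
  'getStderr',
  'end',
]);
```

Here only `hello` survives the filter: `_internal` starts with an underscore, `end` collides with a `Worker` method, and `version` is not a function.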

#### `numWorkers: number` (required)
Member

It says "required" in the heading, but "defaults" in the body. Which is it?

Contributor Author

Good catch! It's optional.


Number of workers to spawn. Defaults to the number of CPUs minus 1.
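
As a rough illustration of that default, the value could be computed from Node's `os` module as shown below. `defaultNumWorkers` is a hypothetical helper, and the lower bound of one worker is an assumption added here so the sketch behaves sensibly on single-CPU machines; the README does not state what happens in that case.

```javascript
const os = require('os');

// Hypothetical helper. The `Math.max(..., 1)` floor is an assumption,
// not something this README specifies.
function defaultNumWorkers(cpuCount = os.cpus().length) {
  return Math.max(cpuCount - 1, 1);
}

const onEightCores = defaultNumWorkers(8); // 7
const onOneCore = defaultNumWorkers(1); // 1
```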

#### `forkOptions: Object` (optional)

Allows customizing all options passed to `child_process.fork`. By default, some values are set (`cwd` and `env`), but you can override them and customize the rest. For a list of valid values, check [the Node documentation](https://nodejs.org/api/child_process.html#child_process_child_process_fork_modulepath_args_options).
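
One plausible way to combine defaults with user overrides is a shallow merge, sketched below. `buildForkOptions` is a hypothetical helper, and the default values are assumptions based only on what this README mentions (`cwd`, `env`, and `silent: true`).

```javascript
// Hypothetical helper; the defaults are assumed from this README.
function buildForkOptions(forkOptions = {}) {
  const defaults = {
    cwd: process.cwd(),
    env: process.env,
    silent: true, // needed for getStdout() / getStderr() to receive output
  };

  // Later sources win, so user-provided values override the defaults.
  return Object.assign({}, defaults, forkOptions);
}

const merged = buildForkOptions({cwd: '/tmp', execArgv: []});
```

With a shallow merge like this, passing `silent: false` in `forkOptions` would silently disable the output streams, which is why the stream methods warn about overriding it.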

#### `computeWorkerKey: (method: string, ...args: Array<any>) => ?string` (optional)

Every time a method exposed via the API is called, `computeWorkerKey` is also called in order to bind the call to a worker. This is useful for workers that are able to cache the result or part of it. You bind calls to a worker by making `computeWorkerKey` return the same identifier for all of them. If you do not want to bind the call to any worker, return `null`.

The callback you provide is called with the method name, plus all the rest of the arguments of the call. Thus, you have full control to decide what to return. Check a practical example on bound workers under the "bound worker usage" section.

By default, no call is bound to any worker.
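
A key function that binds only some calls might look like the sketch below. The method names `transform`, `lint`, and `ping` are made up for illustration; only the callback signature comes from this README.

```javascript
// Hypothetical key function: `transform`, `lint` and `ping` are made-up
// method names used only for this example.
function computeWorkerKey(method, ...args) {
  // Calls that benefit from a per-file cache are bound by filename.
  if (method === 'transform' || method === 'lint') {
    return String(args[0]);
  }

  // Everything else can run on any worker.
  return null;
}

const boundKey = computeWorkerKey('transform', '/tmp/foo.js');
const unboundKey = computeWorkerKey('ping');
```

Returning the filename for both `transform` and `lint` means both methods share a worker for a given file, so a cache filled by one call can be reused by the other.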


## Worker

The returned `Worker` instance has all the exposed methods, plus some additional ones to interact with the workers themselves:


### `getStdout(): Readable`

Returns a `ReadableStream` where the standard output of all workers is piped. Note that the `silent` option of the child workers must be set to `true` to make it work. This is the default set by `jest-worker`, but keep it in mind when overriding options through `forkOptions`.


### `getStderr(): Readable`

Returns a `ReadableStream` where the standard error of all workers is piped. Note that the `silent` option of the child workers must be set to `true` to make it work. This is the default set by `jest-worker`, but keep it in mind when overriding options through `forkOptions`.
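
To give an idea of how the output of several child processes can end up in a single stream, here is a minimal sketch that uses only Node's built-in `stream` module. `mergeStreams` is a hypothetical helper written for this illustration; the package itself lists `merge-stream` as a dependency, and nothing here is meant to describe that library's internals.

```javascript
const {PassThrough} = require('stream');

// Hypothetical helper: pipe several readable streams into one output stream.
function mergeStreams(streams) {
  const output = new PassThrough();
  let open = streams.length;

  if (open === 0) {
    output.end();
  }

  for (const stream of streams) {
    // `end: false` keeps the output open until every source has ended.
    stream.pipe(output, {end: false});
    stream.once('end', () => {
      if (--open === 0) {
        output.end();
      }
    });
  }

  return output;
}

// Two in-memory streams stand in for the stdout of two workers.
const workerA = new PassThrough();
const workerB = new PassThrough();
const mergedOutput = mergeStreams([workerA, workerB]);
```

The merged stream can then be consumed like any other readable, for example with `mergedOutput.pipe(process.stdout)`.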


### `end()`

Terminates all workers. No further calls can be made to the `Worker` instance.


## Minimal example

This example covers the minimal usage:

### File `parent.js`

```javascript
import Worker from 'jest-worker';

async function main() {
  const worker = new Worker(require.resolve('./worker'));
  const result = await worker.hello('Alice'); // "Hello, Alice"
}

main();
```

### File `worker.js`

```javascript
export function hello(param) {
  return 'Hello, ' + param;
}
```


## Standard usage

This example covers the standard usage:

### File `parent.js`

```javascript
import Worker from 'jest-worker';

async function main() {
  // The worker path is the first argument; options go in the second one.
  const myWorker = new Worker(require.resolve('./worker'), {
    exposedMethods: ['foo', 'bar'],
    numWorkers: 4,
  });

  console.log(await myWorker.foo('Alice')); // "Hello from foo: Alice"
  console.log(await myWorker.bar('Bob')); // "Hello from bar: Bob"

  myWorker.end();
}

main();
```

### File `worker.js`

```javascript
export function foo(param) {
  return 'Hello from foo: ' + param;
}

export function bar(param) {
  return 'Hello from bar: ' + param;
}
```


## Bound worker usage

This example covers the usage with a `computeWorkerKey` method:

### File `parent.js`

```javascript
import Worker from 'jest-worker';

// Small helper so the example is self-contained.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function main() {
  // The worker path is the first argument; options go in the second one.
  const myWorker = new Worker(require.resolve('./worker'), {
    computeWorkerKey: (method, filename) => filename,
    exposedMethods: ['transform'],
  });

  // Transform the given file, within the first available worker.
  console.log(await myWorker.transform('/tmp/foo.js'));

  // Wait a bit.
  await sleep(10000);

  // Transform the same file again. Will immediately return because the
  // transformed file is cached in the worker, and `computeWorkerKey` ensures
  // the same worker that processed the file the first time will process it now.
  console.log(await myWorker.transform('/tmp/foo.js'));

  myWorker.end();
}

main();
```

### File `worker.js`

```javascript

import babel from 'babel-core';

const cache = Object.create(null);

export function transform(filename) {
  if (cache[filename]) {
    return cache[filename];
  }

  // jest-worker can handle both immediate results and thenables. If a
  // thenable is returned, it will be await'ed until it resolves.
  return new Promise((resolve, reject) => {
    babel.transformFile(filename, (err, result) => {
      if (err) {
        reject(err);
      } else {
        resolve((cache[filename] = result));
      }
    });
  });
}
```
13 changes: 13 additions & 0 deletions packages/jest-worker/package.json
@@ -0,0 +1,13 @@
{
"name": "jest-worker",
"version": "21.1.0",
"repository": {
"type": "git",
"url": "https://github.com/facebook/jest.git"
},
"license": "BSD-3-Clause",
Member

MIT. All headers as well

"main": "build/index.js",
"dependencies": {
"merge-stream": "^1.0.1"
}
}
173 changes: 173 additions & 0 deletions packages/jest-worker/src/__performance_tests__/test.js
@@ -0,0 +1,173 @@
'use strict';

// eslint-disable-next-line import/no-extraneous-dependencies
const workerFarm = require('worker-farm');
const JestWorker = require('../../build');

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));
const calls = 10000;
const threads = 6;

function testWorkerFarm() {
  return new Promise(async (resolve, reject) => {
    const startTime = Date.now();
    let count = 0;

    async function countToFinish() {
      if (++count === calls) {
        workerFarm.end(api);
        const endTime = Date.now();

        // Let all workers go down.
        await sleep(2000);

        resolve({
          globalTime: endTime - startTime - 2000,
          processingTime: endTime - startProcess,
        });
      }
    }

    const api = workerFarm(
      {
        autoStart: true,
        maxConcurrentCallsPerWorker: 1,
        maxConcurrentWorkers: threads,
      },
      require.resolve('./workers/worker_farm'),
      ['loadTest'],
    );

    // Let all workers come up.
    await sleep(2000);

    const startProcess = Date.now();

    for (let i = 0; i < calls; i++) {
      const promisified = new Promise((resolve, reject) => {
        api.loadTest((err, result) => {
          if (err) {
            reject(err);
          } else {
            resolve(result);
          }
        });
      });

      promisified.then(countToFinish);
    }
  });
}

function testJestWorker() {
  return new Promise(async (resolve, reject) => {
    const startTime = Date.now();
    let count = 0;

    async function countToFinish() {
      if (++count === calls) {
        farm.end();
        const endTime = Date.now();

        // Let all workers go down.
        await sleep(2000);

        resolve({
          globalTime: endTime - startTime - 2000,
          processingTime: endTime - startProcess,
        });
      }
    }

    // The README documents the option as `numWorkers`, so that name is used
    // here to make sure both libraries run with the same number of processes.
    const farm = new JestWorker(require.resolve('./workers/jest_worker'), {
      exposedMethods: ['loadTest'],
      forkOptions: {execArgv: []},
      numWorkers: threads,
    });

    farm.getStdout().pipe(process.stdout);
    farm.getStderr().pipe(process.stderr);

    // Let all workers come up.
    await sleep(2000);

    const startProcess = Date.now();

    for (let i = 0; i < calls; i++) {
      const promisified = farm.loadTest();

      promisified.then(countToFinish);
    }
  });
}

function profile(x) {
  console.profile(x);
}

function profileEnd(x) {
  console.profileEnd(x);
}

async function main() {
  if (!global.gc) {
    console.log('GC not present');
  }

  const wFResults = [];
  const jWResults = [];

  for (let i = 0; i < 10; i++) {
    console.log('-'.repeat(75));

    profile('worker farm');
    const wF = await testWorkerFarm();
    profileEnd('worker farm');
    await sleep(3000);
    // eslint-disable-next-line no-undef
    global.gc && gc();

    profile('jest worker');
    const jW = await testJestWorker();
    profileEnd('jest worker');
    await sleep(3000);
    // eslint-disable-next-line no-undef
    global.gc && gc();

    wFResults.push(wF);
    jWResults.push(jW);

    console.log('jest-worker:', jW);
    console.log('worker-farm:', wF);
  }

  let wFGT = 0;
  let wFPT = 0;
  let jWGT = 0;
  let jWPT = 0;

  for (let i = 0; i < 10; i++) {
    wFGT += wFResults[i].globalTime;
    wFPT += wFResults[i].processingTime;

    jWGT += jWResults[i].globalTime;
    jWPT += jWResults[i].processingTime;
  }

  console.log('-'.repeat(75));
  console.log('total worker-farm:', {wFGT, wFPT});
  console.log('total jest-worker:', {jWGT, jWPT});

  console.log('-'.repeat(75));
  console.log(
    `% improvement over ${calls} calls (global time):`,
    100 * (wFGT - jWGT) / wFGT,
  );

  console.log(
    `% improvement over ${calls} calls (processing time):`,
    100 * (wFPT - jWPT) / wFPT,
  );
}

main();
@@ -0,0 +1,7 @@
'use strict';

const pi = require('./pi');

module.exports.loadTest = function() {
  return pi();
};