
Crash with node alpine but not node slim #288

Closed
daveisfera opened this issue Dec 9, 2016 · 23 comments

@daveisfera

daveisfera commented Dec 9, 2016

Our application crashes V8 when running with node:6.9.2-alpine but not with node:6.9.2-slim. Is there anything I can do to help diagnose the cause of this issue?

@LaurentGoderre
Member

What happens when you run docker logs?

@daveisfera
Author

daveisfera commented Dec 9, 2016

There's nothing in the logs for the container, but this is in the dmesg logs on the host:

[19594787.401624] V8 WorkerThread[8217]: segfault at 7f4d0e74eff8 ip 00007f4d0d597d8e sp 00007f4d0e74f000 error 6 in node[7f4d0ccdc000+18b6000]

@LaurentGoderre
Member

Is there a way you can share the Dockerfile?

@daveisfera
Author

The Dockerfile itself is pretty simple, but I can't share our code, so it wouldn't be much use. If I had an idea of where to start looking, I'd work on creating a standalone, minimal reproducer, but I'm not sure where to even start since I don't know what's causing the crash.

@chorrell
Contributor

It would be good to know what modules you are using in your app. This could be a V8 issue under Alpine, given that your app works fine using the slim (Debian) variant.

@pesho
Contributor

pesho commented Dec 12, 2016

I'd work on creating a stand alone, minimal reproducer, but I'm not sure where to even start with that since I'm not sure what's causing the crash.

You can try to isolate if a specific require/import or other action in your application is causing the crash.

@LaurentGoderre
Member

Do you use node-gyp in your alpine Dockerfile?

@daveisfera
Author

Not in the container that's crashing.

@daveisfera
Author

I've been unable to make a minimal reproducer because the point at which it crashes jumps around in our code. It only happens when using ajv to validate GeoJSON against the schema from http://json.schemastore.org/geojson; that's as far as I've been able to isolate the issue.

@LaurentGoderre
Member

Are any of the dependencies using pre-built binaries? Those sometimes have issues. I think the ecosystem in general isn't familiar with Alpine and musl yet, so a lot of the pre-built binaries don't work on Alpine and you sometimes have to compile them from source.

@daveisfera
Author

As far as I can tell, ajv and its dependencies are pure JavaScript, so I don't believe it's an issue with pre-built binaries.

If I remove the GeoJSON validation and just run more data through, the crash still happens. I also ran the tests from ajv against the Alpine base image in hopes that it would reveal something, but all of the tests passed except for one that relies on an unavailable module (see ajv-validator/ajv#404).

@daveisfera
Author

I'm not sure what fixed the issue, but I can no longer reproduce this.

@daveisfera daveisfera reopened this Mar 14, 2017
@daveisfera
Author

daveisfera commented Mar 14, 2017

The crashes started happening again and I was able to make a reproducer:

test_ajv.js

#!/usr/bin/env node
'use strict';

const Ajv = require('ajv');
const validator = new Ajv({ allErrors: true, extendedRefs: false });
const _ = require('lodash');

var fs = require('fs');
var parse = require('csv-parse');

const MAX_STRING_LENGTH = 10000;

const STRING_KEY = {
    id: '/StringKey',
    type: 'string',
    maxLength: MAX_STRING_LENGTH,
};

const COLUMN_KEY = {
    id: '/ColumnKey',
    oneOf: [
        { $ref: '/StringKey' },
        { type: ['boolean', 'number'] },
    ],
};

const COLUMN_ARRAY = {
    type: 'array',
    items: { $ref: '/ColumnKey' },
};

const COLUMN_TYPE = {
    id: '/ColumnType',
    oneOf: [
        { $ref: '/ColumnKey' },
        COLUMN_ARRAY,
    ],
};

validator.addSchema(STRING_KEY);
validator.addSchema(COLUMN_KEY);
validator.addSchema(COLUMN_TYPE);

let numValid = 0;
let header;
let validate;
fs.createReadStream(process.argv[2])
    .pipe(parse({delimiter: ','}))
    .on('data', function(csvrow) {
        if (header) {
            let obj = {}
            _.forEach(header, (val, i) => {
                obj[val] = csvrow[i];
            });
            csvrow = obj;
            const valid = validate(csvrow);
            if (!valid) {
                console.log(csvrow);
            } else {
                numValid += 1;
            }
        } else {
            header = csvrow;
            const schema = {
                type: 'object',
                properties: _.zipObject(header, _.map(header, column => { return { $ref: '/ColumnType'}; })),
            };
            //console.log('s:', schema);
            validate = validator.compile(schema);
        }
    })
    .on('end', () => {
        console.log("numValid:", numValid);
    });

Dockerfile

FROM node:6.10.0-alpine

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

RUN yarn add ajv csv-parse lodash

COPY test_ajv.js /usr/src/app

CMD [ "node", "test_ajv.js" ]

make_ajv_test_csv.py

import random
import csv

random.seed(123456789)

WORDS = ['this', 'works', 'well', 'as', 'a', 'reproducer']

S_COLS = 100
out = csv.DictWriter(open('test_ajv.csv', 'w'), ['n'] + ['s{}'.format(i) for i in range(S_COLS)])
out.writeheader()

for n in range(500):
    row = {'n': n}
    for i in range(S_COLS):
        row['s{}'.format(i)] = ' '.join((random.choice(WORDS) for _ in range(random.randint(1, 20))))
    out.writerow(row)

And then I ran the following:

python ./make_ajv_test_csv.py
mkdir -p /tmp/test_ajv
mv test_ajv.csv /tmp/test_ajv/
docker build -t test_ajv test_ajv/
docker run --rm -v /tmp/test_ajv:/tmp/test_ajv test_ajv node test_ajv.js /tmp/test_ajv/test_ajv.csv

NOTE: I made the test not use the words file from the OS, so it's platform-independent.

@daveisfera daveisfera changed the title Crash with node:6.9.2-alpine but not node:6.9.2-slim Crash with node alpine but not node slim Mar 15, 2017
@daveisfera
Author

daveisfera commented Mar 17, 2017

Here's a completely self contained reproducer that doesn't require reading a CSV file:

#!/usr/bin/env node
'use strict';

const Ajv = require('ajv');
const validator = new Ajv({ allErrors: true, extendedRefs: false });

function getRandomIntInclusive(min, max) {
    min = Math.ceil(min);
    max = Math.floor(max);
    return Math.floor(Math.random() * (max - min + 1)) + min;
}       
        
const MAX_STRING_LENGTH = 10000;

const STRING_KEY = {
    id: '/StringKey',
    type: 'string',
    maxLength: MAX_STRING_LENGTH,
};

const COLUMN_KEY = {
    id: '/ColumnKey',
    oneOf: [
        { $ref: '/StringKey' },
        { type: ['boolean', 'number'] },
    ],  
};          
            
const COLUMN_ARRAY = {
    type: 'array',
    items: { $ref: '/ColumnKey' },
};          
            
const COLUMN_TYPE = {
    id: '/ColumnType',
    oneOf: [    
        { $ref: '/ColumnKey' },
        COLUMN_ARRAY,
    ],      
};          
                
validator.addSchema(STRING_KEY);
validator.addSchema(COLUMN_KEY);
validator.addSchema(COLUMN_TYPE);
            
const WORDS = ['this', 'works', 'well', 'as', 'a', 'reproducer'];
const NUM_COLUMNS = 100;
    
let schema = {
    type: 'object',
    properties: {},
};

let c;
for (c=0; c<NUM_COLUMNS; c++) {
    schema.properties[`s${c}`] = { $ref: '/ColumnType'};
}       
const validate = validator.compile(schema);

const NUM_ROWS = parseInt(process.argv[2] || '500');
console.log(`Testing ${NUM_ROWS} rows`);
let r;
let value;
let i;
for (r=0; r<NUM_ROWS; r++) {
    value = {}
    for (c=0; c<NUM_COLUMNS; c++) {
        const n = getRandomIntInclusive(1, 20);
        const cS = `s${c}`;
        value[cS] = '';
        for (i=0; i<n; i++) {
            value[cS] += ` ${WORDS[getRandomIntInclusive(0, 5)]}`;
        }
    }
    
    validate(value);
}

console.log('Done');

Here's the Dockerfile:

FROM node:6.10.0-alpine

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

RUN yarn add ajv

COPY test_ajv.js /usr/src/app

CMD [ "node", "test_ajv.js" ]

Then run:

docker build -t test_ajv .
docker run --rm test_ajv

@LaurentGoderre
Member

I am wondering if the issue could be because of the PhantomJS used in ajv... I couldn't get PhantomJS to work on Alpine without custom building it.

@daveisfera
Author

PhantomJS is used in the tests of ajv but not in the code itself. If you look at the package.json file, it's a devDependency ( https://github.com/epoberezkin/ajv/blob/071b81099edd30b0eba96afb6ae1e289b77db163/package.json#L93 ).

@LaurentGoderre
Member

It's so weird... I added console.log(value); and the script finishes.

@LaurentGoderre
Member

Hmm, never mind, it crashes at row 443 out of 500.

@LaurentGoderre
Member

It actually seems like a race condition.....

@daveisfera
Author

I'm not familiar with how V8/node executes things, but from looking at the backtrace from the core dump, it looks like it's getting stuck in a recursive call and exhausting the stack. Here's the backtrace from gdb, with most of the repeated frames removed:

#0  0x0000564f3d67e63e in v8::internal::HGlobalValueNumberingPhase::CollectSideEffectsOnPathsToDominatedBlock(v8::internal::HBasicBlock*, v8::internal::HBasicBlock*) ()
#1  0x0000564f3d67e6da in v8::internal::HGlobalValueNumberingPhase::CollectSideEffectsOnPathsToDominatedBlock(v8::internal::HBasicBlock*, v8::internal::HBasicBlock*) ()
#2  0x0000564f3d67e6da in v8::internal::HGlobalValueNumberingPhase::CollectSideEffectsOnPathsToDominatedBlock(v8::internal::HBasicBlock*, v8::internal::HBasicBlock*) ()
...
#1287 0x0000564f3d67e6da in v8::internal::HGlobalValueNumberingPhase::CollectSideEffectsOnPathsToDominatedBlock(v8::internal::HBasicBlock*, v8::internal::HBasicBlock*) ()
#1288 0x0000564f3d67e6da in v8::internal::HGlobalValueNumberingPhase::CollectSideEffectsOnPathsToDominatedBlock(v8::internal::HBasicBlock*, v8::internal::HBasicBlock*) ()
#1289 0x0000564f3d67e6da in v8::internal::HGlobalValueNumberingPhase::CollectSideEffectsOnPathsToDominatedBlock(v8::internal::HBasicBlock*, v8::internal::HBasicBlock*) ()
#1290 0x0000564f3d68020c in v8::internal::HGlobalValueNumberingPhase::AnalyzeGraph() ()
#1291 0x0000564f3d6807dd in v8::internal::HGlobalValueNumberingPhase::Run() ()
#1292 0x0000564f3d6b13c5 in void v8::internal::HGraph::Run<v8::internal::HGlobalValueNumberingPhase>() ()
#1293 0x0000564f3d6be624 in v8::internal::HGraph::Optimize(v8::internal::BailoutReason*) ()
#1294 0x0000564f3d640aec in v8::internal::OptimizedCompileJob::OptimizeGraph() ()
#1295 0x0000564f3d8cc379 in v8::internal::OptimizingCompileDispatcher::CompileTask::Run() ()
#1296 0x0000564f3dc67329 in v8::platform::WorkerThread::Run() ()
#1297 0x0000564f3de923b0 in v8::base::ThreadEntry(void*) ()
#1298 0x00007f97cec3a655 in ?? () from /lib/ld-musl-x86_64.so.1
#1299 0x0000000000000000 in ?? ()

@daveisfera
Author

Here's a further simplified reproducer and it crashes when 81 columns and 394 rows are used:

#!/usr/bin/env node
'use strict';

const Ajv = require('ajv');
const validator = new Ajv({ allErrors: true, extendedRefs: false });

const STRING_KEY = {
    id: '/StringKey',
    type: 'string',
    maxLength: 10000,
};

validator.addSchema(STRING_KEY);

let schema = {
    type: 'object',
    properties: {},
};

const NUM_COLUMNS = parseInt(process.argv[2] || '81');
console.log(`Testing with ${NUM_COLUMNS} columns`);
let c;
for (c=0; c<NUM_COLUMNS; c++) {
    schema.properties[`s${c}`] = { $ref: '/StringKey'};
}
console.log('schema:', schema);
const validate = validator.compile(schema);

let value = {};
for (c=0; c<NUM_COLUMNS; c++) {
    const cS = `s${c}`;
    value[cS] = '';
}
console.log('value:', value);

const NUM_ROWS = parseInt(process.argv[3] || '394');
console.log(`Testing with ${NUM_ROWS} rows`);
let r;
for (r=0; r<NUM_ROWS; r++) {
    validate(value);
}

console.log('Done');

@daveisfera
Author

I just tried with node 6.10.2 and it now crashes with a smaller number of columns (70) but the same number of rows is required (394).

@daveisfera
Author

A fix has been committed upstream: nodejs/node#11991 (comment)
