Investigate flaky test-http-pipeline-regr-3332 on SmartOS #7649

Closed
Trott opened this issue Jul 11, 2016 · 6 comments
Labels
http (Issues or PRs related to the http subsystem), smartos (Issues and PRs related to the SmartOS platform), test (Issues and PRs related to the tests)

Comments

Trott (Member) commented Jul 11, 2016

https://ci.nodejs.org/job/node-test-commit-smartos/3215/nodes=smartos14-64/console

not ok 491 parallel/test-http-pipeline-regr-3332
# FATAL ERROR: node::StreamWrap::DoAlloc(size_t, uv_buf_t*, void*) Out Of Memory
#  1: node::Abort() [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  2: node::OnFatalError(char const*, char const*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  3: node::FatalError(char const*, char const*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  4: node::StreamWrap::OnAllocImpl(unsigned long, uv_buf_t*, void*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  5: node::StreamWrap::OnAlloc(uv_handle_s*, unsigned long, uv_buf_t*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  6: uv__read [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  7: uv__stream_io [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  8: uv__io_poll [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#  9: uv_run [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#10: node::StartNodeInstance(void*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#11: node::Start(int, char**) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
#12: _start [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/node]
  ---
  duration_ms: 1.729
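
For context, the pipelining pattern this test exercises looks roughly like the sketch below (a paraphrase, not the verbatim test source; the request count and buffer size are illustrative values). The abort above happens in StreamWrap::DoAlloc, i.e. while libuv is allocating a read buffer for one of these sockets.

'use strict';
// Minimal sketch of the pipelined-request pattern (illustrative, not the actual test):
// one raw socket writes a large batch of pipelined HTTP requests, and the server
// answers each with a sizable buffer.
const http = require('http');
const net = require('net');

const COUNT = 1000;                        // number of pipelined requests (illustrative)
const big = Buffer.alloc(16 * 1024, 'x');  // per-response payload (illustrative)

let handled = 0;
const server = http.createServer((req, res) => {
  res.end(big, () => {
    if (++handled === COUNT) {
      server.close();    // stop accepting new connections
      res.socket.end();  // finish the pipelined connection so the process can exit
    }
  });
});

server.listen(0, () => {
  const client = net.connect(server.address().port, () => {
    // Write every request at once so they are pipelined on a single connection.
    client.write('GET / HTTP/1.1\r\nHost: localhost\r\n\r\n'.repeat(COUNT));
  });
  client.resume(); // drain the responses without inspecting them
});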
Trott added the http, test, and smartos labels on Jul 11, 2016
Barbosik commented Aug 4, 2016

I caught the same error on an Ubuntu server; my Node.js app just crashed.

misterdjules commented:

My apologies for taking so long to look at this.

@Trott Was a core file generated when this process aborted? If so, is it possible to get access to that core file?

Trott (Member Author) commented Sep 15, 2016

@misterdjules This happened on CI, so if there was a core file, it would have been on the host used by that run. I'm guessing that by now any core file is long gone. To be honest, I don't think I've seen this failure in over a month, so I'm not sure it will be easy to reproduce. I guess we could try running it 10000 times in a stress-test job and see what happens?

misterdjules commented:

@Trott The core file could still be available: by default, core files generated by an application are stored in the global zone of the server running the virtual machine. I'll take a look at every server running the SmartOS VMs currently used for builds, and I'll report back on whether I find that core file.

If we find a core file, we should be able to tell if the out of memory error was due to the process itself using too much memory, or to something else on that machine. My hunch is that it's a case of the latter, but it'd be nice if we could find some evidence of that.
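
For instance (just an illustrative sketch, not something the test currently does), logging the test process's own memory usage while it runs would show whether the crashing process is the one consuming the memory:

// Illustrative only: periodically log this process's memory usage so a future
// failure shows whether the aborting process was the one using the memory.
// The 5-second interval and MB formatting are arbitrary choices.
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);
const timer = setInterval(() => {
  const usage = process.memoryUsage();
  console.error(`rss=${toMB(usage.rss)}MB heapTotal=${toMB(usage.heapTotal)}MB heapUsed=${toMB(usage.heapUsed)}MB`);
}, 5000);
timer.unref(); // don't keep the process alive just for the logging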

How many times did you see this test fail?

Trott (Member Author) commented Sep 15, 2016

How many times did you see this test fail?

It's entirely possible that this was the only time I'd seen it fail on SmartOS. If you can't find hard evidence that there's a problem with the test itself, and if a stress test comes up clean, then I'd be content to close this until it either recurs in CI or someone can provide a way to reproduce the issue with reasonable reliability.

Stress test on 32-bit: https://ci.nodejs.org/job/node-stress-single-test/913/nodes=smartos14-32/console

Stress test on 64-bit: https://ci.nodejs.org/job/node-stress-single-test/914/nodes=smartos14-64/console

misterdjules commented Sep 15, 2016

I won't be able to recover the core file from that crash. However, both stress tests seem to have passed, so I think we can consider this test non-flaky for now, until proven otherwise.

What I'll do, though, is reach out to the Build WG to set up the SmartOS test VMs so that core files are kept for a reasonable amount of time, which will let collaborators investigate future crashes if need be.

Thank you for the help!
