Compilation fails when run on a server with 122GB of RAM #164

andreausu · 2017-05-29T10:44:08Z

Hello,

we're running into a weird issue on our CI pipeline when we use an AWS server with 122GB of RAM (i3.4xlarge):

./node_modules/.bin/elm-make src/Main.elm 
[                                                  ] - 0 / 2Stack space overflow: current size 99136 bytes.
Use `+RTS -Ksize -RTS' to increase it.
elm-make: thread blocked indefinitely in an MVar operation

Using an i3.2xlarge instance instead, which "only" has 61 GB of RAM, works just fine.

The build runs inside a Docker container, so we are 100% sure that the software and environment are identical between the two nodes. The underlying OS is identical as well, since we spin both nodes up from the same AMI in a fully automated manner.

We've also tried limiting the available RAM via docker-compose configuration to no avail.

Do you have any idea of what's going on or how to debug this further?

Elm version: elm-make 0.18 (Elm Platform 0.18.0)
Base docker image: elixir:1.4.2 (Debian Jessie)

Best,
Andrea


process-bot · 2017-05-29T10:44:08Z

Thanks for the issue! Make sure it satisfies this checklist. My human colleagues will appreciate it!

Here is what to expect next, and if anyone wants to comment, keep these things in mind.

andys8 · 2017-09-15T17:42:37Z

I'm experiencing the same issue on CI with Jenkins running on a Kubernetes cluster on AWS infrastructure. I can't say for now which EC2 instance type is used.

alienscience · 2017-09-18T16:41:33Z

We have also hit this issue when running elm-make from Kubernetes on servers with 120GB RAM:

Stack space overflow: current size 99136 bytes.
Use `+RTS -Ksize -RTS' to increase it.
elm-make: thread blocked indefinitely in an MVar operation

This much memory is not available to elm-make; Kubernetes limits the build to 4GB. We have noticed that when Kubernetes runs the build on a smaller node (62GB or less), the build succeeds without a stack overflow.

andys8 · 2017-10-02T13:14:06Z

Our workaround for now is to build our own version of elm-make from source with the GHC flag -rtsopts. This enables Haskell runtime (RTS) flags to be passed at run time: -N, -M and -K can be used to adjust CPU and memory usage.
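For readers who have not used RTS options before, here is a minimal sketch of what such an invocation could look like. It assumes an elm-make binary rebuilt with -rtsopts as described above. The binary path, the helper name `rts_flags`, and the numeric values are illustrative assumptions, not settings taken from this thread.

```shell
# Assumes ./elm-make was rebuilt from source with GHC's -rtsopts flag;
# the stock 0.18 binary is reported in this thread not to accept these options.
# The values below are illustrative, not tuned recommendations.
ELM_MAKE="${ELM_MAKE:-./elm-make}"   # hypothetical path to the rebuilt binary

# rts_flags is a hypothetical helper that assembles the RTS argument string.
rts_flags() {
  # -N<n>    number of capabilities (OS threads running Haskell code)
  # -M<size> maximum heap size
  # -K<size> maximum stack size
  echo "+RTS -N${1:-1} -M${2:-2g} -K${3:-512m} -RTS"
}

# Print the command for inspection; remove the leading echo to actually run it.
echo "$ELM_MAKE" src/Main.elm $(rts_flags 1 2g 512m)
```

Everything between `+RTS` and `-RTS` is consumed by the Haskell runtime rather than by elm-make itself, which is why the binary has to be linked with -rtsopts for this to work.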

evancz · 2018-03-07T03:26:23Z

I think this is the same as elm/compiler#1473 and is related to various oddities in Haskell (e.g. multi-threaded GC and CPU miscounting).

Anyway, there is tons of advice in that other issue, and it is becoming clearer how to work around the Haskell oddities in our binaries.

andys8 · 2018-03-07T09:22:59Z

@evancz There is no known workaround for the memory issues that does not involve recompiling the Elm compiler. The CPU issue could be solved in the same way, by enabling rtsopts, but it isn't the same issue.

The way I understand it, the merged changes to node-elm-compiler (rtfeldman/node-elm-compiler#65) pass flags to the compiler, but they only work if rtsopts is enabled, which is not the case.

It would be a big enhancement and a simple solution to add the flag by default. Otherwise teams have to recompile the compiler, host it and make it available in CI builds. It is hard to promote Elm if it starts with a quirky CI setup like this. I would appreciate the change a lot, and it would make the CPU configuration easier, too.

zwilias · 2018-03-07T11:21:11Z

That is probably what will happen.

The gist is that optimal (and in extreme cases like this, workable) RTS settings depend on specifics of the hardware. Potentially, a binary could "self configure" based on this information. If that is not possible in a reasonable timespan, providing sane defaults and the option to override without recompiling the binaries sounds like a good alternative.
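As a rough illustration of the "self configure" idea, a wrapper script could derive RTS settings from the machine it runs on. The sketch below is an assumption-laden example, not something the Elm binaries do: the helper name `choose_caps` and the cap of 4 are hypothetical, and the resulting flags would only be accepted by a binary built with rtsopts enabled.

```shell
# Hypothetical policy: use the detected core count, but never more than a cap.
# choose_caps CORES [MAX] prints the number of capabilities to request.
choose_caps() {
  cores="$1"
  max="${2:-4}"   # illustrative cap, not a recommendation from this thread
  if [ "$cores" -gt "$max" ]; then
    echo "$max"
  else
    echo "$cores"
  fi
}

# nproc may report the host's cores inside a container unless a cpuset is
# applied, so the detected value is treated only as an upper bound.
detected="$(nproc 2>/dev/null || echo 1)"

# Print the flags a wrapper would append to the compiler invocation.
echo "+RTS -N$(choose_caps "$detected") -RTS"
```

A wrapper like this only covers the CPU side; choosing heap and stack limits would need similar logic based on the memory actually available to the container.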

As mentioned in this comment, we're looking into improving the situation 👍

andys8 · 2018-03-07T11:41:46Z

Thanks for the update regarding the current state.

andreausu mentioned this issue May 29, 2017

Compilation fails when run on a server with 122GB of RAM elm-lang/elm-repl#154

Closed

andys8 mentioned this issue Sep 16, 2017

Performance reduced when running on multi-core ( which is by default ) #159

Closed

andys8 mentioned this issue Oct 2, 2017

GHC option: Enable rtsopts #179

Open

evancz closed this as completed Mar 7, 2018

andys8 mentioned this issue Mar 7, 2018

Add support for +RTS/-RTS flags from GHC rtfeldman/node-elm-compiler#65

Merged
