Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zlib: memory leak with gunzipSync #1479

Closed
targos opened this issue Apr 20, 2015 · 11 comments
Closed

zlib: memory leak with gunzipSync #1479

targos opened this issue Apr 20, 2015 · 11 comments
Labels
zlib Issues and PRs related to the zlib subsystem.

Comments

@targos
Copy link
Member

targos commented Apr 20, 2015

I was running a script to read sequentially a lot of gzipped files, and print on stdout the result of a computation for each file.
After about 16'000 files, the process just stopped and Killed was printed on the terminal.

I guess that the memory used by the process increases until the kernel decides to kill it.

I could reduce the code to this testcase:

'use strict';

var zlib = require('zlib');

var data = 'abcdefghijklmnopqrstuvwxyz';
var gzipped = zlib.gzipSync(data);

while (true) {
    var contents = zlib.gunzipSync(gzipped);
    process.stdout.write(contents.toString() + '\n');
}
> node test.js > /dev/null
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
[1]    15363 abort (core dumped)  node test.js > /dev/null

This is what I see with top command just before the process disappears:

> top
15363 mzasso    20   0 10,240g 3,716g  11776 R 111,0 24,2   0:53.59 node

NB: a similar code, that would just write data in the while loop keeps a stable memory consumption

@Fishrock123 Fishrock123 added the zlib Issues and PRs related to the zlib subsystem. label Apr 20, 2015
@targos targos changed the title memory leak with zlib.gunzipSync zlib: memory leak with gunzipSync Apr 21, 2015
@targos
Copy link
Member Author

targos commented Apr 25, 2015

It also happens on Windows :

node test.js > NUL
#
# Fatal error in ..\..\src\heap\mark-compact.cc, line 2137
# CHECK(success) failed
#

@Fishrock123
Copy link
Contributor

Can you get some deeper info on it? Maybe run it through valgrind?

@dchusovitin
Copy link

Problem in your code. GC isn't called, because you use only blocking methods.

Run "node --expose_gc test.js"

'use strict';

var zlib = require('zlib');

var data = 'abcdefghijklmnopqrstuvwxyz';
var gzipped = zlib.gzipSync(data);

var step = 0;

while (true) {
    step++;

    var contents = zlib.gunzipSync(gzipped);
    process.stdout.write(contents.toString() + '\n');

    if (step % 1000) {
        gc();
    }
}

@Fishrock123
Copy link
Contributor

it's true. Using while (true) without exit is very bad, and makes it easier to cause your own errors.

@targos
Copy link
Member Author

targos commented Apr 25, 2015

Yeah of course, while(true) was for the example. It is not what I was using in my real script.

I had something like this:

glob('**/*.gz', {
    cwd: config.data
}, function (err, files) {
    files.forEach(treatFile); // about 100k files to unzip and treat
});

Now I am using the async version without any issue. The code was just simpler using gunzipSync...

Thanks for your answers :)

@ChALkeR
Copy link
Member

ChALkeR commented May 10, 2015

https://github.com/iojs/io.js/blob/v2.0.1/lib/zlib.js#L458 — this is the line that causes it.
It delays emitting the close event until the next tick, locking this. That's why that objects are not collected, because in sync code everything is running in a single tick.

Looks like even the *Sync methods were created with async in mind.

@ChALkeR
Copy link
Member

ChALkeR commented May 11, 2015

@dchusovitin You are incorrect. Both incremental and full GC runs are automatically called even in synchronous code. Moreover, your code sample doesn't even change anything (with and without fixing the mistype) except for slowing things down.

@trevnorris
Copy link
Contributor

Looks like even the *Sync methods were created with async in mind.

I think it's more that everything *Sync was created after their async counterparts. If something async is happening in a sync call then it's a bug.

@bnoordhuis
Copy link
Member

But the reason to defer the event is to prevent infinite recursion. I'm not sure if this is fixable without breaking something else.

@al6x
Copy link

al6x commented Jan 1, 2016

Same bug with zlib.gzipSync used in a loop - process out of memory.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 19, 2016

#5707 should fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
zlib Issues and PRs related to the zlib subsystem.
Projects
None yet
Development

No branches or pull requests

7 participants