Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

stream: write small messages cause massive memory usage #6623

Closed
trevnorris opened this issue Dec 2, 2013 · 33 comments
Closed

stream: write small messages cause massive memory usage #6623

trevnorris opened this issue Dec 2, 2013 · 33 comments
Assignees
Milestone

Comments

@trevnorris
Copy link

If I write until writable.write() returns false when writing many small buffers Node consumes a lot of memory and the process seems to halt:

var net = require('net');

net.createServer(function(conn) {
  conn.on('drain', function() {
    write(conn);
  });
  write(conn);
}).listen(8000, function() {
  console.log('Listening on port 8000');
});

var SIZE = 1024;
var b = new Buffer(SIZE);
var cntr = 0;

b.fill('a');

function write(conn) {
  while(conn.write(b)) {
    cntr += SIZE;
  }
}

var t = process.hrtime();
setInterval(function() {
  t = process.hrtime(t);
  var mib = cntr * 8 / 1024 / 1024 / (t[0] + t[1] / 1e9);
  console.log(mib.toFixed(2) + ' Mb/sec');
  cntr = 0;
  t = process.hrtime();
}, 2000);

As soon as the connection is made the setInterval() stops printing and memory begins to fill up quickly.

Here's another example that definitely shouldn't be able to fill up memory:

var net = require('net');

var SIZE = 1024;
var b = new Buffer(SIZE);
var cntr = 0;

b.fill('a');

net.createServer(function(conn) {
  (function write() {
    // Notice here we're passing write() as the flush callback.
    conn.write(b, write);
    cntr += SIZE;
  }());
}).listen(8000, function() {
  console.log('Listening on port 8000');
});

var t = process.hrtime();
setInterval(function() {
  t = process.hrtime(t);
  var mib = cntr * 8 / 1024 / 1024 / (t[0] + t[1] / 1e9);
  console.log(mib.toFixed(2) + ' Mb/sec');
  cntr = 0;
  t = process.hrtime();
}, 2000);

I run these tests by starting up the Node process then running nc localhost 8000 > /dev/null.

The second one doesn't cause the same memory issue in v0.10, while the first one does.

@trevnorris
Copy link
Author

I believe this is related to #6506

@ghost ghost assigned isaacs Dec 2, 2013
@trevnorris
Copy link
Author

Here's a slight variation that doesn't cause the blocking, but does still cause memory to spike rapidly:

var net = require('net');
var PORT = 8000;
var SIZE = 1024;
var b = new Buffer(SIZE).fill('a');
var cntr = 0;

net.createServer(function(conn) {
  write(conn);
}).listen(PORT, function() {
  console.log('Listening on port 8000');
});


function write(conn) {
  var lcntr = 0;
  do {
    cntr += SIZE;
    lcntr += SIZE;
  } while (conn.write(b) && lcntr < 0x40000000);

  setImmediate(write, conn);
}

setInterval(function() {
  var mib = cntr / 1024 / 1024;
  console.log(mib.toFixed(2) + ' MB');
}, 2000);

@trevnorris
Copy link
Author

Here's a possible solution:

diff --git a/lib/_stream_writable.js b/lib/_stream_writable.js
index 6404821..dbd1bc2 100644
--- a/lib/_stream_writable.js
+++ b/lib/_stream_writable.js
@@ -78,6 +78,14 @@ function WritableState(options, stream) {
   // socket or file.
   this.length = 0;

+  // If we're writing many small requests then it's possible to starve
+  // the event loop since write() will always return true and the
+  // callback will always be called.
+  // This keep track of the total number of bytes written before we've
+  // deferred to the event loop. It will defer after HWM bytes have
+  // been written.
+  this._loopWritten = 0;
+
   // a flag to see when we're in the middle of a write.
   this.writing = false;

@@ -234,8 +244,15 @@ function writeOrBuffer(stream, state, chunk, encoding, cb) {
   var len = state.objectMode ? 1 : chunk.length;

   state.length += len;
+  state._loopWritten += len;

   var ret = state.length < state.highWaterMark;
+
+  if (state._loopWritten > state.highWaterMark) {
+    ret = false;
+    state._loopWritten = 0;
+  }
+
   state.needDrain = !ret;

   if (state.writing || state.corked)

Though there're two issues. First is that if there's a callback (i.e. my second example) then it'll still starve the event loop because the callback is called synchronously. Second, it's causing a test to fail.

/cc @tjfontaine @isaacs

@ashelley
Copy link

@trevnorris

I posted this right after you closed the related issue. This comment may still be relevant (I think):

#8048 (comment)

Please note that I don't think this is just memory related. There seems to be latency which i'm not sure is related to the memory utilization problem.

@ashelley
Copy link

Hello,

After reviewing this some more... I have come up with a proposed patched (needs more testing) that I think solves my problem.

I think the problem that exists is that clearbuffer calls .slice on the buffer too much. Basically for every queued write we are manipulating a large array and this causes the extra latency on writes. Instead we should just null the values in the buffer and skip past them.

Here is my proposed patch:

https://gist.github.com/ashelley/bf4cc9dd22aec8ddd8df

EDIT: removed erroneous edits on patch file and please note this applies to v0.10.0

@ashelley
Copy link

Can't figure out the minimum test case but here it is:

FATAL ERROR: JS Allocation failed - process out of memory
Aborted (core dumped)
node -v
v0.10.30

Only thing changing in my code to stop this from happening is changing how often i write to zlib.createDeflate stream.

I know this isn't much help but wanted to document it anyway.

EDIT: probably just related to high memory usage as OP noted

@dipbhi
Copy link

dipbhi commented May 7, 2015

Is there any fix planned for this issue? I am still encountering it in v. 0.12.2 and it's becoming a road block for a product I am working on :(.

@trevnorris
Copy link
Author

@dipbhi I haven't taken another look at this since it my last post here. The quickest solution for you would be to occasionally use setImmediate() before continuing with the write. I'll put this back on my list of things to investigate.

@dipbhi
Copy link

dipbhi commented May 8, 2015

Ok. I will try setImmediate. Thanks for the suggestion.
On 08-May-2015 4:47 am, "Trevor Norris" notifications@github.com wrote:

@dipbhi https://github.com/dipbhi I haven't taken another look at this
since it my last post here. The quickest solution for you would be to
occasionally use setImmediate() before continuing with the write. I'll
put this back on my list of things to investigate.


Reply to this email directly or view it on GitHub
#6623 (comment).

@dipbhi
Copy link

dipbhi commented May 8, 2015

By the way, would it be bad to call setImmediate before each write call?
On 08-May-2015 8:30 am, "Dipen Bhikadya" dipen.bhikadya@gmail.com wrote:

Ok. I will try setImmediate. Thanks for the suggestion.
On 08-May-2015 4:47 am, "Trevor Norris" notifications@github.com wrote:

@dipbhi https://github.com/dipbhi I haven't taken another look at this
since it my last post here. The quickest solution for you would be to
occasionally use setImmediate() before continuing with the write. I'll
put this back on my list of things to investigate.


Reply to this email directly or view it on GitHub
#6623 (comment).

@ashelley
Copy link

ashelley commented May 8, 2015

@dipbhi if you can listen for the write to return false then wait for the drain event to be emitted you probably won't see as much memory usage...

pseudo code:

if(!stream.write(chunk)) {
      // stop writing
     stream.once('drain', function() {
          //continue writing
     });
}

alternatively you can see watch stream._writableState.needDrain then wait for the drain event. This will be more efficient than calling setImmediate on each write AFAIK.

@dipbhi
Copy link

dipbhi commented May 8, 2015

@ashelley,
Nice suggestions, I will try them out. Thank you

On Fri, May 8, 2015 at 10:08 AM, ashelley notifications@github.com wrote:

@dipbhi https://github.com/dipbhi if you can listen for the write to
return false then wait for the drain event to be emitted you probably won't
see as much memory usage...

pseudo code:

if(!stream.write(chunk) {
// stop writing
stream.once('drain', function() {
//continue writing
});
}

alternatively you can see watch stream._writableState.needDrain then wait
for the drain event. This will be more efficient than calling setImmediate
on each write AFAIK.


Reply to this email directly or view it on GitHub
#6623 (comment).

@ashelley
Copy link

ashelley commented May 8, 2015

Actually, after reviewing this again the original example of using the network stream the highWaterMark is never hit as originally reported. So the solution might depend on the type of stream you are working with.

As @trevnorris mentioned setImmediate is your only option if you are using the socket stream. One thing I did notice though that instead of monitoring stream._writableState.needDrain you can monitor stream._writableState.pendingcb to throttle your writes:

var net = require('net');

net.createServer(function(conn) {
  conn.on('drain', function() {
    write(conn);
  });
  write(conn);
}).listen(8000, function() {
});

var SIZE = 1024;
var b = new Buffer(SIZE);
var cntr = 0;

b.fill('a');

function write(conn) {
  while(conn.write(b)) {
    if((conn._writableState.pendingcb * SIZE) > conn._writableState.highWaterMark) {
      return setImmediate(function() {
        write(conn);
      });
    }
    cntr += SIZE;
  } 
}

var t = process.hrtime();
setInterval(function() {
  t = process.hrtime(t);
  var mib = cntr * 8 / 1024 / 1024 / (t[0] + t[1] / 1e9);
  console.log(mib.toFixed(2) + ' Mb/sec');
  cntr = 0;
  t = process.hrtime();
}, 2000);

Edit:

you might get higher throughput if you lower the number of setImmediate calls:

-    if((conn._writableState.pendingcb * SIZE) > conn._writableState.highWaterMark) {
+   if((conn._writableState.pendingcb) > conn._writableState.highWaterMark) {

original:

1758.35 Mb/sec callbacks= 28146
1778.55 Mb/sec callbacks= 28454
1774.77 Mb/sec callbacks= 28392
1724.84 Mb/sec callbacks= 27601
1717.45 Mb/sec callbacks= 27469
1739.78 Mb/sec callbacks= 27833
1765.43 Mb/sec callbacks= 28244
1785.60 Mb/sec callbacks= 28567
1804.45 Mb/sec callbacks= 28867
1778.93 Mb/sec callbacks= 28460

with reduced number of setimmediates:

509.57 Mb/sec callbacks= 8
2094.21 Mb/sec callbacks= 33
2123.88 Mb/sec callbacks= 34
2123.32 Mb/sec callbacks= 34
2113.31 Mb/sec callbacks= 34
2096.39 Mb/sec callbacks= 33
2135.78 Mb/sec callbacks= 34
2128.26 Mb/sec callbacks= 34
2136.09 Mb/sec callbacks= 34
2104.68 Mb/sec callbacks= 33
2161.67 Mb/sec callbacks= 34
2100.41 Mb/sec callbacks= 33

@dipbhi

@trevnorris
Copy link
Author

@ashelley Great points. The addition to reducing the number of calls to setImmediate() is a great tip on improving performance.

@dipbhi
Copy link

dipbhi commented May 11, 2015

@ashelley & @trevnorris
Using setImmediate seems to be an excellent solution to control memory usage with one client server pair, and @ashelley's throttling suggestion has an added advantage.

However when I try to use similar solution with multiple clients (e.g. chat room) memory usage still goes up and server does seem to have a memory leak as well. I am hoping that I can pull your brain a little more to see if there is any workaround for multi-client situation as well.

Here is my sample code (server should be run with --expose-gc):

Server:

var net = require('net'), conns = [];

net.createServer(function(conn) {
    console.log('client connected');
    conns.push(conn);

    conn.on('data', function(buf){
        distribute(conn, buf);
    });

    conn.on('error', function(err){
        //console.error(err);
        removeConn(conn);
    })

    conn.on('end', function(){
        console.log('client disconnected');
        removeConn(conn);
    });

}).listen(8030);

var SIZE = 1024;
var b = new Buffer(SIZE);
var cntr = 0;
b.fill('a');

function distribute(conn, buf) {
    for(var i = 0, len = conns.length; i < len; i++){
        var targetConn = conns[i];
        if(targetConn != conn){
            write(targetConn, buf); 
        }
    } 
}

function write(conn, buf) {
    if(conn){
        if(conn._writableState.pendingcb > conn._writableState.highWaterMark) {
            return setImmediate(function() {
                write(conn, buf);
            });
        }
        else{
            conn.write(buf, function(err){
                if(err){
                    removeConn(conn);
                }
            });
        }
    }
    cntr += SIZE;
}

function removeConn(conn){
    var connIndex = conns.indexOf(conn);
    if(connIndex > 0){
        conns.splice(connIndex, 1);
    }   
}

var t = process.hrtime();
setInterval(function() {
  t = process.hrtime(t);
  var mib = cntr * 8 / 1024 / 1024 / (t[0] + t[1] / 1e9);
  console.log(mib.toFixed(2) + ' Mb/sec');
  cntr = 0;
  t = process.hrtime();
}, 2000);

setInterval(function(){
    console.log('calling gc');
    gc();
}, 30000);

Client:

var cp = require('child_process'),
    numOfChildren = 100,
    runningChildren = numOfChildren;    

for(var i = 0; i < numOfChildren; i++){
    var child = cp.fork('child.js', {silent: true});

    child.on('close', function(){
        runningChildren--;
        if(runningChildren == 0){
            console.log('Execution completed.');
        }
    });

    child.stdout.on('data', function(data){
        console.log('' + data);
    });

    child.stderr.on('data', function(data){
        console.error(data.toString());
    });     

}

Client Child

var net = require('net'), 
    numOfMsgs = 1000, 
    receivedCount = 0, 
    receivedPrintFreq = 1000,
    host = '127.0.0.1',
    totalBufferSize = 0, 
    writeCount = 0;

var conn = net.connect({port: 8030, host: host}, function() {
    console.log('Connected to the server, pid - %d', process.pid);
});

conn.on('data', function(buffer){
    receivedCount++;
    totalBufferSize += buffer.length;
    if(receivedCount % receivedPrintFreq === 0) console.log('%d messages received, buffer size - %d, pid - %d', receivedCount, totalBufferSize, process.pid);
});

conn.on('error', function(err) {
  console.error(err);
});

conn.on('end', function() {
  console.log('Disconnected from the server, pid - %d', process.pid);
});

var SIZE = 1024;
var b = new Buffer(SIZE);

b.fill('a');

for(var i = 0; i < numOfMsgs; i++){
    conn.write(b, function(){
        writeCount++;
        if(writeCount === numOfMsgs){
            setImmediate(function(){
                conn.end(); 
            });
        }
    });
}

Edit
Memory usage/leak gets worse when running server and client on different machines (and using real internet network traffic).

@saquibkhan
Copy link

Can you try on node v0.12.3 ? I tried the first test and i didn't see any memory leaks.

@dipbhi
Copy link

dipbhi commented May 12, 2015

@saquibkhan I was using latest release v0.12.2. Is there any installer available for v0.12.3?

@saquibkhan
Copy link

hmm, v0.12.2 should also be fine. i used below code to test me leaks, i could see me leak on v0.10 but not on v0.12 dev branch

var net = require('net');

net.createServer(function(conn) {
  conn.on('drain', function() {
    write(conn);
  });
  write(conn);
}).listen(8000, function() {
  console.log('Listening on port 8000');
});

var SIZE = 1024;
var b = new Buffer(SIZE);
var cntr = 0;

b.fill('a');

function write(conn) {
  while(conn.write(b)) {
    cntr += SIZE;
  }
}

var t = process.hrtime();
setInterval(function() {
  t = process.hrtime(t);
  var mib = cntr * 8 / 1024 / 1024 / (t[0] + t[1] / 1e9);
  console.log(mib.toFixed(2) + ' Mb/sec');
  cntr = 0;
  t = process.hrtime();
}, 2000);

@dipbhi
Copy link

dipbhi commented May 12, 2015

@saquibkhan OK. How about the multi-client scenario I posted later. Does that leak on the dev branch?

@jasnell
Copy link
Member

jasnell commented Aug 27, 2015

@trevnorris ... what do you want to do with this one?

@trevnorris
Copy link
Author

@jasnell This is an issue because we call the callback sync if all data was flushed immediately. Which prevents the loop from turning over, thus not allowing full cleanup.

We could do a simple patch to force async after N calls. Same thing that libuv does. After 49,000 events are processed in the poll phase it forces the event loop to proceed. Should be a simple patch, but never got around to it.

@d3x0r
Copy link

d3x0r commented Apr 1, 2016

My issue is more along the lines of #8048 but they say that that is a dup of this.

"use strict";

var fs = require( 'fs' );
var wstream;
var fileNum = 1;
var basePath = "./";
var baseName = "output_file";
var maxlimit = 100000000;
var limit = 1000000;
var subCount = 0;
var cardGen = 0;
var single_buffer = new Buffer( 36 );

function doGeneration() {
    var run = true;
        // done generating
        if( cardGen >= maxlimit )
            return;
       cardGen += limit;
    if( !wstream ) {
      console.log( `create streams... ${fileNum}`);
      wstream = fs.createWriteStream(  basePath + baseName + `.${fileNum}.dat` );
      wstream.on( "close", ()=>{ console.log( "Fail closed." );
      setTimeout( ()=>{ global.gc(); doGeneration() }, 10 );
        } );
      wstream.on( "end", ()=>{ console.log( "file ended." );} );
      fileNum++;
    }
    for(subCount = 0; subCount < limit; subCount++ ) {
        wstream.write( single_buffer );
    }
    {
      console.log( "end stream?" );
      wstream.end();
      wstream = null;
    }
}

doGeneration();


That sample even uses a setTimeout after the file closes... so everything should be flushed and done before the next iteration of this process.
My Original process I noticed the problem when I didn't use setTimeout(); but when I would, then I would get huge delays between runs when nothing was happening (no disk, no memory)... this stripped sample slowly runs out of memory even with the global.gc(); on every file close... takes a couple minutes; generates a few hundred megs of data.

node --expose-gc --max-old-space-size=512 test.js >err 2>&1
(hmm... still used over 1.6G even with smaller limit )

but unfortunately that doesn't really run out of memory like my actual process though... but making it might give me a work around to wait until the close event to dispatch another run.

@ashelley
Copy link

ashelley commented Apr 1, 2016

@d3x0r if you listen for the return value of write and wait for the drain event to happen on wstream before continuing you probably won't see the problem. in other words... you can't write in a for loop without listening to the return value of write. when write returns false you have to wstream.once('drain', () => {//start running loop again})

@d3x0r
Copy link

d3x0r commented Apr 2, 2016

@ashelly thanx; I will attempt that monday... other fish to shish-kabob. I see there's often a 'right' way that's the only way. I'm a little astounded that the file can't just evaporate with all it's buffers... even if it takes a 'while' ... at least scheduling in the on("close",()=>{ global.gc(); doGenerate() } ); /* generator.next()? */ it's hovering at 1.2G more than halfway done.
I still don't know which direction to go to simulate the stall condition though; it's not the first time that I did setTimeout(f,0) and has the program just 'stop' but using setTimeout(f, 8); worked OK. and that is in a serial process that does a bit of work on a single clause of data and then posts the results and reschedules to get another. The read/write is actually in in the cluster.master. but I guess that could use once( 'drain' ) /* ?once? nice. */

"use strict";

var fs = require( 'fs' );
var fileNum = 1;
var basePath = "./";
var baseName = "output_file";
var maxlimit = 100000000;
var limit = 1000000;
var cardGen = 0;
var single_buffer = new Buffer( 36 );

function doGeneration() {
   // for( generation = 0; gneration < 1000; generation++ ) { 
    var wstream;
    if( cardGen >= maxlimit )   return;
    if( !wstream ) {
      //console.log( `create streams... ${fileNum}`);
      wstream = fs.createWriteStream(  basePath + baseName + `.${fileNum}.dat` );
      fileNum++;
    }
    for(var subCount = 0; subCount < limit; subCount++ )  wstream.write( single_buffer );
    cardGen += subCount;
    wstream.end();
    wtream - null;
   //}
    return doGeneration();
}

setTimeout( doGeneration );

@d3x0r
Copy link

d3x0r commented Apr 2, 2016

I updated my comment to include the sample I'd rather do, without all the bells and whistles of any events and possible reverse-reference linking...


I see, and the next logical step is after implementing on( 'drain', ()=>{} ) for output write failures, if the input is readline file stream, the lines don't stop immediately... but it could be possible to end() the file before the drain fires. in that case the drain never fires, and never unlocks the thing... so you have to keep the closed

This version shouldn't run out of memory.... and yet it does. Hooray for garbage collection (NOT!)

"use strict";

var fs = require( 'fs' );
var fileNum = 1;
var basePath = "./";
var baseName = "output_file";
var maxlimit = 100000000;
var limit = 1000000;
var subCount = 0;
var cardGen = 0;
var single_buffer = new Buffer( 36 );
var blocked = false; // external things use this to see if they can continue...
var wstream = null;;
var wstreamClosed = null;;

function doGeneration() {
    if( cardGen >= maxlimit )   return;
    if( !wstream ) {
      wstream = fs.createWriteStream(  basePath + baseName + `.${fileNum}.dat` );
      wstream.on( 'drain', () =>{ 
         if( wstreamClosed ) {
           wstreamClosed.end();
           wstreamClosed = null;
        }
        blocked = false;
        setTimeout( doGeneration ); // continue generation...
      } );
      fileNum++;
    }
    for(  ; !blocked && subCount < limit; subCount++ )  
        if( !wstream.write( single_buffer ) )
       {
            blocked = true;
            return;
       }
    if( !blocked )
    {
        cardGen += subCount;
        subCount = 0;
        if( blocked )  wstreamClosed = wstream;
        else       wstream.end();
        wstream = null;
        setTimeout( doGeneration )
        //return doGeneration();
    }
}

setTimeout( doGeneration );

@d3x0r
Copy link

d3x0r commented Apr 5, 2016

@ashelley I updated the sample to include drain event listening and protection to avoid end()ing the file... (see prior post)

So even doing it the 'proper' way doesn't fix leaks.
by 45000000 it's at 700MB.

@ashelley
Copy link

ashelley commented Apr 5, 2016

i'm not sure how to limit 'new node''s memory usage but this code works fine and completes successfully in a small memory footprint. I ran your above code sans es6 syntax in node 0.12.7 and it works fine: node --max_old_space_size=20 test.js

I doubt that --max_old_space_size works on new versions of node as it probably allocates memory differently and requires a different command line switch to "properly limit it".

The key is that once you are listening to drain you don't go into "mystery wait times" and everything performs as expected.

I'd look for answers regarding on how to limit node's memory usage from command line options and you will see that with your above code you will get your expected results.

@d3x0r
Copy link

d3x0r commented Apr 5, 2016

(The sample code as is, as of this time
it doesn't apparently limit the memory...
It does allow increase above 1.5G for 64bit node... 5120 for instance works....

the code in a nightly build from a few weeks ago does generate large files...
but it might be disk performance vs cpu performance... if you have a slower cpu maybe you won't get disk bottle necked... was worried it was something timing sensitive since the GC works in a schedule of time maybe it's starved? I didn't addditionally apply manual gc to it to see.

When I first was diagnosing this, I used electron which uses node, and used its debugging tools...but when I paused the code so I could do a heap snapshot the heap cleaned itself and went back to 340M (typical for tables kept in memory) ... from 1G.. 700M just because of writing to a file is obscene.

core i7-6700 3.4ghz
ST1000D1M003-1ER162 (probably http://www.newegg.com/Product/Product.aspx?Item=N82E16822148840 )
node V5.8.0


And it's not even like this is a threading problem. I'm definitely providing a time on the stack when there is 0 program events. Collision with doing end() while it is in the state of needing to emit drain()?

I did try with once, but the input was a readline thing, and .pause() does... after 20 some messages. And setting up a .once handler blinding not checking to see if I had one set already caused a over-scheduling warning from addEventListener.

It just gets more complex when the linereader feeds a cluster that has work queues that come out and feed the output to a file... it's like 'oh by the way, when the cluster gives you an event, don't do anything with it because I have to wait for a drain event'


This is why people say 'use a real streaming library like XXXXX' that has 20+ other dependancies itself.... I could rework this as transform filters maybe that wouldn't leak as much?

@ashelley
Copy link

ashelley commented Apr 6, 2016

internally there is an array that gets sliced 1 time per write. so if you don't wait for drain the internal buffer grows huge... to reproduce the issue create a large array and remove each element by slicing it from the bottom. it's slow. by listenning to the drain event you cause the internal buffer to not grow huge so you don't end up eventually waiting for an array to get sliced inefficiently once per write. it's definitely a bug in node not being able to flush writes out in batches and inefficiently handling the internal buffer.

@d3x0r
Copy link

d3x0r commented Apr 6, 2016

Okay... so let's suppose wstream is an array that gets things added, and slices a bit off every time and keeps adding things... the drain event supposedly when wstream.length == 0

Edit3: I lied it's losing about 4k per second after running a half hour or so... and it's up to 42M...

In ideal conditions this appears to go to 39M and stay there. It started at 22M or so...
I don't know if I would have to tune the constants to make better pressure on that sort of thing....

I suppose I could copy the bytes flat into the buffer... that would be more like write.

"use strict";

var fileNum = 1;
var maxlimit = 100000000000000;
var limit = 1000000;
var subCount = 0;
var cardGen = 0;
var single_buffer = new Buffer( 36 );
var wstream = null;;

function doGeneration() {
    if( cardGen >= maxlimit )   return;
    if( !wstream ) {
      wstream = [];

   }
    for(  ; subCount < limit; subCount++ )  
    {
        wstream.push( single_buffer/*.clone() ? */ );
        if( ( subCount % 1000 ) == 0 ) {
            wstream = wstream.slice( -Math.floor( wstream.length *0.10 ) );
            //console.log( `slice is... ${wstream.length}` ); // goes from 100 to 111 and stays
        }
    }

    subCount = 0;
    cardGen += limit;    
    //console.log( `stream is ${wstream.length}` );  /// finally flush everything eventually and relase it?
    // actually this starts another buffer.. probably the 111 is kept around forever because the new file
    // will have priority over the old file even though the old file does eventually drain and close? (does it?)
    wstream = null;
    setTimeout( doGeneration )
    //return doGeneration();
}

setTimeout( doGeneration );

@ashelley
Copy link

ashelley commented Apr 6, 2016

the code has changed since i last looked but you are better off trying to debug the actual node code i'd say instead of trying to mock it:

https://github.com/nodejs/node/blob/v4.4.2-proposal/lib/_stream_writable.js#L398-L412

EDIT: i think it was stated above that it is because this cb never called async but i could be wrong.

https://github.com/nodejs/node/blob/v4.4.2-proposal/lib/_stream_writable.js#L355

EDIT2: I actually added global.gc() back to your code and it started garbage collecting keeping the memory at around 300-400MB (err maybe not)

EDIT3: after playing around with it a bit... node 0.12 doesn't go crazy with memory usage with this code but writes a lot slower (more cpu i think too). 4.4.2 and 5.10.1 write a lot faster but eat memory so i believe the issue here is totally different depending on which version of node you are using.

@d3x0r
Copy link

d3x0r commented May 12, 2016

Still broken in 6.1.0 bump (or is bumping bannable?)

  1. that sort of project is done now and I have no need to fix it myself? (I am inclined to be a nuisance every few months as new versions come out and the test still fails)
  2. others use external libraries for streaming that do better jobs?
  3. is it a core v8 issue? Is this a dup of their bug? Because I played with doing some file system objects and I had to inject a 1.4 second wait before global.gc would gc. It's as simple as having a destructor defined, and waiting patently. ( a file system may have multiple files in itself too which may further complicate the waiting)... and I can see starving it like this is a sin; (fixable with appropriate options and features; but couldn't the native parts somehow kick the garbage collector for me?)

@bnoordhuis
Copy link
Member

@d3x0r This repository is for archival purposes only. If you believe it's still relevant, please file a new issue here but please search existing issues first.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants