HTTP OutgoingMessage _send in two packets when a Buffer is written #7914

Closed
andrewjjenkins opened this issue Jul 28, 2016 · 4 comments
Labels: http, performance, stalled

Comments

@andrewjjenkins

  • Version: node -v: v6.2.2
  • Platform: OS X 10.10.5 and Ubuntu Linux 14.04 LTS
  • Subsystem: HTTP

When sending an HTTP outgoing message (an http.ServerResponse in this case), the behavior of _send, _flushOutput and _writeRaw causes short HTTP responses that should fit in one packet to often be written in two packets. I believe this is caused by inefficient cork/uncork handling.

Here is an example program:

var http = require('http');

var endStr = 'This is the body';
var endBuf = new Buffer(endStr);
var i = 0;

var server = http.createServer(function (req, res) {
  debugger;
  if (i % 2 === 0) {
    console.log('Sending the end as a string');
    res.setHeader('Content-Length', endStr.length);
    res.write(endStr);
  } else {
    console.log('Sending the end as a buffer');
    res.setHeader('Content-Length', endBuf.length);
    res.write(endBuf);
  }
  res.end();
  i++;
}).listen(0, '127.0.0.1', function () {
  console.log('listening on ', server.address());
});

I know I could use res.end(endBuf) / res.end(endStr) in this case, but this is just a reduced example. I actually found this in a program that pipes from an http.ClientResponse to an http.ServerResponse, which is much more complicated to work with. In that application we also forward the Content-Length header from the origin server (that's why I set Content-Length here, too). This is the simplest example I could come up with that demonstrates the behavior.
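
For reference, here is a rough sketch (not the actual program; the upstream host and port are placeholders) of the kind of proxying setup where I originally saw this:

var http = require('http');

// 'upstream.example'/8080 are placeholders for the origin server.
var proxy = http.createServer(function (req, res) {
  var proxyReq = http.request({
    host: 'upstream.example',
    port: 8080,
    path: req.url,
    method: req.method,
    headers: req.headers
  }, function (upstreamRes) {
    // Forward status and headers (including Content-Length) from the origin.
    res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
    // pipe() writes Buffers, so this exercises the Buffer path in _send.
    upstreamRes.pipe(res);
  });
  req.pipe(proxyReq);
}).listen(8080);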

Actual results:

  • The first request (write(endStr)) gets the response with the HTTP headers and body in one packet.
  • The second request (write(endBuf)) gets the response with the HTTP headers in one packet and the body in the next.
$ node uncorktest.js &
listening on  { address: '127.0.0.1', family: 'IPv4', port: 53440 }
$ sudo tcpdump -i lo0 -lnX tcp port 53440


(another terminal)$ curl http://localhost:53440/
node says: Sending the end as a string
(tcpdump excerpt, note one packet):
16:04:26.224813 IP 127.0.0.1.53440 > 127.0.0.1.53442: Flags [P.], seq 1:117, ack 80, win 12756, options [nop,nop,TS val 4062
97891 ecr 406297885], length 116
        0x0000:  4500 00a8 2b8c 4000 4006 0000 7f00 0001  E...+.@.@.......
        0x0010:  7f00 0001 d0c0 d0c2 5167 4a6e ab94 4704  ........QgJn..G.
        0x0020:  8018 31d4 fe9c 0000 0101 080a 1837 9d23  ..1..........7.#
        0x0030:  1837 9d1d 4854 5450 2f31 2e31 2032 3030  .7..HTTP/1.1.200
        0x0040:  204f 4b0d 0a43 6f6e 7465 6e74 2d4c 656e  .OK..Content-Len
        0x0050:  6774 683a 2031 360d 0a44 6174 653a 2054  gth:.16..Date:.T
        0x0060:  6875 2c20 3238 204a 756c 2032 3031 3620  hu,.28.Jul.2016.
        0x0070:  3232 3a30 343a 3236 2047 4d54 0d0a 436f  22:04:26.GMT..Co
        0x0080:  6e6e 6563 7469 6f6e 3a20 6b65 6570 2d61  nnection:.keep-a
        0x0090:  6c69 7665 0d0a 0d0a 5468 6973 2069 7320  live....This.is.
        0x00a0:  7468 6520 626f 6479                      the.body

(another terminal)$ curl http://localhost:53440/
node says: Sending the end as a buffer
(tcpdump excerpt, note two packets):
16:10:09.708172 IP 127.0.0.1.53440 > 127.0.0.1.53498: Flags [P.], seq 1:101, ack 80, win 12756, options [nop,nop,TS val 406639933 ecr 406639932], length 100
        0x0000:  4500 0098 65d8 4000 4006 0000 7f00 0001  E...e.@.@.......
        0x0010:  7f00 0001 d0c0 d0fa 0f3e d5c2 0f29 e2ae  .........>...)..
        0x0020:  8018 31d4 fe8c 0000 0101 080a 183c d53d  ..1..........<.=
        0x0030:  183c d53c 4854 5450 2f31 2e31 2032 3030  .<.<HTTP/1.1.200
        0x0040:  204f 4b0d 0a43 6f6e 7465 6e74 2d4c 656e  .OK..Content-Len
        0x0050:  6774 683a 2031 360d 0a44 6174 653a 2054  gth:.16..Date:.T
        0x0060:  6875 2c20 3238 204a 756c 2032 3031 3620  hu,.28.Jul.2016.
        0x0070:  3232 3a31 303a 3039 2047 4d54 0d0a 436f  22:10:09.GMT..Co
        0x0080:  6e6e 6563 7469 6f6e 3a20 6b65 6570 2d61  nnection:.keep-a
        0x0090:  6c69 7665 0d0a 0d0a                      live....
...
16:10:09.708261 IP 127.0.0.1.53440 > 127.0.0.1.53498: Flags [P.], seq 101:117, ack 80, win 12756, options [nop,nop,TS val 40
6639933 ecr 406639933], length 16
        0x0000:  4500 0044 5e99 4000 4006 0000 7f00 0001  E..D^.@.@.......
        0x0010:  7f00 0001 d0c0 d0fa 0f3e d626 0f29 e2ae  .........>.&.)..
        0x0020:  8018 31d4 fe38 0000 0101 080a 183c d53d  ..1..8.......<.=
        0x0030:  183c d53d 5468 6973 2069 7320 7468 6520  .<.=This.is.the.
        0x0040:  626f 6479                                body

Expected results:

  • The first request (write(endStr)) gets the response with the HTTP headers and body in one packet. (This is correct already)
  • The second request (write(endBuf)) should also get the response with the HTTP headers and body in one packet. (This is not what I am observing currently)

My analysis:
I think the use of cork() in _http_outgoing.js is not quite correct in this case. In the Buffer path, it looks like we cork, write the headers, uncork, and only then write the body.

If OutgoingMessage.prototype._send is called with a string, it concatenates the headers and the body into a single string and then calls this._writeRaw() with that combined string. This path works (though it pays the price of (a) not using Node Buffers and (b) forcing a string concatenation).

If instead it is called with a Buffer, it executes the else {} case, which cleverly unshifts the serialized headers into the output buffer and then calls _writeRaw with the body portion. So far so good.

OutgoingMessage.prototype._send = function(data, encoding, callback) {
  // This is a shameful hack to get the headers and first body chunk onto
  // the same packet. Future versions of Node are going to take care of
  // this at a lower level and in a more general way.
  if (!this._headerSent) {
    if (typeof data === 'string' &&
        encoding !== 'hex' &&
        encoding !== 'base64') {
      data = this._header + data;
    } else {
      this.output.unshift(this._header);
      this.outputEncodings.unshift('latin1');
      this.outputCallbacks.unshift(null);
      this.outputSize += this._header.length;
      if (typeof this._onPendingData === 'function')
        this._onPendingData(this._header.length);
    }
    this._headerSent = true;
  }
  return this._writeRaw(data, encoding, callback);
};

OK, next _writeRaw actually pushes things out in two stages: it calls _flushOutput and then connection.write:

OutgoingMessage.prototype._writeRaw = function(data, encoding, callback) {
  if (typeof encoding === 'function') {
    callback = encoding;
    encoding = null;
  }

  var connection = this.connection;
  if (connection &&
      connection._httpMessage === this &&
      connection.writable &&
      !connection.destroyed) {
    // There might be pending data in the this.output buffer.
    var outputLength = this.output.length;
    if (outputLength > 0) {
      this._flushOutput(connection);   // <----!!! headers
    } else if (data.length === 0) {
      if (typeof callback === 'function')
        process.nextTick(callback);
      return true;
    }

    // Directly write to socket.
    return connection.write(data, encoding, callback);   // <----!!! body portion
  } else if (connection && connection.destroyed) {
    // The socket was destroyed.  If we're still trying to write to it,
    // then we haven't gotten the 'close' event yet.
    return false;
  } else {
    // buffer, as long as we're not destroyed.
    return this._buffer(data, encoding, callback);
  }
};

_flushOutput does wrap its writes in cork()/uncork(), but unfortunately it calls uncork() before the connection.write() of the body happens. On my system this usually splits the writes into two packets.

OutgoingMessage.prototype._flushOutput = function _flushOutput(socket) {
  //...
  socket.cork();
  for (var i = 0; i < outputLength; i++) {
    ret = socket.write(output[i], outputEncodings[i],
                       outputCallbacks[i]);
  }
  socket.uncork();
  //...
}
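
To illustrate why the early uncork() matters, here is a standalone sketch (not taken from the http code) showing that cork() buffers writes in userland and uncork() flushes everything buffered so far, so anything written after the uncork() goes out in a later flush, and often a separate packet:

var net = require('net');

var demo = net.createServer(function (socket) {
  socket.cork();
  socket.write('HEADERS\r\n\r\n');   // buffered while corked
  socket.uncork();                   // flush #1: headers only
  socket.write(new Buffer('BODY'));  // flush #2: body goes out on its own
  socket.end();
}).listen(0);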

I hacked this shim into my program; it does an extremely hacky and ugly job of calling uncork() only after the body has also been written, and it fixes the problem for me. See the "!!!" comments for the changes I made.

http.OutgoingMessage.prototype._writeRaw = function (data, encoding, callback) {
  if (typeof encoding === 'function') {
    callback = encoding;
    encoding = null;
  }

  var connection = this.connection;
  if (connection &&
      connection._httpMessage === this &&
      connection.writable &&
      !connection.destroyed) {
    // There might be pending data in the this.output buffer.
    var outputLength = this.output.length;
    if (outputLength > 0) {
      connection.cork();  //!!! Add a cork before flushOutput
      this._flushOutput(connection);
    } else if (data.length === 0) {
      if (typeof callback === 'function')
        process.nextTick(callback);
      return true;
    }

    // Directly write to socket.
    var rc = connection.write(data, encoding, callback);
    connection.uncork();  // Only uncork after writing the body portion
    return rc;
  } else if (connection && connection.destroyed) {
    // The socket was destroyed.  If we're still trying to write to it,
    // then we haven't gotten the 'close' event yet.
    return false;
  } else {
    // buffer, as long as we're not destroyed.
    return this._buffer(data, encoding, callback);
  }
};

http.OutgoingMessage.prototype._flushOutput = function _flushOutput(socket) {
  var ret;
  var outputLength = this.output.length;
  if (outputLength <= 0)
    return ret;

  var output = this.output;
  var outputEncodings = this.outputEncodings;
  var outputCallbacks = this.outputCallbacks;
  //socket.cork();  //!!! Let the parent handle cork/uncork
  for (var i = 0; i < outputLength; i++) {
    ret = socket.write(output[i], outputEncodings[i],
                       outputCallbacks[i]);
  }
  //socket.uncork();  //!!! Let the parent handle cork/uncork

  this.output = [];
  this.outputEncodings = [];
  this.outputCallbacks = [];
  if (typeof this._onPendingData === 'function')
    this._onPendingData(-this.outputSize);
  this.outputSize = 0;

  return ret;
};
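
A small client like the following can serve as a rough sanity check for the shim (a sketch only: the port is whatever the server printed on startup, and on loopback one 'data' event per packet is not guaranteed, so tcpdump remains the authoritative check):

var net = require('net');

var client = net.connect(53440, '127.0.0.1', function () {  // use the port your server printed
  client.write('GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n');
});
var chunks = 0;
client.on('data', function () { chunks++; });
client.on('end', function () {
  console.log('response arrived in ' + chunks + ' chunk(s)');
});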
@mscdex mscdex added the http Issues or PRs related to the http subsystem. label Jul 28, 2016
@brendanashworth brendanashworth self-assigned this Aug 2, 2016
@brendanashworth
Contributor

Hi @andrewjjenkins, thanks for opening up an issue. I've submitted a pull request to address this: #7946.

@andrewjjenkins
Author

Thanks @brendanashworth. I haven't had the time to give this much attention beyond writing the original bug report. It looks like it triggered some deeper improvements for Node; big thanks for taking on the work to fix this the right way.

@ChALkeR ChALkeR added the performance Issues and PRs related to the performance of Node.js. label Aug 11, 2016
jbergstroem pushed a commit to jbergstroem/node that referenced this issue Oct 7, 2016
This commit opts for a simpler way to batch writes to HTTP clients into
fewer packets. Instead of the complicated snafu which was before, now
OutgoingMessage#write automatically corks the socket and uncorks on the
next tick, allowing streams to batch them efficiently. It also makes the
code cleaner and removes an ugly-ish hack.

Fixes: nodejs#7914
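
(A sketch of the approach that commit describes, for illustration only and not the commit's actual code: cork on each write and defer the uncork to the next tick, so all writes issued synchronously while handling a request are flushed together.)

// Illustration only: cork()/uncork() calls nest, so each write adds a cork and
// schedules a matching uncork for the next tick; everything written
// synchronously in between is flushed in one batch.
function corkedWrite(socket, data, encoding, callback) {
  socket.cork();
  var ret = socket.write(data, encoding, callback);
  process.nextTick(function () { socket.uncork(); });
  return ret;
}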
@Trott Trott added the stalled Issues and PRs that are stalled. label Jul 9, 2017
@nrjpoddar

Is there a resolution for this bug? PR #7946 seems to be in a stalled state.

@andrewjjenkins
Author

It looks like this is fixed. @janakimeyyappan did some testing with node 8.2.1 and no longer sees the issue. We suspect 2e9d9c1 fixed it, and if that wasn't enough, #13013 finished it off (and has good test results). #13522 may also be related; it looks like the evolution of #7946, which stalled for a little while.

Thanks @brendanashworth, @jbergstroem and @mscdex .

Since I filed it and it looks to me like it's fixed, I'll mark it closed, but if there's some reason it should remain open that's fine with me too.
