Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tls: improve TLSSocket & Server performance #15575

Closed
wants to merge 4 commits into from

Conversation

apapirovski
Copy link
Member

@apapirovski apapirovski commented Sep 23, 2017

While working on http2 I spent some time digging around tls and I noticed that there were a whole lot of unnecessary closures and function calls throughout. I decided to go through and try to eliminate as many of them as possible. Turns out some of these didn't even need much changed (such as all the _handle events because the _handle already knows which TLSSocket it belongs to).

I tried to limit the churn as much as possible but there definitely is some of it. I've avoided making unnecessary variable adjustments from var to let, const except where the line in question (or related block) was already changing. I also limited all my changes to performance and kept all the logic the same. While the existing tests cover a very solid amount of this module, I didn't want to potentially introduce bugs in uncovered edge cases.

It's a bit hard to get a good measure of performance for these changes as so much of this performance is being limited elsewhere but I created a basic benchmark for TLSSocket creation which clocked in at (and do note that this is limited by the C++ and other code, there's a note re: JS specific numbers further down)

tls/socket.js n=100000     13.48 %        *** 3.554366e-08

The tls-connect benchmark was not much help because it seemed limited by my system's ability to allocate new sockets rather than the actual performance of the code. I couldn't even run a reliable benchmark comparing master to master itself (with 200 runs, I got results anywhere from -10 to 10% with almost no certainty), let alone my code to master.

I also used vegeta to hit a basic TLS server and ran a benchmark for what the max handled requests per second would be at full saturation. With these changes I got a reliable 4% increase across a total of million requests or so.

Also, running the node profiler on the new code indicates that the JS code throughout TLSSocket takes about 50% less time and the JS code within tlsConnectionListener takes about 35% less time (this is across 40 runs of tls-connect.js and 10m session with vegeta). It's a bit hard to run an actual benchmark on it though because so much of it is tied to other code.

I realize this is quite a lot to review so thanks for any and all feedback! Let me know if there's anything I can do to make this more digestible.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

tls

Limit the number of closures created and functions called
throughout the lifecycle of TLSSocket, Server & connect.
@nodejs-github-bot nodejs-github-bot added the tls Issues and PRs related to the tls subsystem. label Sep 23, 2017
@mscdex
Copy link
Contributor

mscdex commented Sep 23, 2017

CI: https://ci.nodejs.org/job/node-test-pull-request/10227/

@apapirovski
Copy link
Member Author

No clue what's going on with the linter[1] but the centos5 failures are related to #15505 and Windows seems to be an inspector/test-bindings fail. Everything else is green.

Thanks for running the CI @mscdex!

[1] Does anyone know? Every CI I've seen lately, it fails.

@refack
Copy link
Contributor

refack commented Sep 23, 2017

Does anyone know? Every CI I've seen lately, it fails.

There's an issue with the error codes in

node/lib/internal/errors.js

Lines 119 to 121 in 150b8f7

E('ERR_ARG_NOT_ITERABLE', '%s must be iterable');
E('ERR_ASSERTION', '%s');
E('ERR_ASYNC_CALLBACK', (name) => `${name} must be a function`);

going out of ASCIIbetical order.
I'll open a PR for that.

Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small things and the CI must be happy but otherwise this is LGTM.

lib/_tls_wrap.js Outdated
cb(null);
owner.server &&
!owner.server.emit('resumeSession', hello.sessionId, onSession)) {
loadSessionDone(null, owner);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All entries of loadSessionDone have a a single line that they would call in the function. This can always be inlined directly. There is no need for calling that function.

lib/_tls_wrap.js Outdated
if (once)
return cb(new errors.Error('ERR_MULTIPLE_CALLBACK'));
return requestOCSP(new errors.Error('ERR_MULTIPLE_CALLBACK'), owner);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to call the function if you know that self.destroy(err) is going to be called. This can be inlined and the function can be inlined for all error cases and that case can be removed from the function.

lib/_tls_wrap.js Outdated
});
}


function requestOCSP(self, hello, ctx, cb) {
function requestOCSP(err, self, hello) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calling functions calls hello info. This is somewhat inconsistent.

I see why you changed owner to self but I am also not very happy about that naming either. Right now I am not certain what would be best.

lib/_tls_wrap.js Outdated
if (!self._handle)
return cb(new errors.Error('ERR_SOCKET_CLOSED'));
if (self._handle === null)
return requestOCSPDone(new errors.Error('ERR_SOCKET_CLOSED'), self);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above again. Inlining would be good.

lib/_tls_wrap.js Outdated
function oncertcb(info) {
var self = this;
var servername = info.servername;
if (self._handle === null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not mistaken, it is not possible to reach this. self._handle should never be null.

Copy link
Member Author

@apapirovski apapirovski Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the user has control over that callback, it could indeed be called after _handle has been nulled. Ignore me, thought this referred to the one above it.

lib/_tls_wrap.js Outdated
return cb(null);
return requestOCSPDone(null, self);

let ctx = self._handle.sni_context;

if (!ctx)
ctx = self.server._sharedCreds;

// TLS socket is using a `net.Server` instead of a tls.TLSServer.
// Some TLS properties like `server._sharedCreds` will not be present
if (!ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified by moving this in the upper if (!ctx) check. This would safe a check in case ctx exists from the beginning on.

owner.destroy(err);
} else if (owner._tlsOptions.isServer &&
owner._rejectUnauthorized &&
/peer did not return a certificate/.test(err.message)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know if the error message is static / never changed before? In this case it would be faster to compare the whole message as string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure and we don't have a test that covers this. I don't think we should change in this PR. I tried to avoid anything that could actually break things.

lib/_tls_wrap.js Outdated
},
set: function set(value) {
res.reading = value;
set: (value) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When changing this, you can directly switch to get() { and set(value) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be an arrow function. (I'm trying to avoid referencing res so the unsetting can happen in a separate function and there's no leak.)

lib/_tls_wrap.js Outdated
var ctx;

this.server._contexts.some(function(elem) {
for (var i = 0; i < contexts.length; i++) {
const elem = contexts[i];
if (elem[0].test(servername)) {
ctx = elem[1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the ctx variable at all. You could call the callback here and return and call the callback below with undefined (if that is possible - otherwise that is obsolete).

@BridgeAR
Copy link
Member

Nice work!

@apapirovski apapirovski force-pushed the patch-tls-perf branch 2 times, most recently from 5e3ed18 to b70f34b Compare September 25, 2017 12:31
@apapirovski
Copy link
Member Author

Thanks for the detailed feedback! Made all the requested changes.

(Sorry for the commit spam, went back and forth on the error message thing above but decided to leave because we don't have a test.)

Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please check the coverage once again though to be sure there is no uncovered code.

lib/_tls_wrap.js Outdated

// TLS socket is using a `net.Server` instead of a tls.TLSServer.
// Some TLS properties like `server._sharedCreds` will not be present
if (!ctx)
return cb(null);
return requestOCSPDone(socket);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could still be combined with the upper if (!ctx) statement. It might also be interesting to see what socket._handle.sni_context and socket.server._sharedCreds returns. If we know it is either a object or null / undefined, we could make the check explicit (e.g. if (ctx === null)).

if (!ctx) {
  ctx = socket.server._sharedCreds;
  if (!ctx)
    return requestOCSPDone(socket);
}

Writing more explicit JS does make a nice difference with Turbofan.

Copy link
Member Author

@apapirovski apapirovski Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think we're guaranteed them to be anything in particular because user modules do have access and TLS has previously been broken by making these assumptions elsewhere. (Can't find the PR & issue in question right now but can link later.)

I will update to make it a nested if statement though. Thanks!

@jasnell jasnell requested a review from mcollina September 25, 2017 15:12
Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing stands out as being problematic. Would like @mcollina to take a look

@jasnell
Copy link
Member

jasnell commented Sep 25, 2017

CI: https://ci.nodejs.org/job/node-test-pull-request/10259/

@jasnell
Copy link
Member

jasnell commented Sep 26, 2017

Ping @mcollina

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not very comfortable with this change landing in 8 right now as we are getting closer to LTS. It might be fine, but I would prefer it to get some proper testing under the 9 line before we backport.

This would need some CITGM runs to check if there are any known regressions.

I have added some questions and feedbacks, once those are satisfied you can dismiss this review (I am traveling until the 10th of October).

if (owner._newSessionPending) {
owner._securePending = true;
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this added here? Or was it moved from somewhere else?

Copy link
Member Author

@apapirovski apapirovski Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's moved from elsewhere. _finalInit has two paths: onhandshakedone (server) or being called directly (client). But _newSessionPending can only happen on the server because it's called from onnewsession which only runs on the server.

https://github.com/apapirovski/node/blob/47923ca39cc4bd3f841de2152e393ff7ffade63c/lib/_tls_wrap.js#L224-L230

}


function loadSession(self, hello, cb) {
function loadSession(hello) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A callback here is removed. It's very hard to track these changes, so I might be losing something, but it seems that the callback called self._handle.setOCSPResponse(response) which lost doing the refactoring.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cb for loadSession is self._handle.endParser(), which has instead been inlined.

lib/_tls_wrap.js Outdated
get: function get() {
return res.reading;
get: () => {
return this[kRes].reading;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a need to use this[kRes] here, as we are closing upon res anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added because I think otherwise it's a cyclic reference, no? Also see this commit apapirovski@f7620fb

res._parent = handle;
res._parentWrap = wrap;
res._secureContext = context;
res.reading = handle.reading;
this[kRes] = res;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this is a good idea, is it possible to remove it without allocating any more closures?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I can tell. Doing it this way is more performant based on my testing but can revert.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please benchmark with and without? it does not seem to be needed, but I would be happy to be wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what I mean is: I can't remove the closure without using this[kRes]. If we're ok with having a closure then this is fine. The performance difference is minor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which closure?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cyclic reference still exists, as the closures below are closing over res. The actual solution is to move the whole Object.defineProperty to an outside function that is not passed resto. Then this[kRes] is correct.

Copy link
Member Author

@apapirovski apapirovski Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be going crazy here, sorry for dragging this on but...

  this[kRes] = res;
  Object.defineProperty(handle, 'reading', {
    get: () => {
      return this[kRes].reading;
    },
    set: (value) => {
      this[kRes].reading = value;
    }
  });


  this.on('close', onSocketCloseDestroySSL);

Then we unset this[kRes] and res is no longer referenced by anything. I could be completely nuts though, been a long day. That said, what you're saying makes sense so it could be:

  const res = tls_wrap.wrap(handle._externalStream,
                            context.context,
                            !!options.isServer);
  res._parent = handle;
  res._parentWrap = wrap;
  res._secureContext = context;
  res.reading = handle.reading;
  this[kRes] = res;
  defineHandleReading(this, handle);

  this.on('close', onSocketCloseDestroySSL);

  return res;
};

function defineHandleReading(socket, handle) {
  Object.defineProperty(handle, 'reading', {
    get: () => {
      return socket[kRes].reading;
    },
    set: (value) => {
      socket[kRes].reading = value;
    }
  });
}

function onSocketCloseDestroySSL() {
  // Make sure we are not doing it on OpenSSL's stack
  setImmediate(destroySSL, this);
  this[kRes] = null;
}

Not sure how much that actually improves things though since extra function call and all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would have to wait for the handle to be collected to be able to free res, while in this way it could be freed earlier. It was the spirit of https://github.com/apapirovski/node/blame/7a953929fef5a7892d4697ed73e6fd314055beee/lib/_tls_wrap.js#L404-L408, which makes the getter and setter refer to nothing, and it clear res from those closures as well.

The function call can likely be inlined by V8, so there should not be drop in performance but verify.

Copy link
Member

@BridgeAR BridgeAR Sep 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the most controversial part here. As the diff is definitely not easy to read here I would suggest to remove this optimization for now and you could open another PR for that or at least have a individual commit for it. Both would make the review a lot easier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it based on @mcollina's feedback. I think it should be good now?

if (this._newSessionPending) {
this._securePending = true;
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure about removing this check? This function might be called outside of this module context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See one of the comments above regarding this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not answer my question. This function might be called from outside this module.

Copy link
Member Author

@apapirovski apapirovski Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I mean, I suppose someone could reimplement the whole logic around _newSessionPending in user-land. But right now that _newSessionPending is set within onnewsession which only runs on the server and so it was moved into that branch instead. The reason this was added was to support async sessions, as far as I can tell and this is the code in question:

function onnewsession(key, session) {
   [removed]

    owner._newSessionPending = false;
    if (owner._securePending)
      owner._finishInit();
    owner._securePending = false;
  };


  owner._newSessionPending = true;
  if (!owner.server.emit('newSession', key, session, done))
    done();
}

Here's the commit in question: apapirovski@75ea11f

const kErrorEmitted = Symbol('error-emitted');
const kHandshakeTimeout = Symbol('handshake-timeout');
const kRes = Symbol('res');
const kSNICallback = Symbol('snicallback');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to define the new symbols in the constructor, assigning them to null is ok, so you would not alter the shape of the object for every request.

@mcollina
Copy link
Member

As a side note, these types of refactoring are very hard to review. It would have been better if it was split in different commits to track down from where stuff was moved.

@apapirovski
Copy link
Member Author

@mcollina If you would prefer me to open a separate PR for different parts of this, I can probably do that. It's a bit tricky because not all of it is an identifiable unit but can figure something out.

@mcollina mcollina requested a review from a team September 29, 2017 00:22
@mcollina
Copy link
Member

I might be overcautious, but it would be better if other @nodejs/tsc members review this.

lib/_tls_wrap.js Outdated

this.on('close', onSocketCloseDestroySSL);

return res;
};

function defineHandleReading(socket, handle) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding a comment that links to f7620fb, explaining why it is being done in this way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, I think the commit explains it well (and it shows all the relevant parts) but let me know if you prefer a full outline of it within the code.

@mscdex mscdex added the performance Issues and PRs related to the performance of Node.js. label Sep 29, 2017
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@apapirovski
Copy link
Member Author

Thanks for the review @mcollina — sorry about the size of change. 😓

Copy link
Member

@indutny indutny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, if it works

@apapirovski
Copy link
Member Author

CI run seemed clean twice, maybe we could kick off a CITGM?

@BridgeAR
Copy link
Member

BridgeAR commented Oct 2, 2017

Landed in 5723b5d

@BridgeAR BridgeAR closed this Oct 2, 2017
BridgeAR pushed a commit that referenced this pull request Oct 2, 2017
Limit the number of closures created and functions called
throughout the lifecycle of TLSSocket, Server & connect.

PR-URL: #15575
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com>
@MylesBorins
Copy link
Contributor

This is not landing cleanly on 8.x. Could you please backport

@apapirovski
Copy link
Member Author

I think @mcollina had objections to landing this on 8.x. Still the case?

addaleax pushed a commit to addaleax/ayo that referenced this pull request Oct 4, 2017
Limit the number of closures created and functions called
throughout the lifecycle of TLSSocket, Server & connect.

PR-URL: nodejs/node#15575
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com>
@MylesBorins
Copy link
Contributor

ping @mcollina

@mcollina
Copy link
Member

mcollina commented Oct 7, 2017

@MylesBorins @apapirovski I would prefer this to land on 9, brew there for a bit, and then be backported. There is a non-null chance that this introduces regressions, and I am concerned on landing this right before 8 goes LTS.

@MylesBorins MylesBorins added the baking-for-lts PRs that need to wait before landing in a LTS release. label Oct 7, 2017
@MylesBorins
Copy link
Contributor

@mcollina I've thrown the baking label on here :D

@BethGriggs
Copy link
Member

@mcollina @MylesBorins - is this still something that should be backported to v8.x now it has had time to bake?

@MylesBorins
Copy link
Contributor

I defer to @mcollina and @mhdawson on this. Generally we should not be prioritizing performance changes imho.

@mcollina
Copy link
Member

I'm +1 for backporting this if it lands cleanish.

@BethGriggs BethGriggs removed the baking-for-lts PRs that need to wait before landing in a LTS release. label Jun 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js. tls Issues and PRs related to the tls subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants