Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: tell the parser about CONNECT responses #6198

Closed
wants to merge 3 commits into from

Conversation

slushie
Copy link
Contributor

@slushie slushie commented Apr 14, 2016

Checklist
  • tests and code linting passes
  • a test and/or benchmark is included
  • documentation is changed or added
  • the commit message follows commit guidelines
Affected core subsystem(s)

http

Description of change

This commit fixes a bug in HTTP CONNECT response parsing. The parser
normally continues to look for additional HTTP messages after the
first message has completed. However, in the case of CONNECT responses,
the parser should stop looking for additional messages and treat
any further data as a separate protocol.

Because of the way that HTTP messages are parsed in JavaScript, this
bug only manifests itself in the case where the socket's data
handler receives the end of the response message and also includes
non-HTTP data.

In order to implement the fix, an .upgrade accessor is exposed to
JavaScript on the HTTPParser object that proxies the underlying
http_parser's upgrade field. Likewise in JavaScript, the http
client module sets this value to true when a response is received
to a CONNECT request.

The result of this is that callbacks on HTTPParser instances can
signal that the message indicates a change in protocol and further
HTTP parsing should not occur.

@mscdex mscdex added the http Issues or PRs related to the http subsystem. label Apr 14, 2016
'use strict';
var common = require('../common');
var assert = require('assert');
var http = require('http');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these can be const

@jasnell
Copy link
Member

jasnell commented Apr 15, 2016

@nodejs/http

@jasnell jasnell added the semver-minor PRs that contain new features and should be released in the next minor version. label Apr 15, 2016
Parser* parser = Unwrap<Parser>(args.Holder());

bool upgrade = value->BooleanValue();
parser->parser_.upgrade = upgrade;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't look correct to me. http_parser.upgrade is only supposed to be read, not written, by clients of libhttp_parser.

slushie added 3 commits April 15, 2016 12:40
This commit fixes a bug in HTTP CONNECT response parsing. The parser
normally continues to look for additional HTTP messages after the
first message has completed. However, in the case of CONNECT responses,
the parser should stop looking for additional messages and treat
any further data as a separate protocol.

Because of the way that HTTP messages are parsed in JavaScript, this
bug only manifests itself in the case where the socket's `data'
handler receives the end of the response message *and also* includes
non-HTTP data.

In order to implement the fix, an `.upgrade' accessor is exposed to
JavaScript on the HTTPParser object that proxies the underlying
http_parser's `upgrade' field. Likewise in JavaScript, the `http'
client module sets this value to `true' when a response is received
to a CONNECT request.

The result of this is that callbacks on HTTPParser instances can
signal that the message indicates a change in protocol and further
HTTP parsing should not occur.
@slushie slushie force-pushed the http-client-connect-head branch from daae733 to 08284b3 Compare April 15, 2016 18:53
@slushie
Copy link
Contributor Author

slushie commented Apr 15, 2016

I've updated the tests and the setter value to include changes from the comments.

But the real issue is what @bnoordhuis raises regarding the setting of http_parser.upgrade. It is certainly a valid concern! Before I submitted this PR, I considered a few different things:

  • The parser must stop parsing at the end of a CONNECT response message because the protocol has changed. This is very similar to a request or response with the Connection: upgrade header set. However, this bug allows the parser to continue parsing.
  • With the current usage of the parser, this bug propagates in a way that is difficult to control from within JavaScript alone. Any workarounds in JavaScript are obvious and fragile hacks. See below for details.
  • The http_parser library will stop parsing messages only if the upgrade flag is set. It also must satisfy the condition that either the message has a CONNECT method, or the message body is being skipped (by returning 1 from on_headers_complete). This means that for response messages, the only way to stop parsing after a message is to return 1 from on_headers_complete and to have upgrade be set.

From http_parser.c, line 1839:

    hasBody = parser->flags & F_CHUNKED ||
      (parser->content_length > 0 && parser->content_length != ULLONG_MAX);
    if (parser->upgrade && (parser->method == HTTP_CONNECT ||
                            (parser->flags & F_SKIPBODY) || !hasBody)) {
      /* Exit, the rest of the message is in a different protocol. */
      UPDATE_STATE(NEW_MESSAGE());
      CALLBACK_NOTIFY(message_complete);
      RETURN((p - data) + 1);
    }
  • One possible solution would be to tell the parser which request method was used when it parses the response. However, the http_parser.h header states that the http_parser.method field is both read-only and for requests only. It is therefore unacceptable to set parser.method = HTTP_CONNECT for response messages.
  • Another possible solution could be done from within JavaScript. It would be possible to check if the JS onMessageComplete callback had been fired by testing if (parser.incoming.complete == true) after the parser returns.

In _http_common.js line 130:

    function parserOnMessageComplete() {
      var parser = this;
      var stream = parser.incoming;

      if (stream) {
        stream.complete = true;

parser.execute() is called from _http_client.js line 359:

    var ret = parser.execute(d);
    if (ret instanceof Error) {
      debug('parse error');
      freeParser(parser, req, socket);
      socket.destroy();
      req.emit('error', ret);
      req.socket._hadError = true;
    } else if (parser.incoming && parser.incoming.upgrade) {
      // Upgrade or CONNECT
      var bytesParsed = ret;
      var res = parser.incoming;
      req.res = res;

The workaround would be to test if the message was complete, and if the request method is CONNECT. However, there would be no way to know where the CONNECT response message ended and the next protocol began, and therefore no way to send the correct head data to the request's connect handler.

The exception to this rule would be parser errors on the first byte after the CONNECT response. Indeed this is how I first noticed the bug and my initial solution was simply to check ret.bytesParsed and slice the buffer on that boundary. But upon further testing I found it is possible that the first few bytes of the non-HTTP data could actually be parsed as HTTP, and then bytesParsed would point to a byte somewhere in the middle of the non-HTTP data.

And the problems actually go deeper. If a parser error is not returned, it's possible that the changed protocol (after the CONNECT response) actually gets parsed as another HTTP message. This would cause all sorts of problems for the module's consumers.

  • The http_parser.h file does not directly mark the upgrade flag read-only, but that does seem to be implied.

Line 236:

  /* 1 = Upgrade header was present and the parser has exited because of that.
   * 0 = No upgrade header present.
   * Should be checked when http_parser_execute() returns in addition to
   * error checking.
   */
  unsigned int upgrade : 1;

This flag is being checked from within the parserOnHeadersComplete callback in JavaScript (_http_common.js line 43). It is also modified in this function and stored on the IncomingMessage (line 95), but this is only accessible within JavaScript and doesn't stop parser.execute from continuing to parse messages.

  • The upgrade flag is set immediately before calling on_headers_complete in http_parser.c line 1795, and not modified otherwise:

    /* Set this here so that on_headers_complete() callbacks can see it */
    parser->upgrade =
      ((parser->flags & (F_UPGRADE | F_CONNECTION_UPGRADE)) ==
       (F_UPGRADE | F_CONNECTION_UPGRADE) ||
       parser->method == HTTP_CONNECT);
    

In sum, I believe that this is the most elegant and "correct" solution. It certainly isn't the only solution, and I'd gladly write a patch using other functionality if it is suggested. As a corollary, and in order to alleviate any concerns about http_parser.upgrade being read-only, I will submit a suggestion to the http_parser project that they officially consider the flag to be read-write.

TL;DR Setting .upgrade appears to be the best solution to this problem, despite ambiguity about the flag being read-only.

@indutny
Copy link
Member

indutny commented Apr 16, 2016

FWIW, parser should set upgrade = 1 on CONNECT method: https://github.com/nodejs/http-parser/blob/master/http_parser.c#L1796-L1799 . Will look through the discussion here to see what exactly we are fixing.

@slushie
Copy link
Contributor Author

slushie commented Apr 16, 2016

@indutny The upgrade flag does get set, but only if method is set to HTTP_CONNECT, which would only happen when parsing requests.

Also, sorry for that big ol' wall of text. I think I got carried away ;)

@indutny
Copy link
Member

indutny commented Apr 16, 2016

Oh, looks like I get it now. There is indeed no branch for this in http-parser, will see how to fix it in a bit.

@indutny
Copy link
Member

indutny commented Apr 16, 2016

This is what I have in mind:

diff --git a/deps/http_parser/http_parser.c b/deps/http_parser/http_parser.c
index d51a2e7..7196175 100644
--- a/deps/http_parser/http_parser.c
+++ b/deps/http_parser/http_parser.c
@@ -1812,6 +1812,9 @@ reexecute:
             case 0:
               break;

+            case 2:
+              parser->upgrade = 1;
+
             case 1:
               parser->flags |= F_SKIPBODY;
               break;
diff --git a/lib/_http_client.js b/lib/_http_client.js
index bc29426..68eb125 100644
--- a/lib/_http_client.js
+++ b/lib/_http_client.js
@@ -432,7 +432,7 @@ function parserOnIncomingClient(res, shouldKeepAlive) {
   // Responses to CONNECT request is handled as Upgrade.
   if (req.method === 'CONNECT') {
     res.upgrade = true;
-    return true; // skip body
+    return 2; // skip body, and the rest
   }

   // Responses to HEAD requests are crazy.
diff --git a/lib/_http_common.js b/lib/_http_common.js
index 08f93d8..1e6490e 100644
--- a/lib/_http_common.js
+++ b/lib/_http_common.js
@@ -94,7 +94,7 @@ function parserOnHeadersComplete(versionMajor, versionMinor, headers, method,

   parser.incoming.upgrade = upgrade;

-  var skipBody = false; // response to HEAD or CONNECT
+  var skipBody = 0; // response to HEAD or CONNECT

   if (!upgrade) {
     // For upgraded connections and CONNECT method request, we'll emit this
@@ -103,7 +103,10 @@ function parserOnHeadersComplete(versionMajor, versionMinor, headers, method,
     skipBody = parser.onIncoming(parser.incoming, shouldKeepAlive);
   }

-  return skipBody;
+  if (typeof skipBody !== 'number')
+    return skipBody ? 1 : 0;
+  else
+    return skipBody;
 }

 // XXX This is a mess.
diff --git a/src/node_http_parser.cc b/src/node_http_parser.cc
index 4087ed2..11f44fc 100644
--- a/src/node_http_parser.cc
+++ b/src/node_http_parser.cc
@@ -300,7 +300,7 @@ class Parser : public AsyncWrap {
       return -1;
     }

-    return head_response->IsTrue() ? 1 : 0;
+    return head_response->IntegerValue();
   }

May I ask you to give it a try? It appears to be fixing the problem without touching internal parser's state. Will require an http-parser upgrade, though.

(@bnoordhuis @jasnell PTAL at that http-parser changes, if it looks ok to you - we will continue discussing it at relevant repo. Posted the patch here to show how it will work in a context of node.js)

@slushie
Copy link
Contributor Author

slushie commented Apr 16, 2016

I think this is a better solution, overall. My tests seem to be passing, too. Thanks for this!

Would it be difficult to backport this to node's LTS version, due to the dependency library change?

@indutny
Copy link
Member

indutny commented Apr 16, 2016

@slushie I think it should not be that difficult, this dependency change is of semver-patch level so it should not cause any problems at all.

indutny added a commit to indutny/http-parser that referenced this pull request Apr 16, 2016
Returning `2` from on_headers_complete will tell parser that it
should not expect neither a body nor any futher responses on
this connection. This is useful for handling responses to a
CONNECT request which may not contain `Upgrade` or
`Connection: upgrade` headers.

See: nodejs/node#6198
@indutny
Copy link
Member

indutny commented Apr 16, 2016

Opened an http-parser PR just in case: nodejs/http-parser#299

@jasnell
Copy link
Member

jasnell commented Apr 18, 2016

@indutny ... at first glance this looks good. Will have to dig into the new PR in more detail tho

indutny added a commit to nodejs/http-parser that referenced this pull request Apr 19, 2016
Returning `2` from on_headers_complete will tell parser that it
should not expect neither a body nor any futher responses on
this connection. This is useful for handling responses to a
CONNECT request which may not contain `Upgrade` or
`Connection: upgrade` headers.

See: nodejs/node#6198
PR-URL: #299
Reviewed-By: Brian White <mscdex@mscdex.net>
indutny added a commit to indutny/io.js that referenced this pull request Apr 19, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: nodejs#6198
indutny added a commit to indutny/io.js that referenced this pull request Apr 19, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: nodejs#6198
indutny pushed a commit to indutny/io.js that referenced this pull request Apr 19, 2016
indutny added a commit that referenced this pull request Apr 19, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
@indutny indutny closed this in 9d4d529 Apr 19, 2016
MylesBorins pushed a commit that referenced this pull request Apr 21, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Apr 21, 2016
See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
joelostrowski pushed a commit to joelostrowski/node that referenced this pull request Apr 25, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
joelostrowski pushed a commit to joelostrowski/node that referenced this pull request Apr 25, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
joelostrowski pushed a commit to joelostrowski/node that referenced this pull request Apr 25, 2016
See: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
jasnell pushed a commit that referenced this pull request Apr 26, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
jasnell pushed a commit that referenced this pull request Apr 26, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
jasnell pushed a commit that referenced this pull request Apr 26, 2016
See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
indutny added a commit to indutny/io.js that referenced this pull request Jun 28, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
indutny added a commit to indutny/io.js that referenced this pull request Jun 28, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
indutny pushed a commit to indutny/io.js that referenced this pull request Jun 28, 2016
See: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Jul 5, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Jul 5, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Jul 5, 2016
See: nodejs#6198
PR-URL: nodejs#6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jul 11, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jul 11, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jul 11, 2016
See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
Adds `2` as a return value of `on_headers_complete`, this mode will be
used to fix handling responses to `CONNECT` requests.

See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
When handling a response to `CONNECT` request - skip message body
and do not attempt to parse the next message. `CONNECT` requests are
used in similar sense to HTTP Upgrade.

Fix: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
See: #6198
PR-URL: #6279
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants