Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #12396 - fix and re-enable NcsaRequestLogTest #12420

Merged
merged 1 commit into from
Oct 24, 2024

Conversation

lachlan-roberts
Copy link
Contributor

issue #12396

Signed-off-by: Lachlan Roberts <lachlan.p.roberts@gmail.com>
String log = _entries.poll(5, TimeUnit.SECONDS);
assertThat(log, containsString("GET http://hostname:8888/foo?name=value"));
assertThat(log, containsString("GET /foo?name=value"));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have access in Jetty 12 to get the URI as it was received, so the request log reconstructs it using HttpURI.getPathQuery().

I think to fix this we would need some new API on Request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "URI as it was received" is a strange one.
We build the URI from the components on the HTTP protocol (host, path, and query), and information from the connector (like isSecure).
What we see down here is the URI after it has possibly been modified (like the RewriteHandler), right?
If we had the raw host/path as seen on the protocol, would that be sufficient? or would that possibly cause NCSA format/syntax issues? (like what if the path was received on HTTP/2 with a space or control character that we stripped out / changed due to a Rewrite Rule?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NCSA combined format can be represented like this ...

%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"

    %h: the client hostname or IP address
    %l: the client identifier or - if none is set
    %u: the username used by the client for authentication or - if none is set
    %t: the date and time of the HTTP request
    %r: the first line of the HTTP request
    %s: the status code of the HTTP response
    %b: the byte size of the HTTP response
    %{Referer}i: the URL that linked to the requested page or - if none is set
    %{User-agent}i: the web browser and platform used by the client

I think you are attempting to handle the %r with this, right?
If so, this would really only work for HTTP/1.x, not HTTP/2 or HTTP/3 (as those two don't have the concept of "the first line of the HTTP request")

This "first line of the HTTP request" could also help with your BAD /badRequest HTTP/1.0 issue below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that if we implement this, someone nefarious sort could come along and just hammer a server with huge "first line" and blow out the request log and/or server memory (as we would now store that raw "first line" in a way accessible from the Request object).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its probably fine to keep the behavior for this as is. Even though it might not be the exact first line of the request for HTTP/1.x.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this PR being what it is.
I opened #12423 to cover the other things brought up though.

@@ -270,7 +295,7 @@ public void testBadVersion(String logType) throws Exception

_connector.getResponse("METHOD /foo HTTP/9\n\n");
String log = _entries.poll(5, TimeUnit.SECONDS);
assertThat(log, containsString("\"- - -\""));
assertThat(log, containsString("\"BAD /badMessage HTTP/1.0\""));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no URI the HttpConnection creates a fake one to call the ErrorHandler. This was causing this to be GET /badMessage HTTP/1.0.

I have changed HttpConnection to set the Method to BAD so we can differentiate this from a valid request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, not sure about this change.
Can it be empty? or something that couldn't be sent on the protocol? (like a lowercase string with a space, eg: <bad request>).
Putting on my user/operations hat, seeing the hardcoded path /badMessage (or HTTP/1.0) would be a bit odd to see in the output of RequestLog too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is strange to have this in the RequestLog.

There is a test which does _connector.getResponse("XXXXXXXX") and it still shows up in RequestLog as BAD /badMessage HTTP/1.0.

but @gregw said it was necessary for the ErrorHandler.

maybe we could just add some flag to the request so requestLog knows to just display it as "- - -"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on how NCSA Request Log is loosely defined, I would expect connector.getResponse("XXXXXXX") to result in XXXXXXX showing up in the request log (in place of where BAD /badMessage HTTP/1.0 shows up)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this PR being what it is.
I opened #12423 to cover the other things brought up though.

Copy link
Contributor

@joakime joakime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't merge until @gregw has had a look at this too.

@olamy olamy added the Test label Oct 23, 2024
@lachlan-roberts lachlan-roberts merged commit 2d9ef61 into jetty-12.1.x Oct 24, 2024
11 checks passed
@joakime joakime deleted the jetty-12.1.x-12396-NcsaRequestLogTest branch October 24, 2024 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants