Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make httpd logger closer to Common Log Format #6812

Merged
merged 1 commit into from
Jun 14, 2016
Merged

Conversation

sparrc
Copy link
Contributor

@sparrc sparrc commented Jun 8, 2016

Required for all non-trivial PRs
  • Tests pass
  • CHANGELOG.md updated

changes the httpd log lines from this:

[httpd] 2016/06/08 14:06:39 ::1 - - [08/Jun/2016:14:06:39 +0100] POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1 204 0 - InfluxDBClient d6aa01fc-2d79-11e6-8024-000000000000 2.751391ms

to this:

[httpd] ::1 - - [08/Jun/2016:14:06:39 +0100] "POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" d6aa01fc-2d79-11e6-8024-000000000000 2751

So it changes a few things:

  1. Remove the logger timestamp at the beginning which isn't very relevant anyways
  2. adds quotes around "METHOD URI PROTOCOL", because this is part of the
    common log format.
  3. adds quotes around "AGENT" and "REFERRER" because this is part of the
    "combined" log format.
  4. Puts the response time in integer microseconds, because this is
    consistent with apache's %D config mod option.

Compared with CLF, our logs now look like this:

[httpd] %{COMMON_LOG_FORMAT} "<agent>" "<referrer>" <request_uuid> <response_time_µs>

For reference, see:
https://en.wikipedia.org/wiki/Common_Log_Format
http://httpd.apache.org/docs/current/mod/mod_log_config.html

@sparrc
Copy link
Contributor Author

sparrc commented Jun 8, 2016

cc @nathanielc

@@ -913,8 +913,7 @@ func (h *Handler) logging(inner http.Handler, name string) http.Handler {
start := time.Now()
l := &responseLogger{w: w}
inner.ServeHTTP(l, r)
logLine := buildLogLine(l, r, start)
h.Logger.Println(logLine)
fmt.Println("[httpd] " + buildLogLine(l, r, start))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the switch to use STDOUT instead of STDERR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I should read the PR first ;)

We should probably just create a logger without the log.Ldate flag if we just need to remove the date from the log. This way it will end up with the rest of the logs and not in STDOUT or something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, fixed it so it doesn't change to STDOUT

@sparrc sparrc force-pushed the httpd-log-line-clf branch from ca9b0d4 to e087bdc Compare June 8, 2016 18:15
@sparrc sparrc changed the title Make httpd logger closer to Common Log Format WIP Make httpd logger closer to Common Log Format Jun 8, 2016
@sparrc
Copy link
Contributor Author

sparrc commented Jun 8, 2016

marking as a WIP because it's not clear to me why it was logging with the [httpd] prefix when the logger of the handler is getting set to [http]. Need to investigate.

@sparrc sparrc force-pushed the httpd-log-line-clf branch 2 times, most recently from 0d46b9c to b311131 Compare June 8, 2016 21:02
@sparrc sparrc changed the title WIP Make httpd logger closer to Common Log Format Make httpd logger closer to Common Log Format Jun 8, 2016
@jsternberg
Copy link
Contributor

@sparrc look inside of services/httpd/service.go. The SetLogOutput(io.Writer) method sets the logger of the handler to the same logger as the service so the initial logger created by the handler never gets used. This happens in a few places and the best way to go about things is just to make sure they agree with each other.

@toni-moreno
Copy link

Hi @sparrc

I like this idea but I can't take response time with my log analyzer because it expect as response time an integer with fixed unit ( Apache %D parameter prints response time in "ms" ). Instead you are logging a variable string with variable unit, "1.25ms" , "1.45s" . This response time format is too difficult to parse with any log analyzer or a simple awk script.

Could you change the response time to an integer number with fixed unit please?

@sparrc
Copy link
Contributor Author

sparrc commented Jun 9, 2016

Sure, I can make it always log in ms

@sparrc
Copy link
Contributor Author

sparrc commented Jun 9, 2016

@toni-moreno according to mod_log_config, %D is actually in microseconds: http://httpd.apache.org/docs/current/mod/mod_log_config.html

@sparrc
Copy link
Contributor Author

sparrc commented Jun 9, 2016

mod_log_config appears to provide 3 options:

%D              The time taken to serve the request, in microseconds.
%T              The time taken to serve the request, in seconds.
%{UNIT}T        The time taken to serve the request, in a time unit given by UNIT. 
Valid units are ms for milliseconds, us for microseconds, and s for seconds. 
Using s gives the same result as %T without any format; 
using us gives the same result as %D.
Combining %T with a unit is available in 2.4.13 and later.

I'm not sure what to go with, but I think we should pick a unit and stick with it. @toni-moreno are you OK with it having the unit appended? It doesn't seem overly hard to parse to me, makes it immediately clear to all users what the unit is, and the regex (if we went with ms) is as simple as this:

([0-9]+\.[0-9]+)ms

@sparrc sparrc force-pushed the httpd-log-line-clf branch from b311131 to 2bece37 Compare June 9, 2016 10:07
@toni-moreno
Copy link

Hi @sparrc .

This kind of regex if just what we are trying to avoid. A single integer (not floating point and not text ) is the same as apache does and would be perfect for anybody how needs parse time or look for something in a easy way ( awk scripts by example )

For awk this:

" .... 7543 ms ...."

is better than:

"...... 7543ms ....."

With %D in apache we will see that:

"....... 7543000 ....."

In microseconds as you know.

If you show time in microseconds (without unit) all apache analizers will be able to analize influx logs !!!

@sparrc sparrc force-pushed the httpd-log-line-clf branch from 2bece37 to ce698b0 Compare June 9, 2016 15:39
@sparrc
Copy link
Contributor Author

sparrc commented Jun 9, 2016

fair enough @toni-moreno.....I've changed it to look more like the "Combined log format" with ID and response_time (in microseconds) at the end.

%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" %L %D

(see example section at end of http://httpd.apache.org/docs/current/mod/mod_log_config.html)

ready for review again @nathanielc

@jwilder
Copy link
Contributor

jwilder commented Jun 9, 2016

@joelegasse @jsternberg Can you take look?

@nathanielc
Copy link
Contributor

LGTM and +1 to matching apache %D exactly using microseconds.

fmt.Sprintf("%s", time.Since(start)),
// response time, report in microseconds because this is consistent
// with apache's %D parameter in mod_log_config
fmt.Sprintf("%d", time.Since(start).Nanoseconds()/1000),
Copy link
Contributor

@nathanielc nathanielc Jun 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In implementing this for Kapacitor I noticed that we should probably change this to strcnv.FormatInt to be consistent and probably faster.

strconv.FormatInt(time.Since(start).Nanoseconds()/1000, 10)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@sparrc sparrc force-pushed the httpd-log-line-clf branch 2 times, most recently from 7d0d349 to 4f6e854 Compare June 10, 2016 17:05
@joelegasse
Copy link
Contributor

I'm not a big fan of all the string creation and concatenation that's being done. I know it was already there before, but since we're here updating the format, can we switch to a fmt.Sprintf call, rather than creating a list of strings and calling strings.Join?

fmt.Sprintf("%s", time.Since(start)),
// response time, report in microseconds because this is consistent
// with apache's %D parameter in mod_log_config
strconv.FormatInt(time.Since(start).Nanoseconds()/1000, 10),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 1000 should probably be replaced with time.Microsecond to make it clear, and prevent reviewers from having to count zeros :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally I think .Nanoseconds()/1000 is more readable than .Nanoseconds()/time.Duration(time.Microsecond))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you would only need time.Since(start)/time.Microsecond. No need for the double-conversion :-)

@sparrc
Copy link
Contributor Author

sparrc commented Jun 10, 2016

I'll make the change to just use fmt.Sprintf

@sparrc sparrc force-pushed the httpd-log-line-clf branch from 4f6e854 to 2c8e3c3 Compare June 10, 2016 17:32
@@ -99,7 +101,7 @@ func buildLogLine(l *responseLogger, r *http.Request, start time.Time) string {

userAgent := r.UserAgent()

fields := []string{
return fmt.Sprintf("%s %s %s [%s] \"%s %s %s\" %s %s \"%s\" \"%s\" %s %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second '%s' is a static "-", so that can be included directly. The start time is still being sent in as a result of fmt.Sprintf itself... making the actual output double-bracketed. The strconv calls are now unnecessary, and should be replaced with %d verbs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you have a lot of double-quotes, I would recommend a backtick string here

@sparrc sparrc force-pushed the httpd-log-line-clf branch from 2c8e3c3 to 672edc5 Compare June 13, 2016 11:41
@sparrc sparrc force-pushed the httpd-log-line-clf branch from 672edc5 to ba05c97 Compare June 13, 2016 11:44
@sparrc
Copy link
Contributor Author

sparrc commented Jun 13, 2016

alright I've made all the changes, please review again @joelegasse

cc @nathanielc

@joelegasse
Copy link
Contributor

LGTM

@e-dard
Copy link
Contributor

e-dard commented Jun 14, 2016

LGMT 👍

@sparrc sparrc force-pushed the httpd-log-line-clf branch 2 times, most recently from 6fc4cf1 to 38d54c8 Compare June 14, 2016 12:54
changes the httpd log lines from this:

    [httpd] 2016/06/08 14:06:39 ::1 - - [08/Jun/2016:14:06:39 +0100] POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1 204 0 - InfluxDBClient d6aa01fc-2d79-11e6-8024-000000000000 2.751391ms

to this:

    [httpd] ::1 - - [08/Jun/2016:14:06:39 +0100] "POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" d6aa01fc-2d79-11e6-8024-000000000000 2751

So it changes a few things:

1. Remove the logger timestamp at the beginning which isn't very relevant anyways
2. adds quotes around "METHOD URI PROTOCOL", because this is part of the
common log format.
3. adds quotes around "AGENT" and "REFERRER" because this is part of the
"combined" log format.
4. Puts the response time in integer microseconds, because this is
consistent with apache's %D config mod option.

Compared with CLF, our logs now look like this:

    [httpd] %{COMMON_LOG_FORMAT} "<agent>" "<referrer>" <request_uuid> <response_time_µs>

For reference, see:
https://en.wikipedia.org/wiki/Common_Log_Format
http://httpd.apache.org/docs/current/mod/mod_log_config.html
@sparrc sparrc force-pushed the httpd-log-line-clf branch from 38d54c8 to b9a3f2e Compare June 14, 2016 13:48
@sparrc sparrc merged commit b9a3f2e into master Jun 14, 2016
@mark-rushakoff mark-rushakoff deleted the httpd-log-line-clf branch July 6, 2016 17:08
@timhallinflux timhallinflux added this to the 1.0.0 milestone Dec 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants