Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose metrics using go-metrics (#683) #688

Closed
wants to merge 2 commits into from

Conversation

slaunay
Copy link
Contributor

@slaunay slaunay commented Jun 29, 2016

  • add MetricRegistry configuration parameter that defaults to metrics.DefaultRegistry
  • provide the following metrics:
    • incoming-byte-rate meter (global and per registered broker)
    • request-rate meter (global and per registered broker)
    • request-size histogram (global and per registered broker)
    • outgoing-byte-rate meter (global and per registered broker)
    • response-rate meter (global and per registered broker)
    • response-size histogram (global and per registered broker)
    • batch-size histogram (global and per topic)
    • record-send-rate meter (global and per topic)
    • records-per-request histogram (global and per topic)
    • compression-rate histogram (global and per topic)
  • add metrics flag to kafka-console-producer to output metrics
  • validate metrics in functional_producer_test

This is a work in progress, next steps:

  • rebase with master to fix merge issue
  • unit tests
  • documentation
  • example of exporting metrics to Graphite using a prefix

- add MetricRegistry configuration parameter that defaults to
  metrics.DefaultRegistry
- provide the following metrics:
  - incoming-byte-rate meter (global and per registered broker)
  - request-rate meter (global and per registered broker)
  - request-size histogram (global and per registered broker)
  - outgoing-byte-rate meter (global and per registered broker)
  - response-rate meter (global and per registered broker)
  - response-size histogram (global and per registered broker)
  - batch-size histogram (global and per topic)
  - record-send-rate meter (global and per topic)
  - records-per-request histogram (global and per topic)
  - compression-rate histogram (global and per topic)
- add metrics flag to kafka-console-producer to output metrics
- validate metrics in functional_producer_test
@slaunay
Copy link
Contributor Author

slaunay commented Jun 29, 2016

Example of metrics output when using kafka-console-producer:

./kafka-console-producer -brokers 192.168.100.67:9091 -topic test.1 -value message -metrics
topic=test.1    partition=0 offset=10003
histogram batch-size
  count:               1
  min:                41
  max:                41
  mean:               41.00
  stddev:              0.00
  median:             41.00
  75%:                41.00
  95%:                41.00
  99%:                41.00
  99.9%:              41.00
histogram batch-size-for-topic-test.1
  count:               1
  min:                41
  max:                41
  mean:               41.00
  stddev:              0.00
  median:             41.00
  75%:                41.00
  95%:                41.00
  99%:                41.00
  99.9%:              41.00
histogram compression-rate
  count:               1
  min:               100
  max:               100
  mean:              100.00
  stddev:              0.00
  median:            100.00
  75%:               100.00
  95%:               100.00
  99%:               100.00
  99.9%:             100.00
histogram compression-rate-for-topic-test.1
  count:               1
  min:               100
  max:               100
  mean:              100.00
  stddev:              0.00
  median:            100.00
  75%:               100.00
  95%:               100.00
  99%:               100.00
  99.9%:             100.00
meter incoming-byte-rate
  count:            5241
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:      356437.67
meter incoming-byte-rate-for-broker-9093
  count:              38
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:        6004.04
meter outgoing-byte-rate
  count:             107
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:       12559.68
meter outgoing-byte-rate-for-broker-9093
  count:              83
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:      510750.37
meter record-send-rate
  count:               1
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:      857632.93
meter record-send-rate-for-topic-test.1
  count:               1
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:      519480.52
histogram records-per-request
  count:               1
  min:                 1
  max:                 1
  mean:                1.00
  stddev:              0.00
  median:              1.00
  75%:                 1.00
  95%:                 1.00
  99%:                 1.00
  99.9%:               1.00
histogram records-per-request-for-topic-test.1
  count:               1
  min:                 1
  max:                 1
  mean:                1.00
  stddev:              0.00
  median:              1.00
  75%:                 1.00
  95%:                 1.00
  99%:                 1.00
  99.9%:               1.00
meter request-rate
  count:               2
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:         234.18
meter request-rate-for-broker-9093
  count:               1
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:        5795.62
histogram request-size
  count:               2
  min:                24
  max:                83
  mean:               53.50
  stddev:             29.50
  median:             53.50
  75%:                83.00
  95%:                83.00
  99%:                83.00
  99.9%:              83.00
histogram request-size-for-broker-9093
  count:               1
  min:                83
  max:                83
  mean:               83.00
  stddev:              0.00
  median:             83.00
  75%:                83.00
  95%:                83.00
  99%:                83.00
  99.9%:              83.00
meter response-rate
  count:               2
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:         136.45
meter response-rate-for-broker-9093
  count:               1
  1-min rate:          0.00
  5-min rate:          0.00
  15-min rate:         0.00
  mean rate:         158.71
histogram response-size
  count:               2
  min:                38
  max:              5203
  mean:             2620.50
  stddev:           2582.50
  median:           2620.50
  75%:              5203.00
  95%:              5203.00
  99%:              5203.00
  99.9%:            5203.00
histogram response-size-for-broker-9093
  count:               1
  min:                38
  max:                38
  mean:               38.00
  stddev:              0.00
  median:             38.00
  75%:                38.00
  95%:                38.00
  99%:                38.00
  99.9%:              38.00

@@ -84,6 +99,23 @@ func (b *Broker) Open(conf *Config) error {

b.conf = conf

// Create or reuse the metrics shared between brokers
b.incomingByteRate = metrics.GetOrRegisterMeter("incoming-byte-rate", conf.MetricRegistry)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the metrics.GetOrRegister* methods concurrency-safe? Because many brokers can call these at once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they are through a sync.Mutex.
I was thinking about caching the metrics resolved to avoid hitting that mutex but after running the existing benchmarks for the master branch and this branch I did not notice any slowness or contention.

- provide underlying MessageSet when encoding fake compressed "message"
- use len(msg.Set.Messages) for counting records
- expose the configured metric registry in ProduceRequest
- expose current offset in packetEncoder for batch size metric
- expose real encoder flag in packetEncoder for recording metrics only once
- record metrics in produce_request.go
@slaunay
Copy link
Contributor Author

slaunay commented Jul 4, 2016

I moved most of the metric gathering code to produce_request.go but I had to expose a couple of things to make it work:

  • len(set.Messages[0].Msg.Set.Messages) was not working as the Set field was not set for fake compressed "message"
  • the current offset of the packetEncoder is necessary to compute the batch size
  • a flag to know if this is the second pass in order to collect metrics once

If this looks better, I'll rebase against master, provide unit tests, documentation and some examples.

@@ -464,13 +498,15 @@ func (b *Broker) responseReceiver() {
}

buf := make([]byte, decodedHeader.length-4)
_, err = io.ReadFull(b.conn, buf)
remainingPayloadSize, err := io.ReadFull(b.conn, buf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to capture this value (or headerSize above) since ReadFull will return an error unless it completely fills the buffer, and we'll never even get to the metric call in that case. You should be able to use len(header) + len(buf) instead.

@eapache
Copy link
Contributor

eapache commented Jul 9, 2016

This is definitely better, thanks for all your work! It's getting large enough that when you rebase etc. I would ask you split it in two please: one PR for the config and docs and broker metrics, and then a follow-up PR adding producer metrics. It will be easier to review in chunks like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants