Todd Palino edited this page Dec 1, 2017 · 7 revisions

Overview

Burrow is a monitoring tool for keeping track of consumer lag in Apache Kafka. It is designed to monitor every consumer group that is committing offsets to either Kafka or Zookeeper, and to monitor every topic and partition consumed by those groups. This provides a comprehensive view of consumer status.

Burrow also provides several HTTP request endpoints for getting information about the Kafka cluster and consumers, separate from the lag status. This can be very useful for creating applications that assist with managing your Kafka clusters when it is not convenient (or possible) to run a Java Kafka client.
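
As an illustration of using the HTTP endpoints from a non-Java application, here is a minimal Go sketch that requests the status of one consumer group. The base address `http://localhost:8000`, the cluster name `local`, the group name `example-group`, and the `/v3/kafka/<cluster>/consumer/<group>/status` path layout are all assumptions made for this example; check the HTTP request documentation for the paths your Burrow version actually serves. The helper names `consumerStatusURL` and `fetchJSON` are hypothetical and not part of Burrow.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// consumerStatusURL builds the URL for a consumer group status request.
// ASSUMPTION: the path layout below is illustrative; confirm it against
// the HTTP request documentation for your Burrow version.
func consumerStatusURL(base, cluster, group string) string {
	return fmt.Sprintf("%s/v3/kafka/%s/consumer/%s/status",
		base, url.PathEscape(cluster), url.PathEscape(group))
}

// fetchJSON performs a GET request and decodes the JSON body into a
// generic map, so no particular response schema is assumed.
func fetchJSON(client *http.Client, target string) (map[string]interface{}, error) {
	resp, err := client.Get(target)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected HTTP status: %s", resp.Status)
	}

	var body map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return body, nil
}

func main() {
	// ASSUMPTION: address, cluster name, and group name are placeholders.
	client := &http.Client{Timeout: 5 * time.Second}
	target := consumerStatusURL("http://localhost:8000", "local", "example-group")

	body, err := fetchJSON(client, target)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(body)
}
```

Decoding into a generic map keeps the sketch independent of the exact response fields; a real application would likely define a struct once the response format is known.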

Why Not MaxLag?

The standard Kafka consumer does have a built-in metric to track MaxLag. While this can be convenient, it has several flaws:

  • MaxLag must be monitored on every consumer: The MaxLag metric must be collected from every consumer. These metrics have to be collated and interpreted separately.
  • MaxLag is only valid when the consumer is live: The metric is reported by the consumer itself. If the consumer is not running, no metric is available.
  • MaxLag is not objective: Because the consumer itself reports the metric, MaxLag cannot be an objective measure of consumer lag. The consumer measures it after fetching messages, so if there is any problem with consuming, an incorrect value can be reported.
  • MaxLag is only provided by the Java client: The only official Kafka client is the Java client, and this is the only client that has the metric available. It can certainly be worked into other clients, but then you have to worry about subtle differences in measurement and collection of the metric.

How Does It Work?

Burrow has a modular design that divides the work among multiple subsystems:

  • Clusters run a Kafka client that periodically updates topic lists and the current HEAD offset (the most recent offset) for every partition.
  • Consumers fetch information about consumer groups from a repository. This could be a Kafka cluster (consuming the __consumer_offsets topic), or Zookeeper, or some other repository.
  • The Storage subsystem stores all of this information in Burrow.
  • The Evaluator subsystem retrieves information from the Storage subsystem for a specific consumer group and calculates the status of that group. This follows the consumer lag evaluation rules.
  • The Notifier subsystem requests status on consumer groups on a configured interval and sends out notifications (Email, HTTP, or some other method) for groups that meet configured criteria.
  • The HTTP Server subsystem provides an API for fetching information about clusters and consumers. See the list of available HTTP requests.
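
To make the relationship between the Clusters and Consumers data concrete, here is a rough Go sketch of the basic lag arithmetic: for each partition, lag is the HEAD offset minus the offset the group has committed. This is not Burrow's code and it does not implement the consumer lag evaluation rules; the type and function names (`partitionKey`, `partitionLag`, `groupLag`) are hypothetical, and flooring negative values at zero is an assumption made here to cope with a commit that was observed after the HEAD offset was last refreshed.

```go
package main

import "fmt"

// partitionKey identifies one partition of one topic.
type partitionKey struct {
	Topic     string
	Partition int32
}

// partitionLag returns the HEAD offset minus the committed offset.
// ASSUMPTION: a committed offset at or past HEAD is treated as zero lag,
// since the two values may have been sampled at slightly different times.
func partitionLag(head, committed int64) int64 {
	if committed >= head {
		return 0
	}
	return head - committed
}

// groupLag computes per-partition lag and the total for one consumer group.
// Partitions with a commit but no known HEAD offset are skipped.
func groupLag(heads, committed map[partitionKey]int64) (int64, map[partitionKey]int64) {
	perPartition := make(map[partitionKey]int64)
	var total int64
	for key, offset := range committed {
		head, ok := heads[key]
		if !ok {
			continue // no HEAD offset recorded for this partition yet
		}
		lag := partitionLag(head, offset)
		perPartition[key] = lag
		total += lag
	}
	return total, perPartition
}

func main() {
	// Made-up sample offsets for a two-partition topic.
	heads := map[partitionKey]int64{
		{Topic: "events", Partition: 0}: 100,
		{Topic: "events", Partition: 1}: 50,
	}
	committed := map[partitionKey]int64{
		{Topic: "events", Partition: 0}: 90,
		{Topic: "events", Partition: 1}: 50,
	}

	total, perPartition := groupLag(heads, committed)
	fmt.Println("total lag:", total)
	for key, lag := range perPartition {
		fmt.Printf("%s[%d] lag=%d\n", key.Topic, key.Partition, lag)
	}
}
```

Because both inputs come from outside the consumer (the broker's HEAD offsets and the committed offsets), this kind of calculation still works when the consumer itself is not running.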

Configuration and Starting Burrow

Burrow uses the viper configuration framework for Golang applications. More details on the required configs are provided on the Configuration page.

Once you have a configuration file, start Burrow with the --config-dir option set to the directory that contains the configuration file. The option can be omitted if the file is in the current working directory.

$ ./Burrow --config-dir=/path/to/configurations

Other Docs