Graphite Input Protocol Parsing #3125

Merged
merged 29 commits into from
Jun 25, 2015

Conversation

jwilder
Contributor

@jwilder jwilder commented Jun 24, 2015

Introduction

The current graphite input assumes a format for incoming metrics in order to be able to pull out relevant pieces to use as tags.

Unfortunately, many systems do not use this format and are unable to change how metrics are sent, which makes it difficult to switch to this plugin. The graphite plugin needs more flexibility in how it receives and parses graphite metrics.

There has been some discussion and proposed changes to the graphite plugin. See these issues for more discussion and background:

This PR builds on the ideas found in those two issues. It aims to make the plugin usable for all graphite inputs while still allowing tags to be extracted without requiring a custom format.

Design

The graphite plugin allows measurements to be saved using the graphite line protocol. By default, enabling the graphite plugin will allow you to collect metrics and store them using the metric name as the measurement. If you send a metric named servers.localhost.cpu.loadavg.10, it will store the full metric name as the measurement with no extracted tags.

While this default setup works, it is not the ideal way to store measurements in InfluxDB since it does not take advantage of tags. It also will not perform optimally with large datasets, since queries will be forced to use regexes, which are known not to scale well.

To extract tags from metrics, one or more templates must be configured to parse metrics into tags and measurements.

Templates

Templates allow matching parts of a metric name to use as tag names in the stored metric. They have a similar format to graphite metric names. The values in between the separators are used as the tag names. The section of the graphite metric at the same position as a tag name is used as that tag's value. If a template section is empty, the corresponding section of the graphite metric is skipped.

The special value measurement is used to define the measurement name. It can have a trailing * to indicate that the remainder of the metric should be used. If a measurement is not specified, the full metric name is used.

Basic Matching

servers.localhost.cpu.loadavg.10

  • Template: .host.resource.measurement*
  • Output: measurement = loadavg.10 tags = host=localhost resource=cpu

Multiple Measurement Matching

The measurement can be specified multiple times in a template to provide more control over the measurement name. Multiple values will be joined together using the Separator config variable. By default, this value is ".".

servers.localhost.cpu.cpu0.user

  • Template: .host.measurement.cpu.measurement
  • Output: measurement = cpu.user tags = host=localhost cpu=cpu0

Since '.' requires queries on measurements to be double-quoted, you may want to set this to _ to simplify querying parsed metrics.

servers.localhost.cpu.cpu0.user

  • Separator: _
  • Template: .host.measurement.cpu.measurement
  • Output: measurement = cpu_user tags = host=localhost cpu=cpu0
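
The matching rules in this section can be sketched in Go roughly as follows. This is an illustration only, not the code from this PR: applyTemplate is a hypothetical helper, and joining the sections consumed by a trailing measurement* with the separator is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// applyTemplate is a hypothetical helper (not the plugin's actual code).
// It walks the template and the metric name section by section and returns
// the measurement name plus any extracted tags.
func applyTemplate(template, metric, separator string) (string, map[string]string) {
	tmplParts := strings.Split(template, ".")
	metricParts := strings.Split(metric, ".")
	tags := make(map[string]string)
	var measurement []string

	for i, name := range tmplParts {
		if i >= len(metricParts) {
			break // the metric has fewer sections than the template
		}
		if name == "measurement*" {
			// Trailing wildcard: use the remainder of the metric name.
			// Assumption: the remaining sections are joined with the separator.
			measurement = append(measurement, metricParts[i:]...)
			break
		}
		switch name {
		case "":
			// Empty template section: skip this section of the metric.
		case "measurement":
			measurement = append(measurement, metricParts[i])
		default:
			tags[name] = metricParts[i]
		}
	}

	if len(measurement) == 0 {
		// No measurement in the template: fall back to the full metric name.
		return metric, tags
	}
	return strings.Join(measurement, separator), tags
}

func main() {
	m, tags := applyTemplate(".host.measurement.cpu.measurement",
		"servers.localhost.cpu.cpu0.user", "_")
	fmt.Println(m, tags)
}
```

With the separator set to "_", the second example above yields the measurement cpu_user with the tags host=localhost and cpu=cpu0.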

Adding Tags

Tags that don't exist on the received metric can be added to it by specifying them after the pattern. Tags have the same format as the line protocol. Multiple tags are separated by commas.

servers.localhost.cpu.loadavg.10

  • Template: .host.resource.measurement* region=us-west,zone=1a
  • Output: measurement = loadavg.10 tags = host=localhost resource=cpu region=us-west zone=1a

Multiple Templates

One template may not match all metrics. For example, using multiple plugins with diamond will produce metrics in different formats. If you need to use multiple templates, you'll need to define a prefix filter that must match before the template can be applied.

Filters

Filters have a similar format to templates but work more like wildcard expressions. When multiple filters would match a metric, the more specific one is chosen. Filters are configured by adding them before the template.

For example,

servers.localhost.cpu.loadavg.10
servers.host123.elasticsearch.cache_hits 100
servers.host456.mysql.tx_count 10
  • servers.* would match all values
  • servers.*.mysql would match servers.host456.mysql.tx_count 10
  • servers.localhost.* would match servers.localhost.cpu.loadavg
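
One way to picture filter selection is the naive Go sketch below. It is not the implementation in this PR. It assumes a filter matches the leading sections of a metric name, that * matches any single section, and that "more specific" means more literal (non-wildcard) sections, with ties broken by filter length. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// filterMatches reports whether filter matches the leading sections of metric.
// Assumption: "*" matches exactly one section.
func filterMatches(filter, metric string) bool {
	f := strings.Split(filter, ".")
	m := strings.Split(metric, ".")
	if len(f) > len(m) {
		return false
	}
	for i, part := range f {
		if part != "*" && part != m[i] {
			return false
		}
	}
	return true
}

// literalSections counts the non-wildcard sections of a filter.
func literalSections(filter string) int {
	n := 0
	for _, part := range strings.Split(filter, ".") {
		if part != "*" {
			n++
		}
	}
	return n
}

// moreSpecific is an assumed ordering: more literal sections wins,
// then the filter with more sections overall.
func moreSpecific(a, b string) bool {
	la, lb := literalSections(a), literalSections(b)
	if la != lb {
		return la > lb
	}
	return strings.Count(a, ".") > strings.Count(b, ".")
}

// bestFilter returns the most specific filter that matches metric, if any.
func bestFilter(filters []string, metric string) (string, bool) {
	best, found := "", false
	for _, f := range filters {
		if !filterMatches(f, metric) {
			continue
		}
		if !found || moreSpecific(f, best) {
			best, found = f, true
		}
	}
	return best, found
}

func main() {
	filters := []string{"servers.*", "servers.*.mysql", "servers.localhost.*"}
	for _, metric := range []string{
		"servers.localhost.cpu.loadavg.10",
		"servers.host123.elasticsearch.cache_hits",
		"servers.host456.mysql.tx_count",
	} {
		f, ok := bestFilter(filters, metric)
		fmt.Println(metric, "->", f, ok)
	}
}
```

The sketch takes the metric name only; the value and timestamp are assumed to have been split off beforehand.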

Default Templates

If no template filters are defined or you want to just have one basic template, you can define a default template. This template will apply to any metric that has not already matched a filter.

dev.http.requests.200
prod.myapp.errors.count
dev.db.queries.count
  • env.app.measurement* would create
    • measurement=requests.200 tags=env=dev,app=http
    • measurement=errors.count tags=env=prod,app=myapp
    • measurement=queries.count tags=env=dev,app=db

Global Tags

If you need to add the same set of tags to all metrics, you can define them globally at the plugin level and not within each template description.
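
The precedence between the different tag sources can be illustrated with a small Go sketch. The ordering (global tags are overridden by template-level tags, which are in turn overridden by tags parsed from the metric) follows the config comments and commit messages in this PR. The mergeTags helper itself is hypothetical and not taken from the plugin.

```go
package main

import "fmt"

// mergeTags is a hypothetical helper showing the override order:
// global tags < template-level tags < tags parsed from the metric.
func mergeTags(global, template, parsed map[string]string) map[string]string {
	out := make(map[string]string)
	// Later layers overwrite earlier ones when tag names collide.
	for _, layer := range []map[string]string{global, template, parsed} {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	global := map[string]string{"region": "us-east", "zone": "1c"}
	template := map[string]string{"region": "us-west"}
	parsed := map[string]string{"host": "localhost"}
	fmt.Println(mergeTags(global, template, parsed))
}
```

Here the template-level region=us-west replaces the global region=us-east, while zone=1c is carried over unchanged.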

Minimal Config

[[graphite]]
  enabled = true
  # bind-address = ":2003"
  # protocol = "tcp"
  # consistency-level = "one"

  ### If matching multiple measurement fields, this string will be used to join the matched values.
  # separator = "."

  ### Default tags that will be added to all metrics.  These can be overridden at the template level
  ### or by tags extracted from metric
  # tags = ["region=us-east", "zone=1c"]

  ### Each template line requires a template pattern.  It can have an optional
  ### filter before the template and separated by spaces.  It can also have optional extra
  ### tags following the template.  Multiple tags should be separated by commas and no spaces
  ### similar to the line protocol format.  There can be only one default template.
  # templates = [
  #   "*.app env.service.resource.measurement",
  #   # Default template
  #   "server.*",
  # ]

Customized Config

[[graphite]]
  enabled = true
  separator = "_"
  tags = ["region=us-east", "zone=1c"]
  templates = [
    # filter + template
    "*.app env.service.resource.measurement",

    # filter + template + extra tag
    "stats.* .host.measurement* region=us-west,agent=sensu",

    # default template. Ignore the first graphite component "servers"
    ".measurement*",
  ]
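
As a rough illustration of how one entry in templates breaks down into an optional filter, a template, and optional extra tags, here is a Go sketch. It is not the plugin's parser: templateSpec and parseTemplateLine are hypothetical names, and treating a trailing field that contains "=" as the tag list is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// templateSpec is a hypothetical container for one parsed template line.
type templateSpec struct {
	Filter   string
	Template string
	Tags     map[string]string
}

// parseTemplateLine splits a line of the form "[filter] template [tags]".
// Assumption: a trailing field containing "=" is the comma-separated tag list.
func parseTemplateLine(line string) (templateSpec, error) {
	spec := templateSpec{Tags: make(map[string]string)}
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return spec, fmt.Errorf("empty template line")
	}

	// Peel off the optional tag list.
	if last := fields[len(fields)-1]; strings.Contains(last, "=") {
		for _, kv := range strings.Split(last, ",") {
			pair := strings.SplitN(kv, "=", 2)
			if len(pair) != 2 || pair[0] == "" {
				return spec, fmt.Errorf("invalid tag %q", kv)
			}
			spec.Tags[pair[0]] = pair[1]
		}
		fields = fields[:len(fields)-1]
	}

	// What remains is either "template" or "filter template".
	switch len(fields) {
	case 1:
		spec.Template = fields[0]
	case 2:
		spec.Filter, spec.Template = fields[0], fields[1]
	default:
		return spec, fmt.Errorf("invalid template line %q", line)
	}
	return spec, nil
}

func main() {
	spec, err := parseTemplateLine("stats.* .host.measurement* region=us-west,agent=sensu")
	fmt.Println(spec.Filter, spec.Template, spec.Tags, err)
}
```

A line with only a template, such as ".measurement*", ends up with an empty filter, which corresponds to the default template described above.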

@beckettsean
Contributor

@jwilder I directly edited to fix some typos, all makes sense to me as a naive graphite user.

hasMeasurement := false
for _, p := range strings.Split(template, ".") {
	if p == "measurement" || p == "measurement*" {
		hasMeasurement = true

Since it looks like you're looping here, can hasMeasurement go true, then false? If not, then why not just return the moment you find a measurement? Am I missing something?

Ah, no, because you never set it back to false again. OK. But could you just return once you find the first one?
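
For reference, the early-return variant suggested here might look like the sketch below. Wrapping the check in a standalone hasMeasurement function is an assumption made for illustration; the surrounding code from the PR is not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// hasMeasurement is a hypothetical standalone version of the check under
// review. It returns as soon as the first measurement section is found.
func hasMeasurement(template string) bool {
	for _, p := range strings.Split(template, ".") {
		if p == "measurement" || p == "measurement*" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasMeasurement(".host.resource.measurement*"))
	fmt.Println(hasMeasurement("env.app"))
}
```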

@otoolep
Contributor

otoolep commented Jun 25, 2015

Looks like a thorough job. I paid particular attention to the test cases, as a way of validating the code. Looks good. I did have some nitpicks about spelling, but since much of this is user-facing documentation, I know you'll want it to be right.

Had a couple of minor logic questions, but I'm cool with this going in once those questions are answered. +1

@otoolep
Contributor

otoolep commented Jun 25, 2015

Also, I wonder if there is any point in an integration test in server_test.go? Even a simple one to ensure the input can come up?

jwilder added 20 commits June 24, 2015 23:09
This filter implementation is fairly naive and won't scale well
to large numbers of templates and filters.  It will be replaced
with a trie-based approach in the future.

These are tags that can be added to all metrics.

These are tags that can be added at the template level.  They
will override any global tags, and any parsed tags with the same
name from the metric will override these.

This adds a sorted search tree for matching filters to a template
more efficiently.  Each filter is split on "." and each element is
added to the tree.  Patterns with matching prefixes are added under
the same subtree.

Provides a little more flexibility in controlling the parsed
metric names for metrics like:

  servers.localhost.cpu.cpu0.user

Previously, you could only use a single field like "cpu", "user"
or a wildcard to match "cpu.cpu0.user".  You can now pull out "cpu"
and "user" and join them together in the metric name using a custom
separator character.  By default this is ".".

A filter should map directly to one template; allowing duplicate
filters is not supported.

If no timestamp is sent or the value -1 is sent, the current UTC
time is used.
jwilder added a commit that referenced this pull request Jun 25, 2015
Graphite Input Protocol Parsing
@jwilder jwilder merged commit 357b658 into master Jun 25, 2015
@jwilder jwilder deleted the jw-graphite branch June 25, 2015 06:13
* `env.app.measurement*` would create
* _measurement_=`requests.200` _tags_=`env=dev,app=http`

Are tags comma- or space-separated? (see "Adding Tags" above)

Contributor Author

When used in the template, they should be comma-separated. The tags should be separated from the template with a space, though.
