Graphite Input Protocol Parsing #3125

jwilder · 2015-06-24T19:38:56Z

Introduction

The current graphite input assumes a format for incoming metrics in order to be able to pull out relevant pieces to use as tags.

Unfortunately, many systems do not use this format and are unable to change how metrics are sent, which makes it difficult to switch to this plugin. The graphite plugin needs more flexibility in how it receives and parses graphite metrics.

There has been some discussion and proposed changes to the graphite plugin. See these issues for more discussion and background:

This PR builds on the ideas found in those two issues and aims to make the plugin usable for all graphite inputs while still allowing tags to be extracted without requiring a custom format.

Design

The graphite plugin allows measurements to be saved using the graphite line protocol. By default, enabling the graphite plugin will allow you to collect metrics and store them using the metric name as the measurement. If you send a metric named servers.localhost.cpu.loadavg.10, it will store the full metric name as the measurement with no extracted tags.

While this default setup works, it is not the ideal way to store measurements in InfluxDB since it does not take advantage of tags. It also will not perform optimally with large datasets, since queries will be forced to use regexes, which are known not to scale well.

To extract tags from metrics, one or more templates must be configured to parse metrics into tags and measurements.

Templates

Templates allow matching parts of a metric name to use as tag names in the stored metric. They have a similar format to graphite metric names. The values in between the separators are used as the tag names. The section of the graphite metric at the same position as a tag name is used as that tag's value. If no tag name is given at a position, that section of the graphite metric is skipped.

The special value measurement is used to define the measurement name. It can have a trailing * to indicate that the remainder of the metric should be used. If a measurement is not specified, the full metric name is used.

Basic Matching

servers.localhost.cpu.loadavg.10

Template: .host.resource.measurement*
Output: measurement = loadavg.10 tags = host=localhost resource=cpu

Multiple Measurement Matching

The measurement can be specified multiple times in a template to provide more control over the measurement name. Multiple values will be joined together using the Separator config variable. By default, this value is ".".

servers.localhost.cpu.cpu0.user

Template: .host.measurement.cpu.measurement
Output: measurement = cpu.user tags = host=localhost cpu=cpu0

Since a "." in a measurement name requires the measurement to be double-quoted in queries, you may want to set the separator to _ to simplify querying parsed metrics.

servers.localhost.cpu.cpu0.user

Separator: _
Template: .host.measurement.cpu.measurement
Output: measurement = cpu_user tags = host=localhost cpu=cpu0
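The matching rules described so far can be sketched in Go as follows. This is a minimal illustration, not the plugin's actual parser: the function name applyTemplate and its signature are hypothetical, and joining the remainder captured by measurement* with the separator is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// applyTemplate is a hypothetical helper (not the plugin's API). It walks the
// template and the metric name section by section and returns the measurement
// name and the extracted tags.
func applyTemplate(template, metric, separator string) (string, map[string]string) {
	tmplParts := strings.Split(template, ".")
	metricParts := strings.Split(metric, ".")

	var measurement []string
	tags := make(map[string]string)

	for i, name := range tmplParts {
		if i >= len(metricParts) {
			break // the template is longer than the metric name
		}
		switch name {
		case "":
			// No tag name at this position: skip this metric section.
		case "measurement":
			measurement = append(measurement, metricParts[i])
		case "measurement*":
			// Trailing wildcard: use the remainder of the metric name.
			// Assumption: the remainder is joined with the separator too.
			measurement = append(measurement, metricParts[i:]...)
			return strings.Join(measurement, separator), tags
		default:
			tags[name] = metricParts[i]
		}
	}

	// If no measurement was specified, fall back to the full metric name.
	if len(measurement) == 0 {
		return metric, tags
	}
	return strings.Join(measurement, separator), tags
}

func main() {
	m, tags := applyTemplate(".host.resource.measurement*", "servers.localhost.cpu.loadavg.10", ".")
	fmt.Println(m, tags)

	m, tags = applyTemplate(".host.measurement.cpu.measurement", "servers.localhost.cpu.cpu0.user", "_")
	fmt.Println(m, tags)
}
```

The two calls in main correspond to the Basic Matching example and the Separator: _ example above.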

Adding Tags

Tags that don't exist on the received metric can be added to it. You can add these additional tags by specifying them after the pattern. Tags have the same format as the line protocol. Multiple tags are separated by commas.

servers.localhost.cpu.loadavg.10

Template: .host.resource.measurement* region=us-west,zone=1a
Output: measurement = loadavg.10 tags = host=localhost resource=cpu region=us-west zone=1a
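As a rough sketch of how the extra tag list might be read, the Go snippet below splits a string such as region=us-west,zone=1a into key/value pairs. The helper name parseTags is hypothetical, and the strictness of the validation (rejecting empty keys or values) is an assumption rather than the plugin's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags is a hypothetical helper that turns "region=us-west,zone=1a"
// into a map. Entries that are not of the form key=value are rejected.
func parseTags(s string) (map[string]string, error) {
	tags := make(map[string]string)
	if s == "" {
		return tags, nil
	}
	for _, kv := range strings.Split(s, ",") {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("invalid tag %q: expected key=value", kv)
		}
		tags[parts[0]] = parts[1]
	}
	return tags, nil
}

func main() {
	tags, err := parseTags("region=us-west,zone=1a")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tags["region"], tags["zone"])
}
```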

Multiple Templates

One template may not match all metrics. For example, using multiple plugins with diamond will produce metrics in different formats. If you need to use multiple templates, you'll need to define a prefix filter that must match before the template can be applied.

Filters

Filters have a similar format to templates but work more like wildcard expressions. When multiple filters would match a metric, the more specific one is chosen. Filters are configured by adding them before the template.

For example,

servers.localhost.cpu.loadavg.10
servers.host123.elasticsearch.cache_hits 100
servers.host456.mysql.tx_count 10

servers.* would match all three metrics
servers.*.mysql would match servers.host456.mysql.tx_count 10
servers.localhost.* would match servers.localhost.cpu.loadavg.10
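One possible reading of these matching rules, sketched in Go: a filter matches as a prefix, section by section, with * standing for any single section. The description does not define "more specific", so this sketch assumes it means the filter with the most literal (non-wildcard) sections. The names matches and bestFilter are hypothetical, and this is a simple linear scan, not the search tree the PR's commits describe.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether filter matches the metric name as a prefix,
// section by section, and returns the number of literal (non-wildcard)
// sections in the filter.
func matches(filter, metric string) (bool, int) {
	fParts := strings.Split(filter, ".")
	mParts := strings.Split(metric, ".")
	if len(fParts) > len(mParts) {
		return false, 0
	}
	literals := 0
	for i, f := range fParts {
		if f == "*" {
			continue // wildcard matches any single section
		}
		if f != mParts[i] {
			return false, 0
		}
		literals++
	}
	return true, literals
}

// bestFilter returns the matching filter with the most literal sections
// (ties go to the earlier filter), or "" if no filter matches.
func bestFilter(filters []string, metric string) string {
	best, bestScore := "", -1
	for _, f := range filters {
		if ok, score := matches(f, metric); ok && score > bestScore {
			best, bestScore = f, score
		}
	}
	return best
}

func main() {
	filters := []string{"servers.*", "servers.*.mysql", "servers.localhost.*"}
	for _, metric := range []string{
		"servers.localhost.cpu.loadavg.10",
		"servers.host123.elasticsearch.cache_hits",
		"servers.host456.mysql.tx_count",
	} {
		fmt.Printf("%s -> %s\n", metric, bestFilter(filters, metric))
	}
}
```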

Default Templates

If no template filters are defined or you want to just have one basic template, you can define a default template. This template will apply to any metric that has not already matched a filter.

dev.http.requests.200
prod.myapp.errors.count
dev.db.queries.count

env.app.measurement* would create
- measurement=requests.200 tags=env=dev,app=http
- measurement=errors.count tags=env=prod,app=myapp
- measurement=queries.count tags=env=dev,app=db

Global Tags

If you need to add the same set of tags to all metrics, you can define them globally at the plugin level and not within each template description.

Minimal Config

[[graphite]]
  enabled = true
  # bind-address = ":2003"
  # protocol = "tcp"
  # consistency-level = "one"

  ### If matching multiple measurement fields, this string will be used to join the matched values.
  # separator = "."

  ### Default tags that will be added to all metrics.  These can be overridden at the template level
  ### or by tags extracted from the metric.
  # tags = ["region=us-east", "zone=1c"]

  ### Each template line requires a template pattern.  It can have an optional
  ### filter before the template, separated by spaces.  It can also have optional extra
  ### tags following the template.  Multiple tags should be separated by commas with no spaces,
  ### similar to the line protocol format.  There can be only one default template.
  # templates = [
  #   "*.app env.service.resource.measurement",
  #   # Default template
  #   "server.*",
  # ]

Customized Config

[[graphite]]
  enabled = true
  separator = "_"
  tags = ["region=us-east", "zone=1c"]
  templates = [
    # filter + template
    "*.app env.service.resource.measurement",

    # filter + template + extra tag
    "stats.* .host.measurement* region=us-west,agent=sensu",

    # default template. Ignore the first graphite component "servers"
    ".measurement*",
  ]
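Each entry in templates packs up to three whitespace-separated parts into one string: an optional filter, the template, and optional extra tags. The Go sketch below shows one way such a line could be split with strings.Fields. The type templateLine and the function splitTemplateLine are hypothetical, and telling a tag list apart from a template by the presence of "=" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// templateLine holds the parts of one templates entry (hypothetical type).
type templateLine struct {
	Filter   string
	Template string
	Tags     string
}

// splitTemplateLine splits "filter template tags" on whitespace.
// The filter and the tags are optional; the template is required.
func splitTemplateLine(line string) (templateLine, error) {
	parts := strings.Fields(line)
	switch len(parts) {
	case 1:
		return templateLine{Template: parts[0]}, nil
	case 2:
		// Assumption: a second field containing "=" is a tag list;
		// otherwise the first field is a filter.
		if strings.Contains(parts[1], "=") {
			return templateLine{Template: parts[0], Tags: parts[1]}, nil
		}
		return templateLine{Filter: parts[0], Template: parts[1]}, nil
	case 3:
		return templateLine{Filter: parts[0], Template: parts[1], Tags: parts[2]}, nil
	default:
		return templateLine{}, fmt.Errorf("invalid template line %q: expected 1 to 3 fields, got %d", line, len(parts))
	}
}

func main() {
	for _, line := range []string{
		"*.app env.service.resource.measurement",
		"stats.* .host.measurement* region=us-west,agent=sensu",
		".measurement*",
	} {
		tl, err := splitTemplateLine(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("filter=%q template=%q tags=%q\n", tl.Filter, tl.Template, tl.Tags)
	}
}
```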

beckettsean · 2015-06-24T21:06:53Z

@jwilder I edited the description directly to fix some typos. It all makes sense to me as a naive graphite user.

otoolep · 2015-06-25T01:23:53Z

services/graphite/config.go

+	hasMeasurement := false
+	for _, p := range strings.Split(template, ".") {
+		if p == "measurement" || p == "measurement*" {
+			hasMeasurement = true


Since it looks like you're looping here, can hasMeasurement go true, then false? If not, then why not just return the moment you find a measurement? Am I missing something?

Ah, no, because you never set it back to false again. OK. But could you just return once you find the first one?

otoolep · 2015-06-25T02:03:23Z

Looks like a thorough job. I paid particular attention to the test cases, as a way of validating the code. Looks good. I did have some nitpicks about spelling, but since much of this is user-facing documentation, I know you'll want it to be right.

Had a couple of minor logic questions, but I'm cool with this going in once those questions are answered. +1

otoolep · 2015-06-25T02:04:30Z

Also, I wonder if there is any point in an integration test in server_test.go? Even a simple one to ensure the input can come up?

Provides a little more flexibility in controlling the parsed metric names for metrics like servers.localhost.cpu.cpu0.user. Previously, you could only use a single field like "cpu" or "user", or a wildcard to match "cpu.cpu0.user". You can now pull out "cpu" and "user" and join them together in the metric name using a custom separator character. By default this is ".".

haf · 2015-06-26T16:12:38Z

services/graphite/README.md

+```
+
+* `env.app.measurement*` would create
+  * _measurement_=`requests.200` _tags_=`env=dev,app=http`


Are tags comma- or space-separated? (see "Adding Tags" above)

jwilder · 2015-06-26

When used in the template, they should be comma-separated. The tags should be separated from the template with a space, though.

jwilder added the 2 - Working label Jun 24, 2015

jwilder mentioned this pull request Jun 24, 2015

Graphite name parser #2572

Closed

otoolep reviewed Jun 25, 2015

cannium added 2 commits June 24, 2015 23:09

Add fields to config metric name schema of graphite

c130efb

Fix unit tests for graphite

2a383e6

jwilder added 20 commits June 24, 2015 23:09

Add basic template filtering support

cab9e36

This filter implementation is fairly naive and won't scale well to large numbers of templates and filters. It will be replaced with a trie-based approach in the future.

Add support for global tags

9cd82ae

These are tags that can be added to all metrics.

Add support for per-template default tags

b55981f

These are tags that can be added at the template level. They will override any global tags, and any tags with the same name parsed from the metric will override these.

Add validation for graphite config templates and tags

fed8d67

Use strings.Fields to bef more forgiving of whitespace

dd0e6e5

Add sample graphite config to default config

ea348dd

Fix validation failing when using a default template

1ecf9b5

Add graphite plugin readme

b294930

Update tempalte format comment

98cbfdc

Use search tree for filter matching

a2a1956

This adds a sorted search tree for matching filters to a template more efficiently. Each filter is split on "." and each element is added to the tree. Patterns with matching prefixes are added under the same subtree.

Prevent duplicate filters in config

613b1d2

A filter should map directly to one template; allowing duplicate filters is not supported.

Add test for matching similar patterns

9ed71ad

Add graphite parser benchmark

a76e812

Add comments to graphite parser

ba7187f

Fix default template being returned when partially matching

320a951

another filter.

Update changelog

b0cda03

Fixes #2102 #2966

Code review fixes

fbfb90d

Use raw metric name when default template fails to match

c5a10cf

Handle timestamp special cases

562d7cd

If no timestamp is sent or the value -1 is sent, the current UTC time is used.

jwilder force-pushed the jw-graphite branch from e8a623c to 562d7cd Compare June 25, 2015 05:53

jwilder added a commit that referenced this pull request Jun 25, 2015

Merge pull request #3125 from influxdb/jw-graphite

357b658

Graphite Input Protocol Parsing

jwilder merged commit 357b658 into master Jun 25, 2015

jwilder removed the 2 - Working label Jun 25, 2015

jwilder deleted the jw-graphite branch June 25, 2015 06:13

jwilder mentioned this pull request Jun 25, 2015

Graphite Input Parsing Proposal #2996

Closed

haf reviewed Jun 26, 2015

bmhatfield mentioned this pull request Mar 8, 2016

Update influxdbHandler.py python-diamond/Diamond#420

Closed

icinga-migration mentioned this pull request Jan 17, 2017

[dev.icinga.com #10480] Add InfluxDbWriter feature Icinga/icinga2#3562

Closed
