fluent-plugin-rewrite-tag-filter

Overview

Rewrite Tag Filter for Fluentd. It is designed to rewrite tags in the manner of Apache's mod_rewrite.
It re-emits a record with a rewritten tag when a value matches, or does not match, a regular expression.
For example, you can retag Apache log records by domain, status code (e.g. 500 errors),
user agent, request URI, regex backreference, and so on.

Installation

Install with the gem or fluent-gem command:

# for fluentd
$ gem install fluent-plugin-rewrite-tag-filter

# for td-agent
$ sudo /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-rewrite-tag-filter

Configuration

Syntax

rewriterule<num> <attribute> <regex_pattern> <new_tag>

# Optional: capitalize the first letter of every matched regex backreference. (ex: maps -> Maps)
# For more details, see Usage.
capitalize_regex_backreference <yes/no> (default no)

# Optional: remove a prefix from the tag used in tag placeholders. (see the "Tag placeholder" section)
remove_tag_prefix <string>

# Optional: override the hostname command used for the hostname placeholder. (see the "Tag placeholder" section)
hostname_command <string>
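
As a rough illustration of how a rule line breaks down, the following Ruby sketch parses the value part of a rewriterule. It is not the plugin's parser: `parse_rule` is a hypothetical helper, and it assumes the pattern contains no whitespace and that a leading `!` (as used in the Usage sample) means "does not match".

```ruby
# Hypothetical helper, for illustration only (not the plugin's API).
# Parses "<attribute> <regex_pattern> <new_tag>" into its components.
def parse_rule(value)
  key, pattern, new_tag = value.strip.split(/\s+/, 3)
  raise ArgumentError, "expected: <attribute> <regex_pattern> <new_tag>" if new_tag.nil?

  # Assumption: a leading "!" negates the pattern.
  negate = pattern.start_with?("!")
  pattern = pattern[1..-1] if negate

  { key: key, regexp: Regexp.new(pattern), negate: negate, tag: new_tag }
end

rule = parse_rule('status !^200$ clear')
puts rule[:key]     # status
puts rule[:negate]  # true
puts rule[:tag]     # clear
```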

Usage

This sample discards logs for some static files, then splits the tag by domain.

<source>
  type tail
  path /var/log/httpd/access_log
  format apache2
  time_format %d/%b/%Y:%H:%M:%S %z
  tag td.apache.access
  pos_file /var/log/td-agent/apache_access.pos
</source>

# "capitalize_regex_backreference yes" converts the first letter of every matched backreference to upper case. ex: maps -> Maps
# rewriterule2 sends records whose status code is not 200 to the tag "clear".
# rewriterule3 sends records whose domain does not end with ".com" to the tag "clear".
# At rewriterule6, "site.$2$1" becomes "site.ExampleMail" because of the capitalize_regex_backreference option.
<match td.apache.access>
  type rewrite_tag_filter
  capitalize_regex_backreference yes
  rewriterule1 path   \.(gif|jpe?g|png|pdf|zip)$  clear
  rewriterule2 status !^200$                      clear
  rewriterule3 domain !^.+\.com$                  clear
  rewriterule4 domain ^maps\.example\.com$        site.ExampleMaps
  rewriterule5 domain ^news\.example\.com$        site.ExampleNews
  rewriterule6 domain ^(mail)\.(example)\.com$    site.$2$1
  rewriterule7 domain .+                          site.unmatched
</match>

<match site.*>
  type mongo
  host localhost
  database apache_access
  remove_tag_prefix site
  tag_mapped
  capped
  capped_size 100m
</match>

<match clear>
  type null
</match>

Result

$ mongo
MongoDB shell version: 2.2.0
> use apache_access
switched to db apache_access
> show collections
ExampleMaps
ExampleNews
ExampleMail
unmatched
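
The way the sample rules route a record can be sketched in plain Ruby. This is a simplified illustration, not the plugin's source: the `RULES` table and the `rewrite_tag` helper are hypothetical names, and the sketch assumes that rules are tried in numeric order with the first applicable rule winning, that a leading `!` negates the pattern, and that a missing key is treated as an empty string.

```ruby
# Hypothetical rule table mirroring rewriterule1..7 from the sample above.
# Each entry: [record key, pattern, negated?, new tag template]
RULES = [
  ["path",   /\.(gif|jpe?g|png|pdf|zip)$/, false, "clear"],
  ["status", /^200$/,                      true,  "clear"],
  ["domain", /^.+\.com$/,                  true,  "clear"],
  ["domain", /^maps\.example\.com$/,       false, "site.ExampleMaps"],
  ["domain", /^news\.example\.com$/,       false, "site.ExampleNews"],
  ["domain", /^(mail)\.(example)\.com$/,   false, "site.$2$1"],
  ["domain", /.+/,                         false, "site.unmatched"],
].freeze

# Returns the rewritten tag for a record, or nil when no rule applies.
# Assumption: the first applicable rule wins.
def rewrite_tag(record, capitalize: true)
  RULES.each do |key, pattern, negate, template|
    match = pattern.match(record[key].to_s)
    if negate
      return template if match.nil?
      next
    end
    next if match.nil?

    # Replace $1, $2, ... with the captured groups, optionally capitalized.
    new_tag = template.gsub(/\$(\d+)/) do
      group = match[Regexp.last_match(1).to_i].to_s
      capitalize ? group.capitalize : group
    end
    return new_tag
  end
  nil
end

record = { "path" => "/inbox", "status" => "200", "domain" => "mail.example.com" }
puts rewrite_tag(record)  # site.ExampleMail
```

With `capitalize: false` the same record maps to `site.examplemail`, which is the difference the `capitalize_regex_backreference` option makes in this sketch.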

Debug

When td-agent starts, the plugin logs each registered rule as shown below.

$ tailf /var/log/td-agent/td-agent.log
2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]

Tag placeholder

The following placeholders are supported in new_tag (the rewritten tag).

  • ${tag}
  • __TAG__
  • ${tag_parts[n]}
  • __TAG_PARTS[n]__
  • ${hostname}
  • __HOSTNAME__

The placeholders ${tag_parts[n]} and __TAG_PARTS[n]__ access the n-th element of the tag split on "." (dot).
For example, with the tag td.apache.access, ${tag_parts[0]} gives td and ${tag_parts[1]} gives apache.

Note: Currently, the range expression ${tag_parts[0..2]} is not supported.
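
The lookup behaves like indexing into the tag after splitting it on dots. A minimal Ruby sketch of that idea follows; `expand_tag_parts` is a hypothetical helper rather than the plugin's API, and returning an empty string for an out-of-range index is an assumption of this sketch.

```ruby
# Hypothetical helper, for illustration only (not the plugin's API).
# Expands ${tag_parts[n]} and __TAG_PARTS[n]__ in a tag template.
def expand_tag_parts(template, tag)
  parts = tag.split(".")
  template.gsub(/\$\{tag_parts\[(\d+)\]\}|__TAG_PARTS\[(\d+)\]__/) do
    index = (Regexp.last_match(1) || Regexp.last_match(2)).to_i
    parts[index].to_s # assumption: an out-of-range index becomes ""
  end
end

puts expand_tag_parts('${tag_parts[0]}.__TAG_PARTS[1]__', "td.apache.access")  # td.apache
```

Because the pattern only accepts a single integer index, a range such as `${tag_parts[0..2]}` is left untouched by this sketch, in line with the note above.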

Placeholder Option

  • remove_tag_prefix

This option removes the given prefix from the tag before it is substituted for ${tag} or __TAG__.

  • hostname_command

By default, the plugin runs the hostname command to get the full hostname.
If needed, you can override this command with the hostname_command option.
For example, specifying hostname_command hostname -s yields the short hostname.
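
The two options can be pictured with a short Ruby sketch. The helper names (`strip_tag_prefix`, `resolve_hostname`, `expand_placeholders`) are hypothetical and do not come from the plugin. The sketch assumes the prefix is removed together with the dot that follows it, and that the hostname is whatever the configured shell command prints.

```ruby
# Hypothetical helpers, for illustration only (not the plugin's API).

# Strips the configured prefix, and the dot after it, from the start of the tag.
def strip_tag_prefix(tag, prefix)
  return tag if prefix.nil? || prefix.empty?
  tag.sub(/\A#{Regexp.escape(prefix)}\.?/, "")
end

# Runs the given shell command and returns its output without the trailing newline.
def resolve_hostname(command = "hostname")
  `#{command}`.chomp
end

# Expands ${tag}/__TAG__ and ${hostname}/__HOSTNAME__ in a tag template.
def expand_placeholders(template, tag, prefix: nil, hostname_command: "hostname")
  short_tag = strip_tag_prefix(tag, prefix)
  template
    .gsub(/\$\{tag\}|__TAG__/) { short_tag }
    .gsub(/\$\{hostname\}|__HOSTNAME__/) { resolve_hostname(hostname_command) }
end

puts expand_placeholders('rewrited.${tag}', "apache.access", prefix: "apache")  # rewrited.access
```

Passing `hostname_command: "hostname -s"` to `expand_placeholders` would substitute the short hostname instead of the full one.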

Placeholder Usage

These samples rewrite a tag using placeholders.

# Yields "rewrited.access.ExampleMail"
# (capitalize_regex_backreference yes is needed for "ExampleMail"; the default gives "examplemail")
<match apache.access>
  type rewrite_tag_filter
  capitalize_regex_backreference yes
  rewriterule1  domain  ^(mail)\.(example)\.com$  rewrited.${tag}.$2$1
  remove_tag_prefix apache
</match>

# Yields "rewrited.ExampleMail.app30-124.foo.com" when the hostname is "app30-124.foo.com"
<match apache.access>
  type rewrite_tag_filter
  capitalize_regex_backreference yes
  rewriterule1  domain  ^(mail)\.(example)\.com$  rewrited.$2$1.${hostname}
</match>

# Yields "rewrited.ExampleMail.app30-124" when the hostname is "app30-124.foo.com"
<match apache.access>
  type rewrite_tag_filter
  capitalize_regex_backreference yes
  rewriterule1  domain  ^(mail)\.(example)\.com$  rewrited.$2$1.${hostname}
  hostname_command hostname -s
</match>

# Yields "rewrited.game.pool"
<match app.game.pool.activity>
  type rewrite_tag_filter
  rewriterule1  domain  ^.+$  rewrited.${tag_parts[1]}.${tag_parts[2]}
</match>

Example

Related Articles

TODO

Pull requests are very welcome!!

Copyright

Copyright : Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
License : Apache License, Version 2.0

About

Fluentd output filter plugin that rewrites the tags of records matching a specified attribute.
