
Sawmill Pipeline

Joshua Schnitzer edited this page Apr 12, 2018 · 14 revisions

Sawmill Pipelines

Pipelines DSL

A Sawmill pipeline can be written in either JSON or HOCON. Most examples in this wiki use HOCON. More info on HOCON here
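
Because HOCON is a superset of JSON, a pipeline written in JSON is also valid HOCON. As a rough illustration, here is the same single-step pipeline in both formats. The addField processor is the one used in the Templates section; the field name "source" and the value "wiki" are made up for this example.

HOCON:

{
  steps: [
    {
      # "source" and "wiki" are illustrative values, not part of Sawmill
      addField: {
        config: {
          path: "source"
          value: "wiki"
        }
      }
    }
  ]
}

JSON (no comments allowed, keys quoted, commas required):

{
  "steps": [
    {
      "addField": {
        "config": {
          "path": "source",
          "value": "wiki"
        }
      }
    }
  ]
}

HOCON allows unquoted keys and lets a newline stand in for a comma, which keeps the examples shorter.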

Pipeline Structure

A Sawmill pipeline is a list of steps that are applied to a document. The steps execute in the order they are listed. Each step is either a processor or an if statement. The list of steps can be followed by the stopOnFailure setting.

{
  steps: [
    {<processor or statement>}
    ...
    {<processor or statement>}
  ]
}
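
As a sketch of a complete pipeline, the following runs two addField steps in order. The field names and values are illustrative, and placing stopOnFailure as a boolean next to steps is an assumption about where that setting goes, so check it against your Sawmill version.

{
  steps: [
    {
      # Step 1: runs first ("environment" is a hypothetical field)
      addField: {
        config: {
          path: "environment"
          value: "production"
        }
      }
    },
    {
      # Step 2: runs after step 1 ("team" is a hypothetical field)
      addField: {
        config: {
          path: "team"
          value: "platform"
        }
      }
    }
  ]
  # Assumed placement: a pipeline-level flag alongside steps
  stopOnFailure: false
}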

Templates

Templates let you insert data from other fields into a new field's name or value. Sawmill uses Mustache for templating.

  • You can reference the value of another field using Mustache syntax, e.g. {{field_name}}
  • You can reference an element of an array field using {{field_name.index}}, e.g. {{field_name.0}}, {{field_name.first}}, or {{field_name.last}}
  • The date template can be used to insert the current date in a desired format

The following example adds a field called "timestamp" built from the existing values of the "date" and "time" fields, followed by the current year:

{
  addField: {
    config: {
      path: "timestamp"
      value: "{{date}} {{time}} {{#dateTemplate}}yyyy{{/dateTemplate}}"
    }
  }
}
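
Array elements and templated field names can be used in the same way. A minimal sketch, assuming the document already has a hypothetical array field "tags" and a hypothetical string field "service":

{
  steps: [
    {
      # Builds a value from the first and last elements of the hypothetical "tags" array
      addField: {
        config: {
          path: "tag_range"
          value: "{{tags.first}} to {{tags.last}}"
        }
      }
    },
    {
      # Uses a template in the field name; "service" is a hypothetical field
      addField: {
        config: {
          path: "{{service}}_seen"
          value: "true"
        }
      }
    }
  ]
}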

If Statement

You can add conditions to pipelines. An if statement has the following structure:

{
  if: {
    condition: {
      <condition or operator>
    },
    then: [{<processor or statement>}, {<processor or statement>}]
    else: [{<processor or statement>}, {<processor or statement>}]
  }
}

Here is an example of a pipeline step that drops any document whose "message" field starts with '#':

{
  if: {
    condition: {
      matchRegex: {
        field: "message"
        regex: "^#"
        matchPartOfValue: "true"
      }
    },
    then: [{
      drop: { config: {} }
    }]
  }
}
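
An else branch can be added in the same way. The sketch below reuses the matchRegex condition and the drop processor from the example above and, for documents that are kept, marks them with addField. The "is_comment" field name is made up for illustration.

{
  if: {
    condition: {
      matchRegex: {
        field: "message"
        regex: "^#"
        matchPartOfValue: "true"
      }
    },
    then: [{
      # Documents whose message starts with '#' are dropped
      drop: { config: {} }
    }]
    else: [{
      # All other documents are kept and marked; "is_comment" is a hypothetical field
      addField: {
        config: {
          path: "is_comment"
          value: "false"
        }
      }
    }]
  }
}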