
Implementing pre and post processors

Thibaut Barrère edited this page Feb 10, 2020 · 2 revisions

Pre-processors and post-processors are code blocks which get called only once per ETL run:

  • Pre-processors get called before the ETL starts reading rows from the sources.
  • Post-processors get invoked after the ETL has successfully processed all the rows.

⚠️ Post-processors won't get called if an error occurs before they are reached.
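
The call order described above can be modelled in a few lines of plain Ruby. This is only an illustrative sketch and not Kiba's implementation: `run_etl` and its keyword arguments are hypothetical names, and the `nil` row simply simulates a failure during processing.

```ruby
# Plain-Ruby model of the call order described above.
# NOT Kiba's implementation: run_etl and its arguments are hypothetical.
def run_etl(rows:, pre:, post:, log:)
  pre.call                                      # once, before any row is read
  rows.each do |row|
    raise ArgumentError, "bad row" if row.nil?  # simulate a processing failure
    log << "row #{row}"
  end
  post.call                                     # once, only if every row succeeded
end

# Successful run: both the pre-processor and the post-processor fire.
ok_log = []
run_etl(rows: [1, 2],
        pre: -> { ok_log << "pre" },
        post: -> { ok_log << "post" },
        log: ok_log)

# Failing run: the error is raised before the post-processor is reached.
failed_log = []
begin
  run_etl(rows: [1, nil],
          pre: -> { failed_log << "pre" },
          post: -> { failed_log << "post" },
          log: failed_log)
rescue ArgumentError
  failed_log << "error"
end
```

In this model the failing run records the pre-processor and the first row but never the post-processor, so anything that must run even when a job fails needs to live outside a post-processor.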

These blocks can be used for a variety of tasks.

For instance, one can handle a form of reporting this way:

count = 0 # captured by the blocks below
job = Kiba.parse do
  source MySource, source_config
  transform do |row|
    count += 1 # tally each row read from the source
    row
  end
  # SNIP

  post_process do
    Email.send(supervisor_address, "#{count} rows read from source")
  end
end

⚠️ This is an example of why you do not want to re-use the job instance between calls to Kiba.run. Here the count variable would not be reset!
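
The pitfall comes from ordinary Ruby closure behaviour, so it can be reproduced without Kiba. The sketch below is a plain-Ruby stand-in, not Kiba's API: `build_job` and `run` are hypothetical helpers, and the hash of lambdas only mimics blocks that close over a local variable. One way to avoid stale state, assuming your job definition can be wrapped in a method, is to build a fresh job (and a fresh counter) for every run.

```ruby
# Plain-Ruby stand-in for a job definition. NOT Kiba's API: just a pair of
# lambdas closing over a local variable, as the blocks in the example do.
def build_job
  count = 0 # a fresh counter for each job that gets built
  {
    transform: ->(row) { count += 1; row },
    post_process: -> { "#{count} rows read from source" }
  }
end

# Hypothetical runner: feeds rows through the transform,
# then calls the post-processor and returns its report.
def run(job, rows)
  rows.each { |row| job[:transform].call(row) }
  job[:post_process].call
end

rows = [{ id: 1 }, { id: 2 }, { id: 3 }]

# Re-using one job instance: the counter carries over between runs.
shared = build_job
reused_first  = run(shared, rows) # "3 rows read from source"
reused_second = run(shared, rows) # "6 rows read from source"

# Building a new job per run: the counter starts from zero each time.
fresh_first  = run(build_job, rows) # "3 rows read from source"
fresh_second = run(build_job, rows) # "3 rows read from source"
```

The same idea applies to a real job: keeping the `count = 0` initialisation and the `Kiba.parse` call inside one method means each call to `Kiba.run` receives a job with its own state.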

Next: Implementation Guidelines