Implementing pre and post processors

Pre-processors and post-processors are code blocks which get called only once per ETL run:

Pre-processors get called before the ETL starts reading rows from the sources.
Post-processors get invoked after the ETL has successfully processed all the rows.

⚠️ Post-processors won't get called if an error occurs earlier in the run.
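To make the ordering concrete, here is a minimal sketch in plain Ruby. It is a toy stand-in and not Kiba's actual implementation: run_toy_etl and its keyword arguments are hypothetical names invented for illustration. It only mirrors the behaviour described above: the pre block runs once before any row is read, and the post block is skipped when an error is raised along the way.

```ruby
# Toy stand-in for an ETL runner; NOT Kiba's actual implementation.
# It only mirrors the ordering described in the text.
def run_toy_etl(rows:, pre:, post:, transform:)
  pre.call                                  # runs once, before any row is read
  rows.each { |row| transform.call(row) }   # an error here aborts the run
  post.call                                 # reached only if nothing raised above
end

# Successful run: both blocks fire exactly once, around the rows.
log = []
run_toy_etl(
  rows: [1, 2, 3],
  pre: -> { log << :pre },
  post: -> { log << :post },
  transform: ->(row) { log << row }
)

# Failing run: the error propagates to the caller, so the post block never fires.
failed_log = []
begin
  run_toy_etl(
    rows: [1, 2, 3],
    pre: -> { failed_log << :pre },
    post: -> { failed_log << :post },
    transform: lambda do |row|
      raise ArgumentError, "bad row" if row == 2
      failed_log << row
    end
  )
rescue ArgumentError
  failed_log << :error
end
```

In this sketch, log ends up containing both :pre and :post, while failed_log contains :pre and :error but never :post.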

These blocks can be used for a variety of tasks.

For instance, one can handle a form of reporting this way:

count = 0
job = Kiba.parse do
  source MySource, source_config
  transform do |row|
    count += 1
    row
  end
  # SNIP

  post_process do
    Email.send(supervisor_address, "#{count} rows read from source")
  end
end

⚠️ This is an example of why you do not want to re-use the job instance between calls to Kiba.run. Here the count variable would not be reset, so a second run would report a cumulative total instead of its own row count!
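The underlying issue is ordinary Ruby closure behaviour, which the following plain-Ruby sketch illustrates. The lambdas are hypothetical stand-ins for a job (no Kiba calls are made), and build_job is an illustrative name. With Kiba, the analogous approach would presumably be to call Kiba.parse inside a method, so that each Kiba.run receives a freshly parsed job with its own counter.

```ruby
# Hypothetical stand-in for a job: a lambda that closes over `count`.
# Because it is defined once, every run shares the same counter.
count = 0
shared_job = lambda do |rows|
  rows.each { count += 1 }
  count
end

first  = shared_job.call([:a, :b])  # counter starts at 0
second = shared_job.call([:a, :b])  # counter carries over from the first run

# Building the job inside a method gives each run its own fresh counter.
def build_job
  count = 0
  lambda do |rows|
    rows.each { count += 1 }
    count
  end
end

fresh_first  = build_job.call([:a, :b])
fresh_second = build_job.call([:a, :b])
```

Here the shared job reports a growing total on its second call, while each freshly built job starts counting from zero.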

Next: Implementation Guidelines
