Implementing pre and post processors
Thibaut Barrère edited this page Apr 16, 2017 · 2 revisions
Pre-processors and post-processors are currently declared as blocks, each of which gets called only once per ETL run:
- Pre-processors get called before the ETL starts reading rows from the sources.
- Post-processors get invoked after the ETL has successfully processed all the rows.
Note that post-processors won't get called if an error occurred earlier in the run.
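The ordering described above can be pictured with a small plain-Ruby model. This is only a sketch for illustration: `run_job` and `demo_run` are hypothetical names, and the code is not Kiba's actual implementation.

```ruby
# Simplified, hypothetical model of an ETL run (NOT Kiba's own code).
# It only illustrates when pre- and post-processors are called.
def run_job(rows:, pre_processors:, transform:, post_processors:)
  # 1. Each pre-processor runs once, before any row is read.
  pre_processors.each(&:call)

  # 2. Rows are processed one by one. An exception raised here
  #    propagates to the caller, so step 3 is never reached.
  rows.each { |row| transform.call(row) }

  # 3. Each post-processor runs once, after all rows were processed.
  post_processors.each(&:call)
end

# Records the order of events for a given list of rows.
def demo_run(rows)
  events = []
  transform = lambda do |row|
    raise 'bad row' if row == :bad
    events << :row
  end
  run_job(
    rows: rows,
    pre_processors: [-> { events << :pre }],
    transform: transform,
    post_processors: [-> { events << :post }]
  )
  events
rescue RuntimeError
  # The run failed: note it, and observe that :post was never recorded.
  events << :failed
end
```

In this model, `demo_run([1, 2])` records the pre-processor, both rows, then the post-processor, whereas `demo_run([1, :bad, 2])` stops at the failing row and never records `:post`.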
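# Example job: build a sample file before the run, report once it succeeds.
count = 0

# Like Kernel#system, but fails the run when the command does not succeed.
def system!(cmd)
  fail "Command #{cmd} failed" unless system(cmd)
end

file = 'my_file.csv'
sample_file = 'my_file.sample.csv'

pre_process do
  # It's handy to work with a reduced data set: here we keep only
  # the headers (line 1) plus a single data line of the CSV file.
  system! "sed -n \"1p;25706p\" #{file} > #{sample_file}"
end

source MyCsv, file: sample_file

transform do |row|
  count += 1
  row
end

post_process do
  # Email and supervisor_address are not defined here; substitute your own notifier.
  Email.send(supervisor_address, "#{count} rows successfully processed")
end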
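If `sed` is not available (for instance on Windows), the sampling step could be written in plain Ruby instead. The helper below is a minimal sketch: `write_sample` is a hypothetical name, not part of Kiba, and it assumes 1-based line numbers like the `sed` command it stands in for.

```ruby
# Hypothetical helper (not part of Kiba): copy only the given 1-based
# line numbers from `source` to `target`, e.g. the headers plus one row.
def write_sample(source, target, line_numbers)
  File.open(target, 'w') do |out|
    # Stream the file line by line so large inputs are never fully loaded.
    File.foreach(source).with_index(1) do |line, number|
      out.write(line) if line_numbers.include?(number)
    end
  end
end
```

Calling `write_sample(file, sample_file, [1, 25_706])` from the `pre_process` block should then produce the same sample file as the `sed` invocation, assuming the helper is visible from that block in the same way `system!` is.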