This repository aims to aggregate Airflow plugins developed for specific ETL scenarios in the company under the `plugins`
folder. So far, only `event_plugins`
(Kafka-based, with some related Kafka operators) is available. Check Event Plugins for more design details.
In short, a task in a DAG can be triggered by multiple Kafka messages, and the status of each event is shown in the UI. Check example DAG
Default Airflow plugins live in the
`$AIRFLOW_HOME/plugins`
folder (the path is configured in `airflow.cfg`)
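The plugins path is controlled by the `plugins_folder` option in `airflow.cfg` (the path below is illustrative):

```ini
[core]
# Where Airflow looks for custom plugins
plugins_folder = /path/to/airflow/plugins
```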
- copy the needed plugins from the `plugins` folder to the
`$AIRFLOW_HOME/plugins`
folder, e.g.,
`cp -r event_plugins $AIRFLOW_HOME/plugins`
- use them within DAGs
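A minimal sketch of using a copied plugin inside a DAG file. The module path and operator parameters below are assumptions for illustration, not the plugin's confirmed API; check the Event Plugins docs for the real configuration options.

```python
# example_dag.py -- illustrative sketch; import path and parameters are assumptions
from datetime import datetime

from airflow import DAG
from event_plugins.kafka.kafka_consumer_operator import KafkaConsumerOperator  # assumed path

with DAG(
    dag_id="kafka_event_example",
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
) as dag:
    # Sensor-like operator that waits for the configured Kafka events
    wait_for_events = KafkaConsumerOperator(
        task_id="wait_for_kafka_events",
        # ...event/topic configuration goes here (see Event Plugins docs)...
    )
```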
If you use Airflow, you might have a repository that aggregates all the plugins developed by your team. It's recommended to use different folders to store different types of plugins
Only supports Python 2
- Event Plugins
- KafkaConsumerOperator: works as an
Airflow Sensor
that can define multiple events to trigger downstream jobs. It uses dummy tasks to show the status of each event.
- kafka_consumer
- KafkaConsumerOperator
- send an email with the status of the Kafka consumer operator
- kafka_producer
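The core idea, that a downstream task fires only after all the expected Kafka messages have arrived, can be sketched in plain Python. This is an illustrative reimplementation of the matching logic, not the plugin's actual code:

```python
class EventStatusTracker(object):
    """Track which of the expected Kafka events have been received.

    Illustrative sketch only; the real plugin tracks status differently.
    """

    def __init__(self, expected_events):
        # expected_events: iterable of event identifiers, e.g. "topic:key"
        self.expected = set(expected_events)
        self.received = set()

    def on_message(self, event):
        """Record an incoming event; ignore events we are not waiting for."""
        if event in self.expected:
            self.received.add(event)

    def status(self):
        """Per-event status, similar to what the dummy tasks show in the UI."""
        return {e: ("received" if e in self.received else "waiting")
                for e in self.expected}

    def all_received(self):
        """True once every expected event has arrived -> trigger downstream jobs."""
        return self.received == self.expected


tracker = EventStatusTracker(["topic_a:job_done", "topic_b:file_ready"])
tracker.on_message("topic_a:job_done")
assert not tracker.all_received()          # still waiting on topic_b
tracker.on_message("topic_b:file_ready")
assert tracker.all_received()              # all events arrived
```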
apache-airflow
croniter
python-dateutil
sqlalchemy
confluent-kafka
jinja2
pytest
mock
pytest-mock
pytest-cov
./run_test.sh
- Additional test arguments can be passed:
# show detail
./run_test.sh -vvv
# show coverage in console
./run_test.sh --cov-config=.coveragerc --cov=./
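Tests for the Kafka operators typically stub out the broker rather than talk to a real one. A hedged sketch using `mock` (the `consume_events` helper and message shapes are invented for illustration; they mirror `confluent-kafka`'s poll-returns-message-or-None style but are not the plugin's API):

```python
try:
    from unittest import mock  # Python 3
except ImportError:
    import mock  # Python 2 (pip install mock)


def consume_events(consumer, expected_count):
    """Poll the consumer until the expected number of messages arrive.

    Hypothetical helper for illustration only.
    """
    messages = []
    while len(messages) < expected_count:
        msg = consumer.poll(timeout=1.0)
        if msg is not None:
            messages.append(msg.value())
    return messages


def test_consume_events_collects_all_messages():
    consumer = mock.Mock()
    msg = mock.Mock()
    msg.value.return_value = "hello"
    # First poll returns nothing; the next two return messages
    consumer.poll.side_effect = [None, msg, msg]
    assert consume_events(consumer, 2) == ["hello", "hello"]


test_consume_events_collects_all_messages()
```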
- Maybe switch from pytest to unittest to follow the testing framework used in Airflow
- Add more unit tests for error handling and some operators
- Add integration tests with DAG
- Support Python 3
- Event plugins with other source types
- Event plugins with multiple source types