Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies.
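To make the dependency idea concrete, here is a minimal sketch in plain Python with no Airflow involved. The task names are hypothetical, and the standard-library `graphlib` module only illustrates how a DAG constrains execution order; it is not how the Airflow scheduler is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

# static_order() yields a task only after all of its dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Because the example graph is a single chain, only one valid order exists. With branching dependencies, several orders may be valid, and independent tasks could run in parallel on different workers.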
References
- Product - https://airflow.apache.org/
- Documentation - https://airflow.apache.org/docs/
- GitHub - https://github.com/apache/airflow
Prerequisites: Allocate at least 4GB of memory to the Docker Engine (ideally 8GB).
Local
- Docker Desktop Running
Cloud
- Linux VM
- SSH Connection
- Installed Docker Engine - Install using the convenience script
1. Create a new directory
mkdir -p ~/app
cd ~/app
2. Running Airflow in Docker - Refer to the Airflow documentation listed under References
a. You can check if you have enough memory by running this command
docker run --rm "debian:bullseye-slim" bash -c 'numfmt --to iec $(echo $(($(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE))))'
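The same arithmetic (physical pages times page size) can be sketched in Python. This is only an illustration under a few assumptions: it uses `os.sysconf`, so it expects a Linux environment such as the cloud VM or a container, and the helper names `to_iec` and `physical_memory_bytes` are made up for this example. The formatting only approximates `numfmt --to iec`, whose rounding may differ slightly.

```python
import os

MIN_BYTES = 4 * 1024**3  # the 4GB minimum from the prerequisites


def to_iec(num_bytes: int) -> str:
    """Format a byte count with binary suffixes, e.g. 8589934592 -> '8.0G'."""
    suffixes = ["", "K", "M", "G", "T"]
    value = float(num_bytes)
    index = 0
    while value >= 1024 and index < len(suffixes) - 1:
        value /= 1024
        index += 1
    return f"{value:.1f}{suffixes[index]}"


def physical_memory_bytes() -> int:
    """Total physical memory: page count multiplied by page size (Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    total = physical_memory_bytes()
    verdict = "enough" if total >= MIN_BYTES else "below the 4GB minimum"
    print(f"{to_iec(total)} available: {verdict}")
```

Run directly on a Docker Desktop host, this reports the host's memory rather than the amount allocated to Docker, which is why the command above measures from inside a container.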
b. Fetch docker-compose.yaml
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.5.1/docker-compose.yaml'
c. Setting the right Airflow user
mkdir -p ./dags ./logs ./plugins ./working_data
echo -e "AIRFLOW_UID=$(id -u)" > .env
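For readers who prefer to script this step, the following is a rough Python equivalent of the two commands above. The function name `prepare_project` is hypothetical. Writing the host user ID to `.env` is what lets files created in the mounted folders stay owned by your user rather than root.

```python
import os
from pathlib import Path

# Folders created in this step; working_data is this guide's extra output dir.
PROJECT_DIRS = ("dags", "logs", "plugins", "working_data")


def prepare_project(root: Path) -> Path:
    """Create the project folders and write AIRFLOW_UID=<uid> to .env."""
    for name in PROJECT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    env_file = root / ".env"
    env_file.write_text(f"AIRFLOW_UID={os.getuid()}\n")
    return env_file


# Usage (run from the project directory, e.g. ~/app):
# prepare_project(Path.cwd())
```

Like the shell version, this overwrites any existing `.env`, so add other variables after running it.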
d. Update the following in docker-compose.yaml
# Do not load the example DAGs (x-airflow-common > environment)
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
# Additional Python packages (x-airflow-common > environment)
_PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:- pandas }
# Output dir (x-airflow-common > volumes)
- ${AIRFLOW_PROJ_DIR:-.}/working_data:/opt/airflow/working_data
# Change the default admin credentials (airflow-init > environment)
_AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow2}
_AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow2}
e. Initialize the database
docker compose up airflow-init
f. Running Airflow
docker compose up
Wait until the terminal shows the webserver answering health checks with status 200, for example:
app-airflow-webserver-1 | 127.0.0.1 - - [17/Feb/2023:09:34:29 +0000] "GET /health HTTP/1.1" 200 141 "-" "curl/7.74.0"
g. Enable port forwarding for port 8080 (Cloud only, so the web UI on the VM is reachable from your machine)
h. Visit
http://localhost:8080
Log in with the credentials set in step 2.d
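Instead of watching the terminal for the `GET /health` line, you could poll the same endpoint yourself. The sketch below assumes the webserver is reachable at `http://localhost:8080` and that `/health` returns JSON with a `status` field under `metadatabase` and `scheduler`, as it does in Airflow 2.x to the best of my knowledge. The function names are hypothetical.

```python
import json
import time
import urllib.request

HEALTH_URL = "http://localhost:8080/health"  # assumed from the port used above
REQUIRED_COMPONENTS = ("metadatabase", "scheduler")


def is_healthy(payload: dict) -> bool:
    """True when every required component reports status 'healthy'."""
    return all(
        payload.get(name, {}).get("status") == "healthy"
        for name in REQUIRED_COMPONENTS
    )


def wait_until_healthy(url: str = HEALTH_URL, timeout: float = 300,
                       interval: float = 5) -> bool:
    """Poll the health endpoint until it is healthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if is_healthy(json.load(response)):
                    return True
        except (OSError, ValueError):
            pass  # not accepting connections yet, or the reply was not JSON
        time.sleep(interval)
    return False


# Usage: wait_until_healthy() returns True once the UI is ready to open.
```

Checking only the two named components keeps the sketch tolerant of extra keys that other Airflow versions may add to the response.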
3. Explore the UI and add a user
Security > List Users
4. Create a Python script
dags/sandbox.py
- BashOperator
- PythonOperator
- Task Dependencies
- Params
- Crontab schedules
You can have any number of scripts inside the dags directory.
5. Stop the Docker containers
docker compose down