JacobwLyman/bitcoin_monitoring

The setup of a data pipeline using Airflow, Docker, Postgres, and AWS

Setup

Configuration

Use the template_airflow.cfg file to create the required airflow.cfg file in the /docker/airflow/ subdirectory, then add your preferred email credentials to this new file under the [smtp] section.
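For reference, the [smtp] section of the new airflow.cfg will look roughly like this sketch; the host, port, and addresses are placeholders assuming a Gmail SMTP setup, so substitute your own provider's values.

```ini
[smtp]
# Placeholder values -- a Gmail SMTP setup is assumed; adjust for your provider.
smtp_host = smtp.gmail.com
smtp_starttls = True
smtp_ssl = False
smtp_user = your_email@gmail.com
smtp_password = your_app_password
smtp_port = 587
smtp_mail_from = your_email@gmail.com
```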

Similarly, the necessary AWS credentials are stored in a /dag/project_config.py file, which you can recreate from the template_project_config.py file in this repository.
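As a rough sketch, project_config.py might define values like the following; the variable names here are illustrative assumptions, so match whatever fields template_project_config.py actually uses.

```python
# project_config.py -- hypothetical sketch; take the real field names from
# template_project_config.py. Never commit this file with live credentials.
AWS_ACCESS_KEY_ID = "YOUR_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY = "YOUR_SECRET_ACCESS_KEY"
S3_BUCKET_NAME = "your-bucket-name"
EMAIL_ADDRESS = "you@example.com"
```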

Project

Prompt:

Coindesk has a public-facing REST API that provides bitcoin price data (https://api.coindesk.com/v1/bpi/currentprice.json). Design a system to pull data from the API every 5 minutes and land each data pull as a file in an AWS S3 bucket. Prepare a short presentation or diagram to share with us during your interview so that you can walk us through your design.

The system is built with Airflow, Docker, Python, and an AWS S3 bucket. When scheduled, the Airflow DAG file_to_S3 (1) hits the Coindesk API every five minutes, (2) writes the results to an AWS S3 bucket, and (3) sends a confirmation to my personal email.
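The sketch below illustrates this design; it is not the repository's actual file_to_S3 code. The bucket name, recipient address, and task ids are placeholder assumptions, and the confirmation email relies on the [smtp] settings configured earlier.

```python
# Hypothetical sketch of the pipeline -- not the repository's actual DAG.
import json
from datetime import datetime, timedelta

import boto3
import requests
from airflow import DAG
from airflow.operators.email import EmailOperator
from airflow.operators.python import PythonOperator

API_URL = "https://api.coindesk.com/v1/bpi/currentprice.json"
S3_BUCKET = "your-bucket-name"  # placeholder

def pull_and_land(**context):
    """Fetch the current Bitcoin price and land it as a JSON file in S3."""
    response = requests.get(API_URL, timeout=10)
    response.raise_for_status()
    key = f"bitcoin_price_{context['ts_nodash']}.json"  # timestamped file name
    boto3.client("s3").put_object(
        Bucket=S3_BUCKET,
        Key=key,
        Body=json.dumps(response.json()),
    )

with DAG(
    dag_id="file_to_S3",
    start_date=datetime(2021, 1, 1),
    schedule_interval=timedelta(minutes=5),  # run every five minutes
    catchup=False,
) as dag:
    land = PythonOperator(task_id="pull_and_land", python_callable=pull_and_land)
    confirm = EmailOperator(
        task_id="send_confirmation",
        to="you@example.com",  # placeholder; delivery uses the [smtp] settings above
        subject="Bitcoin price data landed in S3",
        html_content="The latest Coindesk pull was written to the S3 bucket.",
    )
    land >> confirm  # send the confirmation only after the file lands
```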

[Screenshots: Airflow GUI, S3 bucket contents, confirmation emails, and an example email message]
