Data pipelines from re-usable components
The open-source Useful SDK: a single Python decorator in the Useful library provides full observability of Python functions within an ETL.
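The Useful SDK's real API isn't reproduced here, but the general pattern is a decorator that records a step's runtime and outcome. A minimal sketch, assuming a hypothetical `observe` decorator built on plain `logging` (not the library's actual interface):

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl.observability")

def observe(func):
    """Log duration and success/failure of an ETL step.
    Illustrative only; not the Useful SDK's real decorator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            logger.info("%s succeeded in %.3fs", func.__name__, time.perf_counter() - start)
            return result
        except Exception:
            logger.exception("%s failed after %.3fs", func.__name__, time.perf_counter() - start)
            raise
    return wrapper

@observe
def transform(rows):
    return [r.upper() for r in rows]

transform(["a", "b"])
```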
A project structure for doing and sharing data engineering work.
Link to the application.
e-Portfolio showcasing my personal projects.
Build ETL pipelines on Airflow to load data from BigQuery and store it in MySQL.
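A minimal sketch of such a DAG, assuming Airflow 2.4+ with the MySQL provider and the `google-cloud-bigquery` client installed; the query, table names, and connection ID are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.mysql.hooks.mysql import MySqlHook
from google.cloud import bigquery

def bigquery_to_mysql():
    # Pull rows from BigQuery (project/dataset/table are placeholders)...
    client = bigquery.Client()
    rows = [tuple(row.values()) for row in client.query(
        "SELECT id, name, amount FROM `my_project.my_dataset.orders`"
    ).result()]
    # ...and insert them into MySQL via the Airflow connection "mysql_default".
    MySqlHook(mysql_conn_id="mysql_default").insert_rows(
        table="orders", rows=rows, target_fields=["id", "name", "amount"]
    )

with DAG(
    dag_id="bigquery_to_mysql",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="load_orders", python_callable=bigquery_to_mysql)
```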
DataSift automatically applies a data pre-processing pipeline to data science projects.
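One common way to auto-apply pre-processing is to choose transformers from column dtypes; a sketch using scikit-learn (not DataSift's actual implementation):

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def build_preprocessor(df: pd.DataFrame) -> ColumnTransformer:
    # Route numeric columns to impute+scale, everything else to impute+one-hot.
    numeric = df.select_dtypes(include="number").columns.tolist()
    categorical = df.select_dtypes(exclude="number").columns.tolist()
    return ColumnTransformer([
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric),
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
    ])

df = pd.DataFrame({"age": [25, np.nan, 40], "city": ["BA", "NYC", np.nan]})
features = build_preprocessor(df).fit_transform(df)
```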
An extension that registers all pharmacies in Argentina.
A deployed machine learning model that automatically classifies incoming disaster messages into 36 related categories. Developed as part of Udacity's Data Science Nanodegree program.
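The repo's own model isn't reproduced here, but a multi-label text classifier of this shape can be sketched with scikit-learn's `MultiOutputClassifier`; the toy messages and three stand-in categories are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Toy data standing in for the real disaster-message corpus.
messages = ["we need water and food", "roads are blocked", "medical help required"]
labels = [[1, 0, 1], [0, 1, 0], [1, 0, 0]]  # e.g. aid_related, infrastructure, medical

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(LogisticRegression(max_iter=1000))),
])
pipeline.fit(messages, labels)
print(pipeline.predict(["send water"]))  # one 0/1 prediction per category
```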
This repo contains the DAGs that run on my local Airflow environment. I use the local environment to test my DAGs before deploying them to virtual machines via Kubernetes.
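A typical smoke test for that local-first workflow loads every DAG file and fails on import errors; a sketch using Airflow's `DagBag` (the `dags/` folder path is an assumption):

```python
from airflow.models import DagBag

def test_dags_import_without_errors():
    """Fail if any DAG file in dags/ has an import error or defines no tasks."""
    dag_bag = DagBag(dag_folder="dags/", include_examples=False)
    assert not dag_bag.import_errors
    for dag_id, dag in dag_bag.dags.items():
        assert dag.tasks, f"{dag_id} has no tasks"
```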
A Python- and Spark-based ETL framework. While it operates within the speed limits of its framework and standards, it offers boundless possibilities.
JSON-driven ETL pipeline framework prototype
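A prototype of that idea can be small: a registry of step functions plus a JSON config naming the steps to run in order. The step names and config schema below are invented for this sketch:

```python
import csv
import json

# Step implementations keyed by name; each takes the current data plus config args.
def extract_csv(data, path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def filter_rows(data, field, equals):
    return [row for row in data if row[field] == equals]

def load_json(data, path):
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
    return data

STEPS = {"extract_csv": extract_csv, "filter_rows": filter_rows, "load_json": load_json}

def run_pipeline(config):
    """Run the configured steps in order, threading the dataset through each one."""
    data = None
    for step in config["steps"]:
        data = STEPS[step["name"]](data, **step.get("args", {}))
    return data

# Example config, e.g. loaded from pipeline.json; run_pipeline(config) executes it.
config = {"steps": [
    {"name": "extract_csv", "args": {"path": "orders.csv"}},
    {"name": "filter_rows", "args": {"field": "status", "equals": "shipped"}},
    {"name": "load_json", "args": {"path": "shipped_orders.json"}},
]}
```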
End-to-end MLOps project with ETL pipelines: building a network security system.
This project demonstrates a complete ETL pipeline for Formula 1 racing data using Azure Databricks, Delta Lake, and Azure Data Factory. It covers data ingestion, transformation with PySpark and Spark SQL, data governance with Unity Catalog, and visualization through Power BI. Designed to showcase real-world data engineering workflows in Azure.
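A representative transformation step in that stack, assuming PySpark on Databricks with Delta Lake available; the mount paths and column names are placeholders, not the project's real schema:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("f1-etl").getOrCreate()

# Ingest raw race results; the mount path mirrors a typical ADLS layout.
results = spark.read.json("/mnt/raw/results.json")

# Transform: total points per driver per season, written as Delta for Power BI.
standings = (results
             .groupBy("season", "driver_id")
             .agg(F.sum("points").alias("total_points"))
             .orderBy(F.desc("total_points")))
standings.write.format("delta").mode("overwrite").save("/mnt/presentation/driver_standings")
```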