Beginner data engineering project - batch edition
Updated Sep 4, 2024 - HTML
Data Engineering Pilipinas is a community for data engineers, data analysts, data scientists, developers, AI / ML engineers, and users of closed and open source data tools and methods / techniques in the Philippines. Data Engineering Pilipinas is a PyData group.
Nextract is an Extract Transform Load (ETL) platform built on top of Node.js streams
Ethereum Analytical Database - an Ethereum data access solution that can be used for analytics and application development. The solution runs on ClickHouse, a fast database.
dbt tutorial using a local PostgreSQL database
Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack.
Socialytics is a social network data acquisition (ETL), analytics and intelligence tool.
This repository is a place for the Data Warehousing course at the Information Systems & Analytics department, Santa Clara University.
In this project, dbt, Great Expectations, Python and Pandas were used to transform and validate the "Inside Airbnb" dataset. The tools ensure quality data, ready for analysis.
Tutorial using the Great Expectations library to validate and profile data on a local PostgreSQL database.
Node-RED nodes for streaming data, utilising the Node.js Stream APIs
NiFi, Data Engineering, Data Ingest, REST, ETL, Mapping, ELT, SQL, Spark, Kafka for Good
In this project, we built a database that shows how America's fastest-growing private companies have changed over time. The database is built by ingesting, combining, and restructuring data from three main data sources into a single conformed PostgreSQL database, which is then deployed in a Flask app.
An open-source archive of campaign finance and lobbying disclosure data from the California Secretary of State’s CAL-ACCESS database
Web scraping and data analysis
WBSC Europe Baseball and Softball Statistics Data warehouse and BI application
This project is an automated ETL pipeline (Extract, Transform, Load) designed to extract commodity data (specifically Gold and Silver) from a free public API.