Welcome to the Shopping Data Transformation project! This project uses DBT (Data Build Tool) to transform raw shopping data into a well-structured, analytics-ready format that can feed business intelligence, reporting, and data science workflows. It demonstrates how to build a data mart/data view using Pandas, PostgreSQL, DBT, and Docker.
The goal of this project is to:
- Extract and transform raw data into a structured format using Pandas.
- Load the transformed data into a PostgreSQL database.
- Model and optimize the data using DBT to create a data mart.
- Containerize the environment using Docker for easy deployment and reproducibility.
The main components are:
- ETL Pipeline: Extract, transform, and load data using Python and Pandas (see the sketch after this list).
- PostgreSQL Database: Central storage for the dataset.
- DBT Models: Clean and optimize data to create insightful views.
- Dockerized Setup: Simplified deployment with Docker and Docker Compose.
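
As a rough illustration of the ETL step, the sketch below reads the raw shopping CSV (the Kaggle dataset linked below) with Pandas and applies some generic cleaning. The file name, column handling, and transformations here are assumptions for illustration only; the actual logic lives in the `load_data` folder.

```python
import pandas as pd

# Assumed file name for the Kaggle export; adjust to the CSV actually used in this repo.
RAW_CSV = "shopping_trends.csv"

def extract(path: str) -> pd.DataFrame:
    """Read the raw shopping dataset into a DataFrame."""
    return pd.read_csv(path)

def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Light, generic cleaning: normalize column names and drop exact duplicates."""
    df = df.copy()
    # snake_case column names so they are friendly to PostgreSQL and DBT models
    df.columns = (
        df.columns.str.strip()
        .str.lower()
        .str.replace(r"[^\w]+", "_", regex=True)
    )
    return df.drop_duplicates()

if __name__ == "__main__":
    cleaned = transform(extract(RAW_CSV))
    print(cleaned.head())
```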
The dataset is sourced from Kaggle and contains customer shopping trends and purchase behaviors: https://www.kaggle.com/datasets/bhadramohit/customer-shopping-latest-trends-dataset
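
The cleaned DataFrame is then loaded into PostgreSQL for DBT to model. Below is a minimal sketch of that load step, assuming SQLAlchemy is available and using placeholder credentials and a hypothetical table name; the real username, password, and database name are the ones defined in `docker-compose.yaml`.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection details; the real values are defined in docker-compose.yaml.
ENGINE_URL = "postgresql+psycopg2://postgres:postgres@localhost:5432/shopping"

def load(df: pd.DataFrame, table_name: str = "raw_shopping") -> None:
    """Write the transformed DataFrame into a PostgreSQL table for DBT to build on."""
    engine = create_engine(ENGINE_URL)
    # if_exists="replace" keeps the load idempotent when the container is rebuilt
    df.to_sql(table_name, engine, if_exists="replace", index=False)
```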
- Clone the repository and move into it: `git clone https://github.com/lixx21/dbt-shopping-data-transform.git`, then `cd dbt-shopping-data-transform`
- Run the stack defined in `docker-compose.yaml` with the `docker-compose up --build -d` command. This will automatically start PostgreSQL, PgAdmin (at localhost:5050), and the ETL script in the `load_data` folder. The PgAdmin email and password, as well as the PostgreSQL username, password, and main database name, can be found in `docker-compose.yaml` (feel free to adjust them to your own credentials).
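
Once the containers are up, a quick sanity check is to query PostgreSQL and confirm that the ETL loaded rows. This is only a sketch: it reuses the placeholder connection string and the hypothetical `raw_shopping` table name from the load example above, so adjust both to match your `docker-compose.yaml` and the actual table created by the ETL script.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string; match it to the credentials in docker-compose.yaml.
engine = create_engine("postgresql+psycopg2://postgres:postgres@localhost:5432/shopping")

# Hypothetical table name used in the load sketch; replace with the real one.
row_count = pd.read_sql("SELECT COUNT(*) AS n FROM raw_shopping", engine)
print(row_count)
```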