This repository contains the Jupyter notebooks written during the Data Management module, in which I created an ETL workflow, designed a data warehouse using a star schema, and then answered a data mining question.

vinayak-parab/Data_Management_ETL

Data_Management_ETL

Project Requirement

The objective of this project is to develop an ETL workflow and to answer the data mining and OLAP questions derived from the data.

Tools / Technologies Used

MySQL, Python, Pandas

Project Brief

Worked with the Talend Data Integration tool to perform the Extract, Transform, and Load steps for the given dataset. Collected the raw data, transformed and pre-processed it, and designed a star schema from it so the data could be stored in the corresponding fact and dimension tables of the data warehouse. Designed a set of business questions that the data could answer and visualized the answers using Microsoft Power BI.
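The extract, transform, and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the notebooks: the sample rows, the table names (`dim_date`, `dim_product`, `dim_city`, `fact_sales`), and the helper functions are all hypothetical, and the standard-library `sqlite3` module stands in for MySQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical raw rows standing in for the project's dataset (not taken from the repository).
RAW_ROWS = [
    {"order_date": "2021-01-05", "product": " Laptop ", "city": "Pune", "amount": "1200.50"},
    {"order_date": "2021-01-05", "product": "Mouse", "city": "Mumbai", "amount": "25.00"},
    {"order_date": "2021-02-10", "product": "laptop", "city": "Pune", "amount": "1100.00"},
]


def transform(row):
    """Clean one raw row: trim and normalise text, cast the measure to float."""
    return {
        "order_date": row["order_date"],
        "product": row["product"].strip().title(),
        "city": row["city"].strip().title(),
        "amount": float(row["amount"]),
    }


def get_or_create_key(conn, table, column, value):
    """Return the surrogate key for a dimension value, inserting the value if it is new."""
    # Table and column names are constants defined in this script, never user input.
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    cursor = conn.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,))
    return cursor.fetchone()[0]


def load_star_schema(conn, rows):
    """Create three dimension tables and one fact table, then load the cleaned rows."""
    conn.executescript("""
        CREATE TABLE dim_date (id INTEGER PRIMARY KEY, order_date TEXT UNIQUE);
        CREATE TABLE dim_product (id INTEGER PRIMARY KEY, product TEXT UNIQUE);
        CREATE TABLE dim_city (id INTEGER PRIMARY KEY, city TEXT UNIQUE);
        CREATE TABLE fact_sales (
            date_id INTEGER REFERENCES dim_date(id),
            product_id INTEGER REFERENCES dim_product(id),
            city_id INTEGER REFERENCES dim_city(id),
            amount REAL
        );
    """)
    for raw in rows:
        row = transform(raw)
        keys = (
            get_or_create_key(conn, "dim_date", "order_date", row["order_date"]),
            get_or_create_key(conn, "dim_product", "product", row["product"]),
            get_or_create_key(conn, "dim_city", "city", row["city"]),
        )
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", keys + (row["amount"],))
    conn.commit()


def sales_by_product(conn):
    """OLAP-style roll-up: total amount per product, joining the fact table to a dimension."""
    return conn.execute("""
        SELECT p.product, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.id = f.product_id
        GROUP BY p.product
        ORDER BY p.product
    """).fetchall()


conn = sqlite3.connect(":memory:")
load_star_schema(conn, RAW_ROWS)
result = sales_by_product(conn)
print(result)
```

The fact table holds only foreign keys and the numeric measure, while descriptive attributes live in the dimension tables. A business question such as "total sales per product" then reduces to a join and a `GROUP BY`.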

How to Run

Run the DM_ETL.ipynb Jupyter notebook cell by cell to create the ETL workflow and load the data into the data warehouse. Then run the Data_mining.ipynb Jupyter notebook cell by cell to get the answer to the derived data mining question.
