A data transformation job on data sourced from a MongoDB database.
The input data was deeply nested JSON from the MongoDB source system.
During the transformation stage of the ETL process, the data was normalised into a structured relational format
for subsequent feature engineering and analysis, with the aim of building a prediction model.
The resulting dataset had over 56,000 features (i.e. columns).
- Python
- Jupyter Notebook
- Pandas
- flatten_json
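
The flattening step described above can be illustrated with a minimal sketch. It is not the project's actual code: the `flatten` helper is a hand-written stand-in for the kind of flattening a library such as flatten_json performs, and the sample documents and their field names (`customer`, `orders`, `sku`, `qty`) are invented for illustration. It also hints at one plausible reason the column count can grow so large: each array position becomes its own column.

```python
from typing import Any, Dict, List


def flatten(record: Any, parent_key: str = "", sep: str = "_") -> Dict[str, Any]:
    """Collapse nested dicts and lists into a single-level dict.

    Hypothetical helper written for this sketch; nested keys are joined
    with `sep`, and list items are keyed by their position.
    """
    flat: Dict[str, Any] = {}
    if isinstance(record, dict):
        for key, value in record.items():
            child_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            flat.update(flatten(value, child_key, sep))
    elif isinstance(record, list):
        for index, value in enumerate(record):
            child_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            flat.update(flatten(value, child_key, sep))
    else:
        # Leaf value: store it under the accumulated key path.
        flat[parent_key] = record
    return flat


# Invented sample documents standing in for records read from MongoDB.
documents: List[Dict[str, Any]] = [
    {
        "_id": 1,
        "customer": {"name": "A", "address": {"city": "Leeds"}},
        "orders": [{"sku": "x1", "qty": 2}],
    },
    {
        "_id": 2,
        "customer": {"name": "B"},
        "orders": [{"sku": "y7", "qty": 1}, {"sku": "z3", "qty": 5}],
    },
]

# One flat dict per document.
rows = [flatten(doc) for doc in documents]

# The relational table needs the union of all keys seen in any row.
columns = sorted({key for row in rows for key in row})

# Missing values become None where a document lacks a given column.
table = [[row.get(col) for col in columns] for row in rows]
```

Passing `rows` to `pandas.DataFrame(rows)` would give the same table as a DataFrame, with missing values filled as NaN. In this toy input the second document's extra order adds two columns (`orders_1_sku`, `orders_1_qty`), so wide, variable-length arrays can inflate the feature count quickly.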