This repository contains a data analysis of the 2022 PPDB (Penerimaan Peserta Didik Baru, the new student admissions process) in Bandung. The data is used to analyze school quotas, student scores, and zoning in Bandung, as well as the socio-economic index of each district.
The analysis uses the following Python libraries:
- numpy
- pandas
- matplotlib
- folium
The data used in this analysis is sourced from several CSV files:
- dataset/data_kuota.csv: contains information on the quotas for each school
- dataset/data_rapor.csv: contains information on the scores of each student
- dataset/data_zonasi.csv: contains information on the zoning of each student
- dataset/data_koordinat.csv: contains information on the coordinates of each school
- dataset/ik-berdasarkan-aspek-dan-kecamatan-2018.csv: contains information on the socio-economic index of each district
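The files listed above can be loaded in one pass. The following is a minimal sketch, not the repository's actual loading code: the helper name `load_datasets` and the dictionary keys are hypothetical, and only the file names come from the list above.

```python
from pathlib import Path

import pandas as pd

# File names are taken from the list above; the short keys are illustrative.
DATASET_FILES = {
    "kuota": "data_kuota.csv",
    "rapor": "data_rapor.csv",
    "zonasi": "data_zonasi.csv",
    "koordinat": "data_koordinat.csv",
    "ik": "ik-berdasarkan-aspek-dan-kecamatan-2018.csv",
}


def load_datasets(base_dir="dataset"):
    """Read every listed CSV found under base_dir into a DataFrame."""
    frames = {}
    for key, filename in DATASET_FILES.items():
        path = Path(base_dir) / filename
        if path.exists():  # skip absent files instead of raising
            frames[key] = pd.read_csv(path)
    return frames
```

Calling `load_datasets()` from the repository root would return a dictionary such as `frames["kuota"]`, assuming the files sit in the `dataset/` folder and use pandas' default comma delimiter.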
The data is cleaned and preprocessed before analysis. This includes:
- Replacing missing values with np.nan
- Changing the data types of certain columns
- Renaming columns for consistency
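The three steps above can be sketched as a single helper. This is an illustration under stated assumptions rather than the repository's own cleaning code: the function name `clean`, the missing-value markers, and the column names in the usage note are all hypothetical.

```python
import numpy as np
import pandas as pd


def clean(df, missing_markers=("", "-", "N/A"), numeric_cols=(), renames=None):
    """Return a cleaned copy of df; the input frame is left unchanged."""
    # 1. Replace placeholder strings (assumed markers) with np.nan.
    out = df.replace(list(missing_markers), np.nan)
    # 2. Change data types: convert the chosen columns to numeric.
    for col in numeric_cols:
        out[col] = pd.to_numeric(out[col])
    # 3. Rename columns for consistency.
    if renames:
        out = out.rename(columns=renames)
    return out
```

For example, `clean(raw, numeric_cols=["Kuota"], renames={"Kuota": "quota"})` would turn a text column containing `"-"` placeholders into a float column named `quota`, with the placeholders stored as missing values.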
The cleaned data is then used to perform various analyses, such as:
- Analyzing the distribution of quotas among schools
- Examining the relationship between scores and school quotas
- Mapping the locations of schools and the socio-economic index of each district
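As one illustration of the second item, the relationship between scores and quotas could be summarized with a per-school correlation. This is a rough sketch only: the function name and the column names `school`, `score`, and `quota` are assumptions, not the actual columns in the CSV files, and Pearson correlation is just one possible measure.

```python
import pandas as pd


def score_quota_correlation(scores, quotas, school_col="school",
                            score_col="score", quota_col="quota"):
    """Pearson correlation between each school's mean score and its quota."""
    # Average the student-level scores per school.
    mean_scores = scores.groupby(school_col, as_index=False)[score_col].mean()
    # Keep only schools present in both tables.
    merged = mean_scores.merge(quotas, on=school_col, how="inner")
    return merged[score_col].corr(merged[quota_col])
```

A value near 1 or -1 would suggest a strong linear relationship between quota size and average score, while a value near 0 would suggest little linear relationship.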
The code here covers only data preparation and cleaning; no analysis is performed at this stage. The analysis happens after this step and uses the cleaned data to answer specific questions or test hypotheses.
To see the analysis, please visit: https://drive.google.com/file/d/18Xhse8jhQA-Pc5om4MDbadIiHT7heFy_/view