Welcome to my Data Science and Machine Learning portfolio! This project is a result of my participation in the UmojaHack Africa 2023: Carbon Dioxide Prediction Challenge (BEGINNER) on Zindi.
The challenge asked participants to use machine learning and deep learning to predict carbon emissions in Africa from open-source CO2 emissions data derived from Sentinel-5P satellite observations. The goal was to help governments and researchers monitor carbon emissions across the continent, even in areas with limited on-the-ground monitoring capabilities.
I am proud to share my journey and achievements in this competition, where I ranked in the top 50%. Here's a glimpse of what I accomplished:
The challenge focused on predicting carbon emissions in Africa, addressing a critical aspect of climate change mitigation. Accurate monitoring of carbon emissions is crucial for understanding their sources and patterns.
- Prizes: I competed for a chance to win monetary prizes, with the top three participants receiving cash rewards. Additionally, there were country prizes for the highest-ranking participants from specific countries.
- Evaluation: The competition's performance metric was Root Mean Squared Error (RMSE), used to assess the accuracy of predictions.
- Datasets: I used publicly available, open-source CO2 emissions data obtained from Sentinel-5P satellite observations. The dataset included various features related to pollutants such as Sulphur Dioxide, Carbon Monoxide, Nitrogen Dioxide, and more.
- Challenges: I faced challenges in feature engineering, model selection, and data preprocessing to create a robust predictive model.
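For readers unfamiliar with the evaluation metric, here is a minimal sketch of RMSE in plain Python. The function name `rmse` and the sample values are illustrative assumptions, not code or data from the competition.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only, not competition data.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise the score more than many small ones, and lower values are better.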
Here are the key files related to this project:
- Train.csv - This dataset was used for training and contained target information.
- Test.csv - The test set used to evaluate the model's predictions.
- SampleSubmission.csv - An example of the submission format.
- Starter Notebook - A helpful notebook to kickstart the project.
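As a rough illustration of how predictions can be written out in a two-column submission layout, here is a small sketch using Python's standard `csv` module. The column names `ID` and `emission`, the row identifiers, and the helper `write_submission` are hypothetical placeholders; the real header should be copied from SampleSubmission.csv.

```python
import csv
import io


def write_submission(ids, predictions, out_file, header=("ID", "emission")):
    """Write one (id, prediction) row per test sample to an open text file.

    The default header is a hypothetical placeholder; use the header
    found in SampleSubmission.csv instead.
    """
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    writer = csv.writer(out_file)
    writer.writerow(header)
    for row_id, prediction in zip(ids, predictions):
        writer.writerow([row_id, prediction])


# Demo with an in-memory buffer and made-up values. For a real file, use
# open("submission.csv", "w", newline="") as recommended by the csv docs.
buffer = io.StringIO()
write_submission(["row_1", "row_2"], [0.5, 1.25], buffer)
print(buffer.getvalue())
```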
- Data Exploration: I began by thoroughly exploring the provided datasets to gain insights into the data's structure and distribution.
- Feature Engineering: I engineered new features and transformed existing ones to extract meaningful information for building predictive models.
- Model Selection: I experimented with various machine learning and deep learning algorithms to identify the best-performing model.
- Hyperparameter Tuning: To improve model accuracy, I fine-tuned hyperparameters and optimized the model.
- Validation: I used cross-validation techniques to assess model performance and avoid overfitting.
- Submission: After obtaining satisfactory results, I created submission files following the required format.
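The validation step above can be sketched with a small k-fold cross-validation loop. This is a minimal illustration in plain Python, not the notebook's code: a mean-value predictor stands in for the real model, the targets are synthetic, and the helper names (`k_fold_indices`, `cross_validate_mean_baseline`) are assumptions made for this example.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root Mean Squared Error, the competition metric."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def cross_validate_mean_baseline(targets, k=5, seed=0):
    """Return one validation RMSE per fold for a mean-value predictor."""
    scores = []
    for held_out in k_fold_indices(len(targets), k, seed):
        held_out_set = set(held_out)
        train = [t for i, t in enumerate(targets) if i not in held_out_set]
        valid = [targets[i] for i in held_out]
        # "Fit" on the training folds only: the baseline predicts their mean.
        prediction = sum(train) / len(train)
        scores.append(rmse(valid, [prediction] * len(valid)))
    return scores


# Synthetic targets for demonstration only, not competition data.
targets = [float(v) for v in range(20)]
fold_scores = cross_validate_mean_baseline(targets, k=5)
print(sum(fold_scores) / len(fold_scores))
```

Each fold is scored by a model that never saw it during fitting, so the average of the fold scores gives a less optimistic estimate than the training error. A simple baseline like this one is also a useful reference point when comparing more complex models.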
For more details on my approach and analysis, you can check the accompanying notebook EY_Carbon_Prediction.ipynb.
I finished in the top 50% of participants in the UmojaHack Africa 2023 challenge, with my model's carbon emission predictions scored by RMSE.
As I continue to develop my data science and machine learning skills, I plan to enhance this project further. Some future steps may include:
- Exploring advanced machine learning techniques.
- Incorporating additional external data sources for improved predictions.
- Enhancing model interpretability for stakeholders.
I'm always eager to collaborate and learn from others in the data science community. Feel free to connect with me on LinkedIn or GitHub.
I'd like to express my gratitude to the organizers of the UmojaHack Africa 2023 competition for providing this valuable learning opportunity.
Thank you for visiting my portfolio, and I look forward to sharing more data science projects with you in the future! 🚀✨