I am a Data Scientist with 4 years of experience in Data Analytics, Machine Learning, and Deep Learning. I have helped companies make data-driven decisions that reduced costs, optimized workflows, and solved problems.
Some examples include:
Predictive Maintenance using Machine Learning - Water industry:
I implemented machine-learning-based predictive maintenance for the filtration pipes of a water filtration and delivery company, reducing maintenance costs by an estimated 20-30%. I achieved this by analyzing the data, recommending tailored machine learning models, optimizing the sensor data, adjusting the data collection frequency, and improving maintenance scheduling. The result is a more efficient and cost-effective system.
Deep Learning Curriculum for a Company Course:
I developed a 120-hour structured deep learning curriculum for a company course, guiding learners from foundational neural networks to advanced topics such as CNNs, RNNs, GANs, and transformers. Through 3 hands-on projects in image classification, sentiment analysis, and speech recognition, learners gained practical experience with real datasets. They also became proficient in TensorFlow, a prominent deep learning framework, preparing them to apply these skills in real-world applications.
These are some of my personal projects:
Speech-Based Emotion Recognition:
This project tackled speech-based emotion recognition, training three different models on four datasets: RAVDESS, TESS, SAVEE, and CREMA-D. The Mel spectrogram CNN showed promise but faced challenges, while the MFCC CNN performed best, achieving 74% accuracy and a macro-average F1-score of 0.76. The MFCC CRNN also performed well, but it lagged slightly behind the MFCC CNN and showed signs of overfitting. The evaluation metrics (precision, recall, and F1 score) indicate that the models classify emotional content from voice alone effectively, with the best model nearing state-of-the-art performance.
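To make the macro-average F1-score concrete, here is a minimal sketch of how it can be computed: the F1 score is calculated per emotion class and then averaged with equal weight per class. This is an illustration only, not the project's evaluation code; the function name `macro_f1` and the toy labels are assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for illustration only
y_true = ["angry", "angry", "happy", "sad"]
y_pred = ["angry", "happy", "happy", "sad"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class counts equally, the macro average is not dominated by the most frequent emotion, which is useful when the combined datasets are unevenly distributed across classes.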
Thoracic Disease Detection from Chest X-rays using CNNs:
In this project, I used Convolutional Neural Networks (CNNs) to detect 14 different thoracic diseases, addressing the challenge of imbalanced datasets in medical image analysis. Using the NIH Chest X-ray dataset of 112,120 labeled images, I implemented two CNN models to compare the binary cross-entropy and weighted binary cross-entropy loss functions. The former, despite its high accuracy, correctly identified only 10.31% of positive cases, risking delayed treatment. The latter, which focuses on reducing false negatives, detected 86.62% of positive cases, illustrating the trade-off between accuracy and sensitivity in medical machine learning. These findings emphasize the importance of minimizing false negatives in a medical context, where a missed diagnosis can have fatal consequences.
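The difference between the two loss functions can be sketched as follows. This is a simplified illustration rather than the project's training code: it assumes that only the positive term is up-weighted, and the function name `weighted_bce` and the `pos_weight` values are hypothetical.

```python
import math


def weighted_bce(y_true, y_prob, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight.

    With pos_weight=1.0 this reduces to the standard binary cross-entropy.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# A missed positive (true label 1, predicted probability 0.1)
plain = weighted_bce([1], [0.1])                     # standard loss
weighted = weighted_bce([1], [0.1], pos_weight=5.0)  # assumed weight of 5
print(plain, weighted)
```

With a weight above 1, each missed positive costs more than a false alarm, which pushes the model toward higher sensitivity on rare disease labels at the expense of some accuracy.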
Sales Data Analysis and Forecasting using Ensemble Methods:
In this project, I used ensemble methods to predict future sales from a dataset covering 1,115 drugstores. I trained a Random Forest Regressor and an XGBRegressor; both performed well, with XGBoost slightly ahead due to its iterative boosting approach. The Random Forest Regressor achieved an MAE of 1115, a SMAPE of 16.12%, and an R² score of 0.70. In comparison, the XGBoost Regressor achieved an MAE of 1031, a SMAPE of 15.12%, and an R² score of 0.73. The XGBoost model also highlighted the importance of the Promo and Promo2 features. Beyond showing the effectiveness of ensemble methods in sales forecasting, this project provides insights into customer behavior, seasonal patterns, and the impact of promotional strategies on sales.
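For reference, the three metrics reported above can be sketched in plain Python. This is an illustration, not the project's code: SMAPE has several definitions in use, and the one below (mean of 2|y - ŷ| / (|y| + |ŷ|), expressed as a percentage) is an assumption, as are the function names and the toy values.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error (one common definition), in percent."""
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        # Treat the 0/0 case (both actual and forecast are zero) as no error
        terms.append(0.0 if denom == 0 else 2 * abs(t - p) / denom)
    return 100 * sum(terms) / len(terms)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy daily sales figures, made up for illustration only
actual = [100, 200, 300]
forecast = [110, 190, 300]
print(mae(actual, forecast), smape(actual, forecast), r2(actual, forecast))
```

MAE is reported in sales units, SMAPE is scale-free and so comparable across stores of different sizes, and R² describes the share of variance explained, which is why the three are useful together.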
You can reach me through LinkedIn or by email: amir@datascrutineer.com.