
5_8_22: Project Meeting

Allen Lau edited this page May 8, 2023 · 3 revisions

Project Status:

  • Modeling
  • All models currently overfit when trained on the combined deep learning extracted features and LDA features; the cause still needs troubleshooting. With deep learning features only, the models do not overfit: training and testing accuracies are both around 50%.
  • XGBoost: train accuracy 0.65, test accuracy 0.48.
  • Stacking: model not trained yet; waiting on the other models.
  • Evaluation
  • Functions written for classification report, confusion matrix, kappa, and MCC. ROC curve still in progress.
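
For reference, a minimal sketch of how the kappa and MCC scores listed above can be computed for a multi-class problem. This is not the project's evaluation code: the function name `kappa_and_mcc` and the toy labels are hypothetical, and only the standard library is used. scikit-learn's `sklearn.metrics.cohen_kappa_score` and `sklearn.metrics.matthews_corrcoef` compute the same quantities, although the notes do not say which library the team's functions rely on.

```python
from collections import Counter
from math import sqrt


def kappa_and_mcc(y_true, y_pred):
    """Return (Cohen's kappa, multi-class MCC) for two equal-length label lists."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)   # how often each class truly occurs
    pred_counts = Counter(y_pred)   # how often each class is predicted
    labels = set(true_counts) | set(pred_counts)
    # Sum over classes of (true count * predicted count); Counter gives 0 for missing keys.
    cross = sum(true_counts[k] * pred_counts[k] for k in labels)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_o = correct / n
    p_e = cross / n ** 2
    # Convention chosen here (an assumption): return 0.0 in the degenerate case p_e == 1.
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # Multi-class MCC computed from the same counts.
    denom = sqrt(
        (n ** 2 - sum(c * c for c in pred_counts.values()))
        * (n ** 2 - sum(c * c for c in true_counts.values()))
    )
    mcc = (correct * n - cross) / denom if denom else 0.0
    return kappa, mcc


# Toy labels, for illustration only (not project data).
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
kappa, mcc = kappa_and_mcc(y_true, y_pred)
print(f"kappa={kappa:.3f}, mcc={mcc:.3f}")
```

Both scores are 0 for chance-level predictions and 1 for perfect predictions, which makes them easier to read than raw accuracy when there are many classes.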

Next Steps:

  • All: Run trainingDataGenerator.py to generate image data for retraining the models. Define the label name before capturing images; capture 100 images per letter.
  • Shubham: Retrain NN on generated dataset
  • Sumaiya, Allen: Troubleshoot overfitting results with combined LDA and Deep Learning Features
  • Sumaiya: Troubleshoot the straight-line segments in the ROC curve
  • Shubham & Allen: Troubleshoot demo accuracy issues
  • Allen & Shubham: For the demo, display the predicted letter and its confidence probability
  • All: Run additional evaluation functions on models
  • Allen: Add combined LDA and deep learning features into make_dataset.py script
  • All: Final Notebook
  • All: Final Report
  • All: Final Presentation
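
On the ROC curve item above: one common cause of a ROC curve made of a few straight segments is passing hard class predictions (0/1) to the curve function instead of continuous scores or probabilities. Whether that is the cause here is an assumption, not something confirmed in the meeting. The sketch below uses a hypothetical helper `roc_points` and toy binary data (for the multi-class case this would be one letter versus the rest) to show the difference.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points by sweeping a threshold over each distinct score."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


# Toy one-vs-rest data, for illustration only (not project data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.55, 0.6, 0.3, 0.2, 0.4, 0.1]   # continuous model scores
hard = [1 if p >= 0.5 else 0 for p in probs]        # thresholded predictions

print(len(roc_points(y_true, probs)))  # one point per distinct score, plus the origin
print(len(roc_points(y_true, hard)))   # only three points: two straight segments
```

If the curve function is currently fed predicted labels, switching to predicted probabilities for each class would be the first thing to check.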

Rubric Callouts:

  • random classifier performance = 0.042
  • Baselines = logistic regression, naive Bayes
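
A quick check of the random classifier figure above: 0.042 matches 1/24 rounded to three decimals, which is the expected accuracy of a uniform random guess over 24 equally frequent classes. The class count of 24 is inferred from that figure and is not stated in these notes; the function name is hypothetical.

```python
def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


NUM_CLASSES = 24  # assumption: inferred from the 0.042 figure, not stated in the notes
baseline = chance_accuracy(NUM_CLASSES)
print(round(baseline, 3))
```

Reported accuracies can be compared against this baseline as well as against the logistic regression and naive Bayes baselines.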

Project Report

  • look at rubric

Presentation Outline (15 min limit)

  • overview

  • data wrangling

  • data cleaning

  • other steps of the DS pipeline

  • conclusion

  • keep comments restricted to what we did, not what we would do

Questions:

  • Clarify whether line counts will be factored into the grade