5_8_22: Project Meeting

Project Status:

Modeling

All models currently overfit when trained on the combined deep-learning-extracted features and LDA features; we still need to troubleshoot why. With deep learning features alone, the models do not overfit: training and testing accuracies are both around 50%.

XGBoost - train accuracy 0.65, test accuracy 0.48.

Stacking - model not trained yet; waiting on the other models.

Evaluation

Functions are written for the classification report, confusion matrix, kappa, and MCC. Still working on the ROC curve.
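For reference, a minimal sketch of how kappa and MCC can be derived from a confusion matrix. It assumes "kappa" means Cohen's kappa and uses the multiclass generalization of MCC; the function names are illustrative and are not the project's actual evaluation functions.

```python
from math import sqrt


def confusion_matrix(y_true, y_pred, labels):
    """Build a count matrix: rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def cohen_kappa(matrix):
    """Agreement between truth and prediction, corrected for chance agreement."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    true_totals = [sum(row) for row in matrix]
    pred_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    expected = sum(t * p for t, p in zip(true_totals, pred_totals)) / (n * n)
    if expected == 1:
        return 0.0  # degenerate single-class case; returning 0.0 is a choice made here
    return (observed - expected) / (1 - expected)


def multiclass_mcc(matrix):
    """Matthews correlation coefficient, multiclass form."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    true_totals = [sum(row) for row in matrix]
    pred_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    numerator = correct * n - sum(t * p for t, p in zip(true_totals, pred_totals))
    denominator = sqrt(
        (n * n - sum(p * p for p in pred_totals))
        * (n * n - sum(t * t for t in true_totals))
    )
    return numerator / denominator if denominator else 0.0


# Toy example with two letters (made-up labels, not project data)
example = confusion_matrix(["A", "A", "B", "B"], ["A", "B", "B", "B"], ["A", "B"])
print(example, cohen_kappa(example), multiclass_mcc(example))
```

Both metrics are 1 for perfect predictions and close to 0 for chance-level predictions, which makes them easier to compare across models than raw accuracy.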

Next Steps:

All: Run trainingDataGenerator.py to generate image data for retraining the models. Define the label name before capturing images. Capture 100 images per letter.
Shubham: Retrain NN on generated dataset
Sumaiya, Allen: Troubleshoot the overfitting seen with the combined LDA and deep learning features
Sumaiya: Troubleshoot the straight-line segments in the ROC curve
Shubham & Allen: Troubleshoot demo accuracy issues
Allen & Shubham: For the demo, display the predicted letter and its confidence probability
All: Run additional evaluation functions on models
Allen: Add the combined LDA and deep learning features to the make_dataset.py script
All: Final Notebook
All: Final Report
All: Final Presentation
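
Note on the ROC item above: one common cause of a ROC curve made of only a few straight segments is computing it from hard class predictions instead of continuous scores or probabilities. Whether that is the cause here has not been confirmed. A minimal sketch of the effect, using made-up binary labels and scores (for the multiclass case this would be applied one-vs-rest per letter):

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points for binary 0/1 labels, one per distinct threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up data for illustration only
y_true = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
hard = [1 if p >= 0.5 else 0 for p in probs]  # thresholded predictions

# Probabilities give one point per distinct score; hard labels give only
# the two end points plus a single interior point, joined by straight lines.
print(len(roc_points(y_true, probs)), len(roc_points(y_true, hard)))
```

If the current ROC function receives predicted labels, switching it to predicted probabilities should give a curve with more points.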

Rubric Callouts:

Random classifier performance = 0.042
Baselines = logistic regression, naive Bayes
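
The 0.042 figure is consistent with uniform random guessing over 24 equally likely classes (1/24). The class count of 24 is an assumption inferred from that number, not something stated in these notes. A quick check, which also puts the XGBoost test accuracy from the status section in context:

```python
def chance_accuracy(n_classes):
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / n_classes


N_LETTERS = 24  # assumption: number of letter classes implied by 0.042

chance = chance_accuracy(N_LETTERS)
xgboost_test_accuracy = 0.48  # from the Modeling status above

print(round(chance, 3))                          # 0.042
print(round(xgboost_test_accuracy / chance, 1))  # lift over chance
```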

Project Report

Check the report against the rubric

Presentation Outline (15 min limit)

overview
data wrangling
data cleaning
other steps of the DS pipeline
conclusion
keep comments restricted to what we did, not what we would do

Questions:

Clarify whether line counts will be factored into the grade
