This Repository contains machine learning classification projects

dinabandhu50/ABALONE_PROJECT

ABALONE PROJECT

This repository contains an abalone classification project. It is the first project in my self-learning curriculum.

Dataset

The dataset I used is available on Kaggle (https://www.kaggle.com/rodolfomendes/abalone-dataset).

Purpose of project

To learn machine learning, especially EDA for classification and classification model evaluation, I have made a self-learning curriculum of 10 end-to-end classification projects. Each project involves the following steps:

  • Do the EDA

    • Use pandas, numpy, statsmodels, sklearn, seaborn and matplotlib
    • Check for sampling bias such as class imbalance
    • Descriptive statistics
      • Measure of central tendency
      • Measure of dispersion
      • Measure of association
      • Check for skewness and kurtosis
      • Data imputation, if needed
      • Data transformation, if needed
      • Outlier detection and handling
  • Choose the best model

    • Train model systematically
    • Use ensemble models
    • Use cross-validation to get a lower-variance estimate of model performance
  • Do the model evaluation

    • Metrics: accuracy, precision, recall, F1-score and ROC-AUC
    • Use mlxtend to observe the bias-variance decomposition of the error
    • Use AIC and BIC to compare model fit against model complexity
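
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this repository: the data is synthetic (it is not the abalone dataset), the column names `feature_a`, `feature_b` and `target` and the helper functions `eda_summary` and `evaluate` are hypothetical, and a random forest is assumed as the ensemble model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic stand-in data (NOT the abalone dataset): two numeric features
# and a binary target, generated only so the sketch runs on its own.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "feature_a": rng.normal(0.0, 1.0, n),
    "feature_b": rng.exponential(1.0, n),  # right-skewed on purpose
})
noise = rng.normal(0.0, 0.5, n)
df["target"] = (df["feature_a"] + 0.5 * df["feature_b"] + noise > 0.5).astype(int)


def eda_summary(frame: pd.DataFrame, target: str) -> dict:
    """Collect the EDA checks from the list: balance, descriptive stats, outliers."""
    features = frame.drop(columns=[target])
    q1, q3 = features.quantile(0.25), features.quantile(0.75)
    iqr = q3 - q1
    # Count values more than 1.5 * IQR outside the quartiles (Tukey's rule).
    outliers = ((features < q1 - 1.5 * iqr) | (features > q3 + 1.5 * iqr)).sum()
    return {
        "class_balance": frame[target].value_counts(normalize=True),
        "describe": features.describe(),   # central tendency and dispersion
        "correlation": features.corr(),    # measure of association
        "skewness": features.skew(),
        "kurtosis": features.kurt(),       # excess kurtosis
        "missing": features.isna().sum(),  # tells whether imputation is needed
        "outlier_counts": outliers,
    }


def evaluate(frame: pd.DataFrame, target: str) -> dict:
    """Cross-validate an ensemble model and average each metric over the folds."""
    X, y = frame.drop(columns=[target]), frame[target]
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    return {name: float(scores[f"test_{name}"].mean()) for name in scoring}


summary = eda_summary(df, "target")
metrics = evaluate(df, "target")
print(summary["class_balance"])
print(metrics)
```

Stratified folds keep the class proportions similar in every split, which matters when the class-balance check shows an imbalanced target.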

Building model API

Build the model API using the Flask library. Learn a folder structure for the API backend that can be scaled later if required.
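
A minimal sketch of such an API is shown here, under stated assumptions: the `/predict` route, the `FEATURES` list and the `predict_one` placeholder are hypothetical and stand in for whatever the trained model actually expects. A real backend would load a persisted model instead of the threshold rule used here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical feature names; swap in the columns the trained model expects.
FEATURES = ["length", "diameter", "height"]


def predict_one(values):
    """Placeholder for a trained model's predict call (assumption)."""
    # A real API would load a persisted model and call its predict method here.
    return int(sum(values) > 1.0)


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a non-JSON body.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    try:
        values = [float(payload[name]) for name in FEATURES]
    except (TypeError, ValueError):
        return jsonify({"error": "all features must be numeric"}), 400
    return jsonify({"prediction": predict_one(values)})
```

Keeping the prediction logic in its own function, separate from the route handler, makes it easier to move the model code into its own module as the backend grows.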

Deploying API

Learn deployment to DigitalOcean and deployment to Heroku from GitHub.