GitHub - IMsumitkumar/No-code-ML-platform-DashB.ai: A no code machine learning pipelines and data visualization platform

DashB.ai

Video Demo

About The Project

Overview

This is a web app that automates the data preprocessing pipeline. The goal is to automate the whole machine learning pipeline, but the project is currently complete only up to the data preprocessing stage.
The project is currently in the development phase.
Users can upload comma-separated values (CSV) files or fetch data directly from a MySQL database (make sure MySQL is installed on your system).
Users have full control over which operations to perform; the selected operations are passed to the pipeline and the result is displayed.
Users can visualize the data with the data visualization tool that ships with DashB.ai, without writing any code. (Built with Dash by Plotly.)

Built With

Bootstrap
scikit-learn
Plotly

Getting Started

To get a local copy up and running, follow these simple steps. Make sure git is installed on your machine.

Installation

Clone the repo

git clone https://github.com/IMsumitkumar/No-code-ML-platform-DashB.ai

Create a virtual environment and activate it

conda create -n <env_name> python=3.7
conda activate <env_name>

Install dependencies

pip install -r requirements.txt      # run inside the project directory

RUN

STEP 1 : Migrate the database tables and create a superuser

python manage.py makemigrations
python manage.py migrate
python manage.py createsuperuser

    username : *****
    email    : *****
    password : ******

STEP 2

python manage.py runserver

STEP 3 : (OPTIONAL) For email-based password recovery, set your credentials in DashB -> settings.py

Set your email and password
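
As a rough sketch of what this step might look like, the fragment below uses the standard Django SMTP settings. The SMTP host and port, and the environment variable names DASHB_EMAIL_USER and DASHB_EMAIL_PASSWORD, are assumptions for illustration only; the project's actual settings.py may organize these values differently.

```python
# Hypothetical fragment for DashB/settings.py (host, port and variable
# names are assumptions, not taken from the project).
import os

EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
EMAIL_HOST = "smtp.gmail.com"  # assumed provider; replace with your own
EMAIL_PORT = 587
EMAIL_USE_TLS = True

# Read credentials from the environment so they are not committed to git.
EMAIL_HOST_USER = os.environ.get("DASHB_EMAIL_USER", "")
EMAIL_HOST_PASSWORD = os.environ.get("DASHB_EMAIL_PASSWORD", "")
```

Reading the credentials from environment variables is one option; writing them directly into settings.py also works but risks committing them to the repository.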

Preprocessing Pipeline Tree

├── Handle Datatypes
│   ├── Drop unnecessary features.
│   ├── replace inf with NaN.
│   ├── Make sure all the column names are of string type and clean them.
│   ├── Remove the column if target column has NaN.
│   ├── Remove Duplicate columns
│   ├── Handle numerical, categorical and time features.
│   └── Try to determine the ML use case and encode.
├── Handle Missing Values
│   ├────── Numerical Features
│   ├── Replace with mean.
│   ├── Replace with median.
│   ├── Replace with mode.
│   ├── Replace with standard deviation.
│   ├── Replace with zero.
│   ├────── Categorical Features
│   ├── Replace with mean.
│   ├── Replace with "Missing".
│   └── Replace with most frequent value.
├── Removing zero and near zero variance columns
│   ├── Eliminate the features that have zero variance.
│   └── Eliminate the features that have near zero variance.
├── Group Similar Features
│   └── Group more than two features and make new features from them.
├── Normalization and Transformation
│   ├────── Operations to apply only on numerical features
│   ├── ZScore
│   ├── MinMax
│   ├── Quantile
│   ├── MaxAbs
│   ├── Yeo-Johnson
│   ├────── Target transformation (regression)
│   ├── Box-Cox
│   └── Yeo-Johnson
├── Making Time Features
│   ├── Take a time feature and extract more features from it
│   └── (Day, Month, Year, Hour, Minute, Second, Quantile, Quarter, Day of week, week day name, day of year, week of year )
├── Feature Encoding
│   ├────── Ordinal Encoding
│   ├── LabelEncoding
│   ├── Target Guided ordinal encoding
│   ├────── One hot encoding
│   ├── KDD orange
│   ├── Mean Encoding
│   └── Counter/frequency encoding
├── Removing Outliers
│   ├── Isolation Forest
│   ├── KNN
│   ├── PCA
│   └── Elliptical envelope
├── Feature Selection
│   ├── Chi squared (Not working perfectly)
│   ├── RFE (Not working on all the data)
│   ├── Lasso (works perfectly)
│   ├── Random Forest
│   ├── lgbm (works perfectly)
│   └── Remove zero variance features
├── Imbalanced Dataset (Not done yet)
│   ├── Ensemble techniques automatically handle imbalanced datasets
│   ├── Undersampling (Not a good idea)
│   ├── Oversampling 
│   ├── SMOTE
│   └── Isolation Forest
└── Next Step
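
To make a few of the steps in the tree concrete, here is a minimal sketch that chains median imputation, zero-variance removal and z-score scaling with scikit-learn. It is not the project's own code: the toy array and the step names are assumptions, and DashB.ai may implement these operations differently.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy numerical data (made up): the second column has a missing value,
# the third column is constant.
X = np.array([
    [1.0, 10.0, 5.0],
    [2.0, np.nan, 5.0],
    [3.0, 30.0, 5.0],
    [4.0, 40.0, 5.0],
])

numeric_pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),          # "Replace with median"
    ("drop_constant", VarianceThreshold(threshold=0.0)),   # zero-variance columns
    ("zscore", StandardScaler()),                          # "ZScore"
])

X_clean = numeric_pipeline.fit_transform(X)
print(X_clean.shape)  # (4, 2): the constant column was dropped
```

Swapping a step, for example StandardScaler for MinMaxScaler, only changes one entry in the steps list, which is roughly how user-selected operations could be mapped onto a pipeline.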

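The "Making Time Features" step in the tree above can be illustrated with the standard library alone. The helper name extract_time_features and the sample timestamp are hypothetical, and the project itself may derive these features in another way (for instance with pandas).

```python
from datetime import datetime


def extract_time_features(ts: datetime) -> dict:
    """Expand one timestamp into several calendar features (hypothetical helper)."""
    return {
        "day": ts.day,
        "month": ts.month,
        "year": ts.year,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
        "quarter": (ts.month - 1) // 3 + 1,
        "day_of_week": ts.weekday(),            # Monday = 0
        "weekday_name": ts.strftime("%A"),
        "day_of_year": ts.timetuple().tm_yday,
        "week_of_year": ts.isocalendar()[1],    # ISO 8601 week number
    }


features = extract_time_features(datetime(2021, 3, 15, 14, 30, 45))
print(features["quarter"], features["day_of_year"], features["week_of_year"])  # 1 74 11
```
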
Directory Tree

├── accounts 
│   └─────────── # handles login, signup and password recovery. 
├── DashB
│   └─────────── # main folder contains wsgi, routing, settings and urls.
├── data
│   └─────────── # main folder for performing pipeline.
├── Viz
│   └─────────── # project app for the data visualization tool.
├── static
│   └─────────── # contains static files.
├── media
│   └─────────── # storage folder of uploaded media.
├── templates
│   └─────────── # contains landing page templates
├── manage.py
├── requirements.txt
├── LICENSE
├── README.md
└── db.sqlite3

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

Team


Sumit

License

Contact

Sumit Kumar - email me at sksumit068@gmail.com

Project Link: https://github.com/IMsumitkumar/No-code-ML-platform-DashB.ai

References

Credits

The HTML templates are taken from open source projects.
Modifications were made by me.
