Machine Learning Project Template

V1.5.0

Here is where you should have the description of the project

What is in this repository?

Some things about the project

Setup steps

Click the "Use this template" button to create a new repository from this one.

Tested on:

Ubuntu 18.04 with Nvidia K80
- Azure Data Science Virtual Machine template
Ubuntu 18.04 with Nvidia V100
Ubuntu 20.04 with Nvidia GeForce RTX 2070 SUPER
Ubuntu 18.04 with Nvidia GeForce 3070
Ubuntu 20.04 with Nvidia GeForce 3070
MacBook Pro 2015
MacBook Pro 2017

NOTE: Both setup options assume that your CUDA and GPU drivers work. If they do not, check the Troubleshooting section below.

Prerequisites:

Python3
Miniconda
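
Before choosing a setup option, it can help to confirm that both prerequisites are visible from your shell. The following is a minimal sketch, not part of this repository: the function name check_prerequisites is hypothetical, and it only checks that the running interpreter is Python 3 and that a conda executable is on PATH.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 0)):
    """Return a dict describing whether the listed prerequisites are available."""
    return {
        # True when the running interpreter is at least `min_python`
        "python_ok": sys.version_info[:2] >= min_python,
        # Full path to the conda executable, or None if it is not on PATH
        "conda_path": shutil.which("conda"),
    }


if __name__ == "__main__":
    status = check_prerequisites()
    print(f"Python {sys.version.split()[0]} ok: {status['python_ok']}")
    print(f"conda found at: {status['conda_path']}")
```

If conda_path is None, Miniconda is either not installed or not initialized for the current shell.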

Option 1: Anaconda (recommended)

Initialize a conda environment: conda create -n <env name>
Activate the conda environment: conda activate <env name>
To install PyTorch for your OS, follow the instructions here (they change often, so it's hard to standardize)
Install packages:
- conda install -c pytorch -c conda-forge -c bioconda --file=requirements.txt
- If the above doesn't work, try: pip install --user -r requirements.txt
Set up DVC and other libraries: chmod 755 setup.sh; ./setup.sh

Option 2: Directly on machine

Install requirements: pip install --user -r requirements.txt
Set up DVC and other libraries: chmod 755 setup.sh; ./setup.sh

Tools in this repository and recommended toolset

Data versioning: DVC
Model versioning + Training monitoring + Hyperparameter sweeps: Weights and Biases (please sign up first)
Distributed training: Weights and Biases
Coding: JupyterLab, or Colab if you want free GPUs and data privacy isn't an issue
Training framework: PyTorch + PyTorch Lightning
Tabular data management: Pandas
Plotting: Matplotlib + Plotly
Deployment: FastAPI

Setup guides and docs

Training models

The train.py script contains an outline for training PyTorch models effectively with PyTorch Lightning. Once you've customized the script, you can run it with: python3 main.py --train
You can find more tutorials on customization in the PyTorch Lightning section below.

DVC

DVC will be initialized by default in the root directory by setup.sh
The rest of the data versioning guide can be found here: https://dvc.org/doc/start/data-versioning

PyTorch Lightning

PyTorch Lightning will speed up a lot of your PyTorch workflow, especially in the training phase.
Overview
Passing PyTorch Lightning Trainer options as command-line arguments, e.g. python main.py --gpus 2 --max_steps 10 --limit_train_batches 10 --any_trainer_arg x
Turn your existing models into Lightning models
Setting up mid-training checkpoints
Auto Learning Rate Finder
Other advanced guides for things like multi-gpu training
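
The command-line example above can be pictured as flags being collected into keyword arguments for a Trainer. This is a minimal sketch using only argparse; it is not how main.py in this repository does it, nor Lightning's own API. The names build_parser and trainer_kwargs are hypothetical, and only three flags from the example are declared.

```python
import argparse


def build_parser():
    """Declare a few of the Trainer-style flags shown in the README example."""
    parser = argparse.ArgumentParser(description="Hypothetical training CLI")
    parser.add_argument("--gpus", type=int, default=None)
    parser.add_argument("--max_steps", type=int, default=None)
    parser.add_argument("--limit_train_batches", type=int, default=None)
    return parser


def trainer_kwargs(argv):
    """Parse `argv` and keep only the flags that were actually set."""
    args = build_parser().parse_args(argv)
    return {name: value for name, value in vars(args).items() if value is not None}


# Explicit argument list instead of sys.argv, so the sketch runs anywhere
print(trainer_kwargs(["--gpus", "2", "--max_steps", "10"]))
# prints {'gpus': 2, 'max_steps': 10}
```

The resulting dict could then be unpacked into a Trainer constructor. Check which arguments your installed Lightning version accepts, since they have changed between releases.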

Weights and Biases

To set up Weights and Biases, first sign up and create a project, or get added to an existing project
Log in via the terminal and add Weights and Biases to get started. Instructions here
How to log via PyTorch Lightning to W&B?
How to integrate Ray Tune with Weights and Biases?
How to use built-in hyperparameter sweep functionality?
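
For the built-in sweep functionality, a sweep is described by a configuration with a search method, a metric to optimize, and a parameter space. The sketch below shows that shape as a plain Python dict. The metric name val_loss, the parameter names, and their ranges are placeholders chosen for illustration, not values from this repository.

```python
# Hypothetical sweep definition; metric and parameter names are placeholders
sweep_config = {
    "method": "random",  # W&B also accepts "grid" and "bayes"
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-2},  # continuous range
        "batch_size": {"values": [16, 32, 64]},  # discrete choices
    },
}

# With wandb installed and logged in, the dict would be used roughly like this:
#   import wandb
#   sweep_id = wandb.sweep(sweep_config, project="my-project")  # placeholder project name
#   wandb.agent(sweep_id, function=train, count=10)  # `train` is your own training function
```

The metric name has to match a value your training code actually logs, otherwise the sweep has nothing to optimize.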

Jupyter Lab

Starting JupyterLab: jupyter lab
User guide, key mapping and shortcut customization

Deploying models with ease

In the deployment section there is a pre-built FastAPI template that lets you import a model, set the correct data type, and return predictions
Once you've customized it according to your needs, you can run: python3 main.py --deploy
There's also a customizable Dockerfile for building a Docker image of it
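
This README mentions two modes for main.py: --train and --deploy. One way such an entry point might dispatch between them is sketched below. This is an assumption for illustration, not the repository's actual main.py: run_training and run_deployment are hypothetical placeholders, and unrecognized flags are simply passed through untouched.

```python
import argparse


def run_training(extra_args):
    # Placeholder: a real project would start PyTorch Lightning training here
    return "training"


def run_deployment(extra_args):
    # Placeholder: a real project would start the FastAPI app here
    return "deployment"


def main(argv):
    """Dispatch to training or deployment based on the mode flag."""
    parser = argparse.ArgumentParser(description="Hypothetical project entry point")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--train", action="store_true")
    mode.add_argument("--deploy", action="store_true")
    # parse_known_args leaves extra flags (e.g. trainer options) in `extra`
    args, extra = parser.parse_known_args(argv)
    if args.train:
        return run_training(extra)
    if args.deploy:
        return run_deployment(extra)
    return None  # no mode selected


print(main(["--train", "--max_steps", "10"]))  # prints training
```

Using a mutually exclusive group makes argparse reject a command line that asks for both modes at once.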

Troubleshooting

No Python3 Kernel in Jupyter Lab

Solve by doing the following:

python3 -m pip install ipykernel
python3 -m ipykernel install --user
Restart Jupyter Lab

DVC/TensorBoard/JupyterLab command not found

Run the command as a module instead: python3 -m <command here>
This points to an issue with the Python versions on the machine
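
The fallback above can be expressed as a small helper: use the command if it is on PATH, otherwise fall back to running it as a module with the current interpreter. This is a minimal sketch under the assumption that the package is installed for that interpreter and supports being run with -m. The name resolve_command is hypothetical.

```python
import importlib.util
import shutil
import sys


def resolve_command(command, module=None):
    """Return an argv prefix that launches `command`, or None if unavailable."""
    module = module or command
    if shutil.which(command):
        # The executable is on PATH, so it can be called directly
        return [command]
    if importlib.util.find_spec(module) is not None:
        # Installed for this interpreter but not on PATH: use `python -m`
        return [sys.executable, "-m", module]
    return None


if __name__ == "__main__":
    print(resolve_command("dvc"))
```

A result of None suggests the package was installed for a different Python than the one you are running.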

Can't access jupyter from local machine after cloud deployment

Run Jupyter Lab with: jupyter lab --no-browser --ip=0.0.0.0 --port=8888

GPU driver / tensorflow-gpu issues

1. Check which GPU you have on the machine and make sure it's supported.
2. Install CUDA if you don't have it installed. The best way to do this is to run conda install tensorflow-gpu, which should take care of it.
3. If the above doesn't work, check whether you have installed the correct version of tensorflow-gpu for the installed CUDA drivers: nvidia-smi
4. Check which version of the GPU drivers is needed for the GPU, and the CUDA versions those drivers support, from here.
5. When you have found the right version, follow this guide and replace the CUDA driver files in the commands with the right versions.
6. If nothing works, try a pre-built VM template on the cloud.
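
The first checks in this list can be partly automated. The sketch below is an assumption for illustration, not a script shipped with this repository: it reports whether nvidia-smi runs successfully and, if PyTorch happens to be installed, whether it can see a CUDA device. The name gpu_report is hypothetical, and tensorflow-gpu is not checked.

```python
import importlib.util
import shutil
import subprocess


def gpu_report():
    """Collect basic GPU diagnostics; None means the check could not be run."""
    report = {"nvidia_smi": None, "torch_cuda": None}

    smi = shutil.which("nvidia-smi")
    if smi:
        try:
            result = subprocess.run([smi], capture_output=True, text=True, timeout=30)
            # A zero exit code means the driver answered
            report["nvidia_smi"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            report["nvidia_smi"] = False

    if importlib.util.find_spec("torch") is not None:
        try:
            import torch

            report["torch_cuda"] = bool(torch.cuda.is_available())
        except ImportError:
            # torch is present but failed to import (e.g. broken install)
            report["torch_cuda"] = False

    return report


if __name__ == "__main__":
    print(gpu_report())
```

If nvidia_smi is True but torch_cuda is False, the driver is working and the mismatch is more likely between the installed framework build and the CUDA version.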

Folders and files

artifacts
config
data
deployment
documentation
evaluation
models
notebooks
testing
training
.gitignore
Dockerfile
README.md
__init__.py
main.py
requirements.txt
setup.sh

aasimsani/ml_project_template
