ESI-DCAFM-TACO-VDSP Summer School
11.07. - 22.07.2022 in Vienna, Austria, official website here.
Date: 12.07.2022, 14:00 - 17:00
Tutor: Pavol Harar (orcid)
Title: A tool a day keeps the bad review away
Slides: presentation.pdf
The main aim of this session is to get hands-on experience with (some of) the tools presented in the slides. The session does not focus on data or modeling but on the technical aspects around your ML projects. Due to the number of participants, it is not possible to help everybody, so if you feel overwhelmed, feel free to just follow the presentation.
The following material builds on a Jupyter notebook in which a model was previously trained. The pre-trained model(s) are assumed to be stored and ready to use. In total, there are 7 exercises. Each exercise contains a rough (intentionally incomplete) guide to completing the assignment. Often, the project files themselves are the solution, so you can help yourself by peeking into them whenever you feel stuck. All the commands below are written for the Ubuntu operating system, but links to relevant resources are provided for users of other operating systems.
By the end of the day, you should be able to:
- run a machine learning project within a virtual environment on your own computer
- put your code into a git repository like this one
- specify, build, and run a Docker container which runs your project
- track your experiments in a nice and useful web interface
- wrap your project into an interactive web application for common users
- configure your project to run on Binder, e.g. for reviewers of your paper
- deploy the web application such that it is available from the Internet
The assignment: Run a Jupyter notebook server from within a virtual environment. In case you are more comfortable with or prefer Conda, feel free to use it instead of virtualenv.
- Create a new `MLSummerSchoolVienna2022` folder (this will be the root of this project).
- Change directory to your newly created project root folder.
- Install `virtualenv` or `miniconda`.
- Download the `notebook.ipynb` file from this repository.
- (virtualenv) Create a new empty file called `requirements.txt`.
- (conda) Create a new empty file called `requirements.yml`.
- Fill the requirements file with the dependencies of `notebook.ipynb` + `jupyter` and `voila` (or just download it from the repository).
- (virtualenv) Create a virtual environment in the `venv` folder and install the dependencies from the requirements file.
- (conda) Recreate the virtual environment using the requirements file.
- Activate the virtual environment.
- Run the `jupyter notebook` server.
- Run cell 1 to see whether all imports work.
- Create a readme.md file with just the name of the project.
- Optionally, train the models and save them as files.
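The virtualenv path through the steps above could look roughly like the following sketch. It is an illustration under assumptions, not the repository's own setup: the `requirements.txt` written here lists only `jupyter` and `voila` (the packages imported by `notebook.ipynb` still have to be added), and `setup_venv` is a hypothetical helper name introduced for this example.

```shell
# Create the project root and the files the exercise asks for.
mkdir -p MLSummerSchoolVienna2022
cd MLSummerSchoolVienna2022

# Assumption: only jupyter and voila are listed here; append the packages
# imported by notebook.ipynb (or download requirements.txt from the repository).
cat > requirements.txt <<'EOF'
jupyter
voila
EOF

# A readme containing just the project name.
printf '%s\n' 'MLSummerSchoolVienna2022' > readme.md

# Hypothetical helper: creates the environment in ./venv, activates it in the
# current shell, and installs the dependencies. It needs virtualenv and
# network access, so it is only defined here and not called automatically.
setup_venv() {
    virtualenv venv &&
    . venv/bin/activate &&
    pip install -r requirements.txt
}
```

After calling `setup_venv` from the project root, `jupyter notebook` starts the server from inside the environment. Conda users would instead recreate the environment with `conda env create -f requirements.yml` and activate it with `conda activate` followed by the environment name.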
The assignment: Create a new repository for this project and push an initial commit into it.
- On Ubuntu, install git with `sudo apt install git-all`. For other OS consult the user guide. We will use git without a graphical user interface (so on Windows, please use the Git Bash emulator, which should be installed automatically with git).
- Create an account on github.com.
- Make sure you have an SSH key generated. If not, generate it using this guide.
- Go to `github.com > settings > SSH and GPG keys`.
- Copy your public SSH key and add it to your GitHub keys. On Ubuntu you can print your key using `cat ~/.ssh/id_rsa.pub`.
- Verify the SSH key authentication works with `ssh -T git@github.com`.
- In case of a different OS or any problems, consult the GitHub guide.
- Go to github.com and create a new repository called `MLSummerSchoolVienna2022` (change the name of the repository to your liking, but do not forget to change it in the commands below).
- Do not check the automatic creation of readme, license or other files.
- Go to your project root folder.
- Create a `.gitignore` file with `venv` and `.ipynb_checkpoints` in it.
- Run `git init`.
- Run `git remote add origin git@github.com:<your_username>/MLSummerSchoolVienna2022.git`.
- If you use `git` for the first time, you might be asked to configure your user name and commit email address:
  - Run `git config --global user.name "Your Name"`.
  - Run `git config --global user.email "your_email@example.com"`.
- Run `git add .`.
- Run `git commit -m "Initial commit"`.
- Run `git push origin main`.
- Now your changes should be visible in your repository on github.com.
- In case your HEAD branch is not called `main` but `master`, change the commands accordingly to avoid problems.
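Put together, the local part of the steps above might look like the sketch below. It makes a few assumptions that are not in the exercise: the identity is configured for this repository only (without `--global`) using placeholder values, the remote URL still contains the `<your_username>` placeholder, and the branch is renamed to `main` explicitly so that the result does not depend on git's default branch name.

```shell
# Work inside the project root (created here if it does not exist yet).
mkdir -p MLSummerSchoolVienna2022
cd MLSummerSchoolVienna2022

# Keep the virtual environment and notebook checkpoints out of the repository.
printf '%s\n' 'venv' '.ipynb_checkpoints' > .gitignore

git init

# Assumption: identity set for this repository only, with placeholder values;
# use your real name and email address.
git config user.name "Your Name"
git config user.email "your_email@example.com"

# Placeholder remote: replace <your_username> with your GitHub user name.
# The quotes keep the shell from treating < and > as redirections.
git remote add origin "git@github.com:<your_username>/MLSummerSchoolVienna2022.git"

git add .
git commit -m "Initial commit"

# Rename the current branch to main so it matches the push command.
git branch -M main
```

Once the remote points at a real repository and the SSH key is registered, `git push origin main` uploads the commit.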
The assignment: Run a Jupyter notebook server in a Docker container.
- If you use a different OS than Ubuntu, check the Docker installation guide.
- On Ubuntu install with `sudo apt install docker.io`.
- Check if it is installed correctly with `sudo docker run hello-world`.
- Run `sudo docker pull intelliseqngs/ubuntu-minimal-20.04:3.0.5`.
- Add a file `.dockerfile` into your project. (We use a nonstandard name because we do not want Binder and Heroku to use our Dockerfile.)
- Base your `.dockerfile` on `intelliseqngs/ubuntu-minimal-20.04:3.0.5`.
- Fill the `.dockerfile` with commands to copy and install your project.
- Reference on writing the Dockerfile is here.
- In case you have problems, consult the solution in .dockerfile.
- Run `sudo docker build -t mlssv2022:latest -f .dockerfile .`. If you have a problem with DNS, try restarting Docker with `sudo pkill docker` and `sudo service docker restart`.
- If the build fails with "killed", increase the RAM available to Docker (Docker Preferences > Resources), e.g. to 4 or 6 GB.
- Run `sudo docker run --rm -p 8888:8888 mlssv2022:latest jupyter notebook --allow-root --ip 0.0.0.0`.
  - `-p` maps port 8888 of the container to port 8888 on the host
  - `--allow-root` is needed since everything in the container runs as root
  - `--ip 0.0.0.0` exposes the Jupyter server so that the host can reach it
- Visit `localhost:8888` in your browser and paste the token printed in the terminal; the Jupyter notebook should now run.
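As a starting point for the `.dockerfile` step, the sketch below writes one possible version of the file from the shell. It is not the repository's actual `.dockerfile`. It assumes that the base image does not ship Python (so `python3` and `python3-pip` are installed with apt) and uses `/app` as an arbitrary working directory chosen for this example.

```shell
# Write a minimal .dockerfile into the current directory (the project root).
# The quoted heredoc delimiter keeps the content literal.
cat > .dockerfile <<'EOF'
FROM intelliseqngs/ubuntu-minimal-20.04:3.0.5

# Assumption: Python is not preinstalled in the base image.
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Arbitrary working directory for this sketch.
WORKDIR /app

# Install dependencies first so this layer is reused when only code changes.
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy the rest of the project (notebooks, pre-trained models).
COPY . .

EXPOSE 8888
EOF
```

The `docker build` and `docker run` commands from the list above then work with this file unchanged. Since `COPY . .` copies everything in the build context, adding a `.dockerignore` file that lists `venv` keeps the local virtual environment out of the image.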
The assignment: Run and explore the wandb examples.
Optional assignment: Adjust `notebook.ipynb` such that training is tracked in wandb.
- Create an account at wandb.ai.
- Log in to your account and try the Example (wandb.me/intro) and run it until the "Run experiment" cell finishes.
- Check the results in the wandb.ai account.
- Check also these examples https://github.com/wandb/examples.
- If you feel motivated, open one of the Google Colab notebooks from Monday's tutorial and change it such that it tracks the training into wandb, and view the results in the web interface.
The assignment: Create a simple interactive webapp using ipywidgets and Voila.
Optional assignment: Make the app run in a Docker container.
- Go back to the Jupyter notebook server running in the virtual environment.
- Create a new python3 notebook called `webapp.ipynb`.
- Make sure you trained the model or downloaded the pre-trained model(s) from the repository to the project root.
- Build a simple interactive webapp using ipywidgets that lets the user input data (e.g. as a URL to a file), loads a pre-trained model, and computes and displays the prediction.
- Click on the Voila button in the Jupyter notebook menu to test whether everything runs as a web app.
- If you wish to have a functional Docker container with the webapp inside, update the `.dockerfile` to include the web app related files (`webapp.ipynb` and pre-trained model(s)) and rebuild your Docker image.
- Push your changes to git.
The assignment: Make your project run for free using myBinder.org.
- Make your git repo public if it is not already.
- Go to mybinder.org and fill in the form:
  - Repository URL: `https://github.com/<your_username>/mlssv2022` (use the name of your own repository)
  - Git ref: `main` (or `master` depending on your repo)
  - Path to a URL (not a file): `/voila/render/webapp.ipynb`
- Copy the binder markup badge into your `readme.md`.
- Wait for the app to start in Binder. It will take quite some time, but Binder is a free service, so...
- Check how to use Docker with Binder if needed here.
- When the app runs, have fun... You can try this image for example.
*An example badge to run the webapp from this repository on Binder is below the title of this exercise. Try clicking it.
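The badge that the mybinder.org form produces can also be assembled by hand. The sketch below builds the launch link and the badge markup from three placeholder variables and appends the badge to `readme.md`. The variable names and values are hypothetical, and the URL pattern is written from memory of the links the form generates, so treat the form's own output as authoritative.

```shell
# Hypothetical placeholders: replace with your own GitHub user, repository
# name, and branch.
GH_USER="your_username"
GH_REPO="MLSummerSchoolVienna2022"
GIT_REF="main"   # or master, depending on your repository

# URL-encoded form of /voila/render/webapp.ipynb (each "/" becomes %2F).
URLPATH="%2Fvoila%2Frender%2Fwebapp.ipynb"

BINDER_URL="https://mybinder.org/v2/gh/${GH_USER}/${GH_REPO}/${GIT_REF}?urlpath=${URLPATH}"
BADGE="[![Binder](https://mybinder.org/badge_logo.svg)](${BINDER_URL})"

# Append the badge to the readme and print the launch link for reference.
printf '%s\n' "$BADGE" >> readme.md
printf '%s\n' "$BINDER_URL"
```

Opening the printed link in a browser is equivalent to clicking the badge.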
The assignment: Deploy your project to a free instance on Heroku.
- Create a free account on Heroku. It might still ask you to fill in your credit card details though.
- Add a `Procfile` with `web: voila webapp.ipynb --no-browser --port $PORT`.
- Add `runtime.txt` into the project folder with `python-3.8.10`.
- Push changes to the repo.
- Install the Heroku CLI by following the official guide.
- Deploy your app to Heroku using git:
  - Run `heroku update` to make sure the Heroku CLI is up to date.
  - Run `heroku create` to create a new Heroku app.
  - Run `git push heroku master` (or `main`, depending on your branch) to deploy.
  - Optionally set `heroku ps:scale web=1`.
  - Open your app with `heroku open`.
- Or follow the deployment guide directly from Voila.
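The two configuration files from the steps above can be created from the shell, for example as in this sketch. The file contents are taken directly from the list above. Running the commands in the project root is an assumption of this example.

```shell
# Procfile: tells Heroku how to start the web process. The quoted heredoc
# delimiter keeps $PORT literal, so it is expanded only when the app starts
# on Heroku, which provides the PORT environment variable.
cat > Procfile <<'EOF'
web: voila webapp.ipynb --no-browser --port $PORT
EOF

# runtime.txt: the Python version requested from Heroku.
printf '%s\n' 'python-3.8.10' > runtime.txt
```

Both files need to be committed (`git add Procfile runtime.txt` followed by `git commit`) before the `git push heroku` step, because Heroku builds the app from the pushed commit.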