This repository contains code for the O'Reilly Live Online Training for Deep Learning for Modern AI.
This training provides the theory and practical concepts for a comprehensive introduction to machine learning and deep learning with PyTorch: the foundational knowledge needed to successfully build and train GenAI and multimodal models. We make our way through several real-world case studies, including object recognition and text classification, making this session an excellent crash course in deep learning with PyTorch.
We use tools including large pre-trained models and model training dashboards to set up reproducible deep learning experiments and build machine learning models optimized for performance. Code examples throughout the training help solidify the theoretical concepts as they are introduced. Models like Stable Diffusion, Llama 3, GPT, and BERT are highlighted as we uncover the training and optimization strategies that get the most out of our models' performance, speed, and memory usage.
All data for the art classification example can be downloaded here. Note that it is about 6 GB, so the download may take a while.
- First steps with Deep Learning with MNIST
- RNNs and CNNs
- Working with a pre-trained VGG-11 model
- Fine-tuning BERT vs ChatGPT
- Fine-tuning OpenAI: the code to compare against BERT
- Fine-tuning GPT-2 to convert English to LaTeX
- Fine-tuning Llama 3 to be a chatbot
- Production Optimization
- Quantizing Llama 3
- Testing different fine-tuning configurations
- Distilling BERT models
- Intro to Multimodality: An introduction to multimodality with CLIP and SHAP-E + Diffusion
- Whisper: An introduction to using Whisper for audio transcription
- Llava: Using an open-source multi-turn multimodal engine
- CLIP-based Stock Image Search: Using CLIP to search through a library of images
- Visual Q/A
  - Constructing and Training our model
  - Using our VQA system
app.py is a Flask app that uses a VGG16 model to classify the art style of an uploaded image. The app currently supports 10 different art styles:
- Abstract Expressionism
- Art Nouveau (Modern)
- Baroque
- Expressionism
- Impressionism
- Northern Renaissance
- Post-Impressionism
- Realism
- Romanticism
- Symbolism
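As a rough illustration of how ten raw class scores can become a ranked list of styles with confidence scores, the sketch below applies a softmax and sorts the result. The label strings are copied from the app's sample response shown further down; the `rank_styles` helper, the use of softmax, and the example scores are assumptions made for illustration and are not taken from app.py.

```python
import math

# Label strings as they appear in the app's sample JSON response.
ART_STYLES = [
    "Abstract_Expressionism",
    "Art_Nouveau_Modern",
    "Baroque",
    "Expressionism",
    "Impressionism",
    "Northern_Renaissance",
    "Post_Impressionism",
    "Realism",
    "Romanticism",
    "Symbolism",
]


def rank_styles(logits):
    """Convert raw class scores into (label, probability) pairs, best first.

    Hypothetical helper: assumes one score per entry in ART_STYLES.
    """
    if len(logits) != len(ART_STYLES):
        raise ValueError("expected one score per art style")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(ART_STYLES, probs), key=lambda pair: pair[1], reverse=True)


# Made-up scores, purely for demonstration.
ranked = rank_styles([0.1, 0.3, 1.2, 0.5, 0.4, 1.5, 0.9, 1.1, 1.0, 0.8])
print(ranked[0][0])
```

In a real model the scores would come from the final layer of the network rather than a hand-written list.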
Start the Flask app:
python app.py
This should start the Flask app and make it available at http://localhost:5000.
To classify an image, you can use a cURL request in the following format:
curl -X POST -F 'image=@/path/to/your/image.jpg' http://localhost:5000/predict
Replace /path/to/your/image.jpg with the path to your own image. The response will be in JSON format and will contain each art style with its associated confidence score, as in the example below:
curl -X POST -F \
'image=@images/Venus_and_Adonis_by_Peter_Paul_Rubens.jpg' \
http://localhost:5000/predict
[
["Northern_Renaissance",0.13392961025238037],
["Realism",0.12794768810272217],
["Romanticism",0.12592236697673798],
["Post_Impressionism",0.11863630264997482],
["Baroque",0.11325731128454208],
["Symbolism",0.1120268702507019],
["Expressionism",0.08971412479877472],
["Impressionism",0.086906298995018],
["Art_Nouveau_Modern",0.05910796299576759],
["Abstract_Expressionism",0.03255145251750946]]
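The response above is a JSON list of `[label, score]` pairs, and in this example the scores sum to roughly 1. A minimal sketch for pulling out the top few predictions follows; `top_predictions` is a hypothetical helper, and the sample payload is a shortened, rounded copy of the response above.

```python
import json

# Shortened, rounded copy of the sample response, used only for demonstration.
sample = '[["Northern_Renaissance", 0.1339], ["Realism", 0.1279], ["Romanticism", 0.1259]]'


def top_predictions(response_text, k=3):
    """Return the k highest-scoring (label, score) pairs with readable labels."""
    predictions = json.loads(response_text)
    # Sort defensively rather than relying on the server's ordering.
    predictions.sort(key=lambda pair: pair[1], reverse=True)
    return [(label.replace("_", " "), score) for label, score in predictions[:k]]


for label, score in top_predictions(sample, k=2):
    print(f"{label}: {score:.1%}")
```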
If there is an error with the request, such as no image being provided, the response will contain an error message instead:
{
"error": "No image provided"
}
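The same request can be made from Python. The sketch below uses only the standard library to send the image as multipart form data under the `image` field, mirroring the cURL command, and raises an exception when the response carries an `error` key. The helper names `build_multipart` and `classify_image` are hypothetical, and the handling of non-2xx status codes is an assumption, since the status code the app returns on error is not stated here.

```python
import json
import mimetypes
import os
import urllib.error
import urllib.request
import uuid


def build_multipart(field, filename, data, boundary):
    """Encode one file as a multipart/form-data body, as curl -F does."""
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail


def classify_image(path, url="http://localhost:5000/predict"):
    """POST an image to the running app and return the list of predictions."""
    boundary = uuid.uuid4().hex
    with open(path, "rb") as f:
        body = build_multipart("image", os.path.basename(path), f.read(), boundary)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        # Assumption: an error response may arrive with a non-2xx status
        # and a JSON body like the one shown above.
        payload = json.load(err)
    if isinstance(payload, dict) and "error" in payload:
        raise RuntimeError(payload["error"])
    return payload
```

With the app running, `classify_image("images/Venus_and_Adonis_by_Peter_Paul_Rubens.jpg")` would be expected to return the same list of label and score pairs as the cURL example.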
Sinan Ozdemir is the Founder and CTO of LoopGenius, where he uses state-of-the-art AI to help people create and run their businesses. Sinan is a former lecturer of Data Science at Johns Hopkins University and the author of multiple textbooks on data science and machine learning. Additionally, he is the founder of the recently acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a master’s degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco, CA.
- Check out Deep Learning Illustrated: A best seller by Jon Krohn and a very visual introduction to deep learning
- Deep Learning course: lecture slides and lab notebooks: The course covers the basics of Deep Learning, with a focus on applications.