
AZ-900 Study Buddy


Problem Statement

credits: xkcd.com

Studying for exams can often be overwhelming and time-consuming, especially when there is a large volume of material to review. It's common to need quick access to information on a specific topic, yet struggle to find the relevant content. To tackle this, a RAG (Retrieval-Augmented Generation) application built with the course materials as its knowledge base can streamline the study process and save time and effort.

While preparing for the Azure Fundamentals AZ-900 exam, I developed the AZ-900 Study Buddy, a RAG application designed to address these challenges. The project not only helps with exam preparation but also serves as a practical example of how RAG can be applied to educational use cases in general.

Dataset

  • All data for this project is available in the data folder.
  • Raw data: my AZ900 course notes, see the module_*.md files.
  • readme_notes_with_ids.json: the raw data after loading and chunking into JSON format.
  • az900_notes_with_vectors.pkl: the RAG knowledge base source, structured data with vector embeddings.
  • ground-truth-data.pkl: ground truth data containing the questions used in retrieval evaluation and RAG evaluation.
  • RAG_evaluation.pkl: LLM-generated responses together with cosine similarity, latency, cost and token usage data from the RAG evaluation.
  • results_backup.json: a backup of the JSON-format questions generated for the ground truth data (to skip question regeneration if an error occurs during parsing).
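If you want to inspect these artifacts yourself, here is a minimal sketch with pandas; the schema of each file is an assumption (e.g. that the pickles hold DataFrames), so adjust to what you actually find:

# Minimal sketch for inspecting the data artifacts.
# File schemas are assumptions; adjust to what you find.
import json
import pandas as pd

with open("data/readme_notes_with_ids.json") as f:
    chunks = json.load(f)
print(len(chunks), "chunks; first one:", chunks[0])

notes = pd.read_pickle("data/az900_notes_with_vectors.pkl")  # knowledge base with embeddings
ground_truth = pd.read_pickle("data/ground-truth-data.pkl")  # evaluation questions
print(notes.head())
print(ground_truth.head())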

Tech Stack

  • Python 3.12.
  • Docker.
  • Elasticsearch for knowledge base retrieval.
  • OpenAI or Ollama for LLM.
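To show how these pieces fit together, below is a minimal sketch of the RAG flow, assuming the az900_course_notes index (named in the health checks later in this README), an illustrative "text" field, and a made-up prompt; the application's actual backend may differ.

# Minimal RAG-flow sketch: retrieve from Elasticsearch, then generate with OpenAI.
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def rag(question):
    # 1. Retrieve the most relevant course-note chunks (field name is assumed).
    hits = es.search(
        index="az900_course_notes",
        query={"match": {"text": question}},
        size=5,
    )["hits"]["hits"]
    context = "\n\n".join(hit["_source"]["text"] for hit in hits)
    # 2. Generate an answer grounded in the retrieved context.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(rag("What is an Azure availability zone?"))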

Must-have

  • Docker, since the knowledge base is indexed into an Elasticsearch Docker container. You can get Docker at Download Docker Desktop.
  • An OpenAI account with credits, or an Ollama Docker container with the Phi3 model pulled.

Good-to-have

  • Basic knowledge of these Docker commands for troubleshooting: docker ps, docker ps -a, docker start, docker compose up, docker compose down.

Methodology

There are 4 main steps. The first 2 steps build and validate the knowledge base:

  • Ingestion: load, chunk and embed raw data into structured data with embeddings.
  • Evaluation: evaluate the best retrieval and RAG methods for this use case.
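To make the ingestion step concrete, here is a minimal sketch of load-chunk-embed; the chunking rule (splitting on markdown headers) and the embedding model are illustrative assumptions, not necessarily what the notebooks use.

# Ingestion sketch: load, chunk and embed the raw markdown notes.
# Chunking rule and model name are assumptions for illustration.
import glob
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")  # assumed embedding model

documents = []
for path in sorted(glob.glob("data/module_*.md")):
    with open(path) as f:
        text = f.read()
    # Naive chunking: one chunk per markdown section.
    for i, chunk in enumerate(text.split("\n## ")):
        documents.append({
            "id": f"{path}#{i}",
            "text": chunk,
            "text_vector": model.encode(chunk).tolist(),
        })
print(len(documents), "chunks embedded")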

Once the best retrieval and RAG methods are determined, the AZ900 Study Buddy app is built in the remaining 2 steps:

  • Interface: streamlit application and RAG backend.
  • Containerization: dockerize application, knowledge base and LLM using docker compose.

Environment setup

One-time setup to reproduce any part of this repo on your workstation. You can skip to Running the application (using docker compose) if you only want to run the application.

  • conda create -n llm-zoomcamp-env python
  • conda activate llm-zoomcamp-env
  • conda install pip
  • pip install pipenv
  • pipenv install tqdm notebook==7.1.2 openai elasticsearch pandas jupyter sentence_transformers==2.7.0 python-dotenv seaborn streamlit
  • pipenv shell: This allows you to run commands such as python xxx.py, streamlit run xxx.py in the virtual environment.
  • Make sure the Docker service is up and running!
  • git clone this repo to your local workstation.
  • Prepare the .env file.
  • Start elasticsearch.

.env template

  • Refer to ChatGPT's suggestions on how to view .env* files, otherwise you can't proceed with the next step.
  • Rename .env_template to .env.
  • (Not applicable for running the application, only for code reproduction) Copy your OpenAI API key into the environment variable "OPENAI_API_KEY".
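Since python-dotenv is part of the environment setup above, here is a minimal sketch to confirm the .env file is picked up:

# Minimal check that the .env file is loaded correctly.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY not set - check your .env file"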

Elastic Search

Not applicable for running the application, as it is handled by docker compose. For some of the code reproduction, you will need a running Elasticsearch container.

  • To check if the Elasticsearch container is running, go to http://localhost:9200/.
  • If not, either start an existing Elasticsearch container using docker start elasticsearch, or start a new one with the following command:
docker run -it \
    --name elasticsearch \
    -p 9200:9200 \
    -p 9300:9300 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.4.3
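As an alternative to the browser check, a minimal sketch using the elasticsearch Python client (assuming the 8.x client installed during environment setup):

# Quick health check from Python instead of the browser.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
print(es.ping())  # True when Elasticsearch is reachable
print(es.info())  # cluster name, version, etc.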

Running the application (using docker compose)

This is for reproducing the AZ900 Study Buddy application only (without reproducing the other steps, e.g. ingestion and evaluation).

  • Make sure the Docker service is up and running!

Pick your LLM

  • There are 2 LLM options available in the application: GPT-4o-mini and Ollama Phi3. Decide which one to use and have it ready before starting the application. See the app setup flow diagram.
    • GPT-4o-mini: you will be asked to select GPT-4o-mini and input your OpenAI API key on the application screen. You can proceed directly to the docker compose step.
    • Ollama Phi3: you need a running Ollama container with the Phi3 model inside. To check if you have an existing Ollama container, execute docker ps -a.
      • If one exists but stopped, start container with docker start ollama.
      • If there's no existing Ollama container, execute the following:
docker run -it \
    --rm \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
  • Once the Ollama container is running (you can see it in docker ps), check whether it has a Phi3 model:
    • docker exec -it ollama bash, then ollama list.
    • If a phi3 model is listed, you can use Ollama Phi3 to test the app.
    • Otherwise, execute ollama pull phi3.
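Optionally, you can confirm that Phi3 responds before starting the app. This sketch uses Ollama's OpenAI-compatible API on the mapped port 11434:

# Sanity check that the Ollama container answers with Phi3.
# The api_key is unused by Ollama but required by the client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
response = client.chat.completions.create(
    model="phi3",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)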

Start up application

  • Download this zip file az900_study_buddy.zip to your Desktop.
  • Unzip the downloaded az900_study_buddy.zip. You should see a new folder az900_study_buddy on your Desktop.
  • Execute the following in command prompt:
    • cd Desktop/az900_study_buddy.
    • Run docker compose up. This takes about 2-3 minutes. Once it's done, you should see output like the docker compose completed screenshot.
  • Optional health checks:
    • Cross-check that the containers az900_study_buddy_app and elasticsearch are up with docker ps. If you plan to use Ollama, you should see the ollama container as well; refer to the Pick your LLM section above.
    • To ensure that the data is indexed in Elasticsearch, go to http://localhost:9200/_cat/indices?v. You should see az900_course_notes under "index".
  • To access the application, open http://localhost:8501/ in a browser. If successful, you should now see the application screen.
  • Note: if you executed docker compose up -d (detached mode), it only tells you the containers have started, not when the Streamlit app is ready, so you might encounter a 404 Not Found when loading the application. Refresh the page after a minute or so.
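If you prefer scripting these health checks, here is a minimal sketch (assuming the requests package is installed):

# Optional: automate the health checks above.
import requests

# The knowledge base index should be listed by Elasticsearch.
indices = requests.get("http://localhost:9200/_cat/indices?v").text
assert "az900_course_notes" in indices, "knowledge base not indexed yet"

# A 200 response means the Streamlit app has finished starting up.
print("app status:", requests.get("http://localhost:8501/").status_code)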

How to use the application

  • You must choose an LLM before you can start asking questions!
  • If you're using GPT-4o-mini, select "OpenAI GPT-4o-mini" in the drop-down list, input your API key and press submit.
  • If you're using Ollama, select "Ollama Microsoft Phi3" and press submit.
  • You can now start asking questions. See the sample interaction screenshot taken while using Ollama.

Stopping the application

  • Execute the following in command prompt:
    • cd Desktop/az900_study_buddy.
    • docker compose down
  • You can also stop your Ollama container if you're done with it.

Evaluation Criteria

For peer review: A full list of LLM zoomcamp project evaluation criteria is available here.

  • Problem description
  • RAG flow
    • Both a knowledge base and an LLM are used in the RAG flow. See Methodology.
  • Retrieval evaluation
    • Metrics used: hit rate and Mean Reciprocal Rank (MRR); see the sketch after this list.
    • Multiple retrieval approaches are evaluated: text search, vector search and hybrid search, together with fine-tuning of the boost parameter. Refer to the notebook retrieval_evaluation.ipynb for details.
    • Hybrid search with document reranking is also evaluated. Refer to the notebook hybrid_search_reranking.ipynb for details.
    • Conclusion: hybrid search (without document reranking) has the best retrieval performance.
  • RAG evaluation
    • Metrics used: cosine similarity, latency, cost and token usage.
    • Models evaluated: GPT-4o-mini, GPT-3.5-turbo.
    • Multiple RAG approaches are evaluated against the metrics above. Refer to the notebook RAG_evaluation.ipynb.
    • Conclusion: GPT-4o-mini performed best and is used in the application.
  • Interface
    • A Streamlit UI is used. See app.py.
  • Ingestion pipeline
  • Monitoring
    • Currently unavailable.
  • Containerization
  • Reproducibility
  • Best practices
  • Bonus points (not covered in the course)
    • Currently not deployed to the cloud.
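As referenced under Retrieval evaluation above, here is a minimal sketch of the two retrieval metrics; the field names on the ground-truth records (question, document_id) are assumed for illustration:

# Sketch of the retrieval metrics: hit rate and MRR.
# Field names on the ground-truth records are assumptions.
def evaluate(ground_truth, search_fn, k=5):
    hits, reciprocal_ranks = 0, 0.0
    for record in ground_truth:
        results = search_fn(record["question"])[:k]
        ranks = [i for i, doc in enumerate(results) if doc["id"] == record["document_id"]]
        if ranks:
            hits += 1                                 # relevant doc is in the top k
            reciprocal_ranks += 1.0 / (ranks[0] + 1)  # 1 / rank of the first hit
    n = len(ground_truth)
    return {"hit_rate": hits / n, "mrr": reciprocal_ranks / n}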

Future work

  • Monitoring using PostgreSQL and Grafana.
  • Migrate data to MongoDB.
  • Deploy app onto cloud: HuggingFace Spaces, Streamlit Cloud, AWS EC2.
  • A project report discussing the methodology and findings in more detail.

Credits

A big thanks to Alexey Grigorev and the DataTalks.Club for the LLM Zoomcamp course, which made this project possible! 😃

Contact info for peers

  • For any questions regarding peer review, please reach out to me on DataTalks.Club user "Vivien S.".
  • Alternatively, you can create a new issue under Issues.

About

My Retrieval Augmented Generation project for LLM Zoomcamp. A RAG app for Azure foundation course AZ900.
