Spring AI RAG implementation and related projects

This project is the core project where we will be testing various features of the Spring AI framework. We will be listening to an AWS SQS queue to consume messages. These will be unpacked and saved into our vector database, which will allow us to use RAG to enhance our customer query data.
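As a rough sketch of that flow, the listener below consumes a message from SQS and stores it in the vector store. It assumes Spring Cloud AWS for the @SqsListener support and the Spring AI VectorStore abstraction; the class name and queue name are illustrative, not the project's actual code:

import io.awspring.cloud.sqs.annotation.SqsListener;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Component;

import java.util.List;
import java.util.Map;

@Component
public class CustomerQueryListener {

    private final VectorStore vectorStore;

    public CustomerQueryListener(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Consume a raw customer message from SQS (queue name is illustrative),
    // wrap it in a Spring AI Document and persist it to the vector store,
    // where it becomes available for RAG similarity searches.
    @SqsListener("customer-query-queue")
    public void onMessage(String payload) {
        vectorStore.add(List.of(new Document(payload, Map.of("source", "sqs"))));
    }
}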

Getting Your Development Environment Set Up

Recommended Versions

| Recommended | Reference | Notes |
| ----------- | --------- | ----- |
| Java 23 JDK | sdk install java 23-zulu | Java 23 will be used in these projects |
| IntelliJ 2024 or higher | Download | Ultimate Edition recommended. Students can get a free 120-day trial license here |
| Maven 3.9.6 or higher | Download | Installation Instructions |
| Docker | Installation Instructions | |
| Ollama | Download | Installation Instructions |
| Weather Service | Installation Instructions | |

Set up the local environment to run this application

Start Ollama with the llama3.2 local LLM:

 ollama run llama3.2

Once Ollama is running, you can start the Spring Boot application. Also, the ~/.ollama/logs folder contains a server.log file. Open it and look for the following entries:

llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Llama 3.2 3B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Llama-3.2
llama_model_loader: - kv   5:                         general.size_label str              = 3B
llama_model_loader: - kv   6:                               general.tags arr[str,6]       = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv   7:                          general.languages arr[str,8]       = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv   8:                          llama.block_count u32              = 28
llama_model_loader: - kv   9:                       llama.context_length u32              = 131072
llama_model_loader: - kv  10:                     llama.embedding_length u32              = 3072

Here we have the embedding_length. The same value must be used when the vector database properties are set in the application.yaml file, otherwise you will have chunk sizing issues when reading the data back. Take note that this value is fixed when the vector_store database table is created! Another way to get these parameters is to use a curl command:

curl http://localhost:11434/api/show -d '{
  "name": "llama3.2"
}'

Format the response as proper JSON and you will find these properties:

    "general.type": "model",
    "llama.attention.head_count": 24,
    "llama.attention.head_count_kv": 8,
    "llama.attention.key_length": 128,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.attention.value_length": 128,
    "llama.block_count": 28,
    "llama.context_length": 131072,
    "llama.embedding_length": 3072,
    "llama.feed_forward_length": 8192,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,

Choosing the right model is important. One of the limitations is the embedding dimension supported by the PGVector DB, so we need the right embedding model. In this app we will be using the mxbai-embed-large model. Reference on the following page: Ollama Embedding Models

The embedding size is now 1024. This is well within the 2000-dimension PGVector limit, and the model is supported by Ollama. To pull the model for use in Ollama, use the following command:

ollama pull mxbai-embed-large

You can test the embedding model by using the following command:

curl http://localhost:11434/api/embeddings -d '{
  "model": "mxbai-embed-large",
  "prompt": "Llamas are members of the camelid family"
}'
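With the embedding model pulled, the Ollama and PGVector pieces can be wired together in application.yaml roughly as follows. The property names follow the Spring AI Ollama and PGVector starters, but exact keys have varied between Spring AI versions, so treat this as a sketch to verify against the version in use:

spring:
  ai:
    ollama:
      chat:
        options:
          model: llama3.2
      embedding:
        options:
          model: mxbai-embed-large
    vectorstore:
      pgvector:
        index-type: HNSW
        distance-type: COSINE_DISTANCE
        dimensions: 1024   # must match the embedding model's embedding_length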

Start Open WebUI so we can get a UI for Ollama:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Start the PGVector DB:

docker run -d --name postgres -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres pgvector/pgvector:0.7.4-pg16

Configure the PGVector DB indexes: PGVector details

We also have a JPA entity that maps onto the vector table, so run the following to initialize the table so that JPA is happy (a sketch of such an entity follows the SQL below).

-- The vector type comes from the pgvector extension and
-- uuid_generate_v4() from uuid-ossp, so enable both first.
create extension if not exists vector;
create extension if not exists "uuid-ossp";

create table public.vector_store
(
    id        uuid default uuid_generate_v4() not null primary key,
    content   text,
    metadata  json,
    embedding vector(1024)
);

alter table public.vector_store
    owner to customerai;

create index spring_ai_vector_index
    on public.vector_store using hnsw (embedding public.vector_cosine_ops);
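For reference, such an entity could look roughly like the sketch below. It assumes Hibernate 6.4+ with its vector support for the embedding column mapping; the class and field names are illustrative and the project's actual entity may differ:

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.Table;
import org.hibernate.annotations.Array;
import org.hibernate.annotations.JdbcTypeCode;
import org.hibernate.type.SqlTypes;

import java.util.Map;
import java.util.UUID;

@Entity
@Table(name = "vector_store")
public class VectorStoreEntry {

    @Id
    @GeneratedValue
    private UUID id;

    @Column(columnDefinition = "text")
    private String content;

    // maps the json column via Hibernate's JSON JdbcType
    @JdbcTypeCode(SqlTypes.JSON)
    private Map<String, Object> metadata;

    // maps the vector(1024) column; requires Hibernate's pgvector support
    @JdbcTypeCode(SqlTypes.VECTOR)
    @Array(length = 1024)
    private float[] embedding;

    // getters and setters omitted for brevity
}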

Configure customerai user in PostgreSQL

For this we are simply using a local PostgreSQL instance, so we are not using encrypted passwords or anything fancy.
The following commands will set up the datasource for us:

CREATE DATABASE customerai;
CREATE USER customerai WITH PASSWORD 'customerai';
CREATE SCHEMA IF NOT EXISTS customerai AUTHORIZATION customerai;
GRANT ALL PRIVILEGES ON SCHEMA customerai TO customerai;
ALTER ROLE customerai WITH LOGIN;
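The matching datasource entries in application.yaml would then be along these lines:

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/customerai
    username: customerai
    password: customerai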

https://docs.spring.io/spring-ai/reference/api/vectordbs/pgvector.html

Setting up SpringAI with Ollama

Setting up Functions to be used by the LLM

The first function will use the weather API to retrieve current weather information. This is a free service for which we can get an API key.
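A sketch of how such a function can be registered so the LLM can call it is shown below. The request/response records and the hard-coded response are illustrative placeholders; a real implementation would call the weather API with the configured API key:

import java.util.function.Function;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;

@Configuration
public class WeatherFunctionConfig {

    public record WeatherRequest(String city) {}
    public record WeatherResponse(double temperatureCelsius, String conditions) {}

    // The @Description text is what the LLM uses to decide whether
    // this function is relevant to a given prompt.
    @Bean
    @Description("Get the current weather for a city")
    public Function<WeatherRequest, WeatherResponse> currentWeather() {
        // Illustrative placeholder: call the real weather API here.
        return request -> new WeatherResponse(21.0, "Partly cloudy");
    }
}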

Advisors in Spring AI

Spring AI has the concept of advisors. Advisors are used to provide additional information to the LLM, and to transform the input and output of the LLM. One big use case is sharing context across multiple calls.
Below is a very good article on advisors:
Advisor Implementation
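As a small illustration of the pattern, the sketch below attaches a QuestionAnswerAdvisor to a ChatClient so that each prompt is enriched with similar documents from the vector store (the RAG step). Advisor APIs have shifted between Spring AI milestones, so treat the exact class and builder names as assumptions to verify:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.vectorstore.VectorStore;

public class RagChat {

    private final ChatClient chatClient;

    public RagChat(ChatModel chatModel, VectorStore vectorStore) {
        // The advisor retrieves documents similar to the user question
        // and appends them to the prompt before it reaches the LLM.
        this.chatClient = ChatClient.builder(chatModel)
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
                .build();
    }

    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}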

Reference Documentation

For further reference, please consider the following sections:

Testcontainers support

This project uses Testcontainers at development time.

Testcontainers has been configured to use the following Docker images:

Maven Parent overrides

Due to Maven's design, elements are inherited from the parent POM to the project POM. While most of the inheritance is fine, it also inherits unwanted elements like <license> and <developers> from the parent. To prevent this, the project POM contains empty overrides for these elements. If you manually switch to a different parent and actually want the inheritance, you need to remove those overrides.
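For illustration, the empty overrides in a generated POM typically look like this (the exact elements depend on the Spring Initializr version):

<licenses>
    <license/>
</licenses>
<developers>
    <developer/>
</developers>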
