This project is managed by devenv [1]. It consists of a Next.js [2] application and a Postgres database with the pgvector [3] extension installed. The query uses the L2 distance operator <-> to calculate similarity; a smaller distance means a closer match.
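To make the similarity measure concrete, here is a minimal sketch of what the <-> operator computes, together with the kind of query that could use it. The table name movies and the columns id, title, and embedding are assumptions for illustration, not taken from the project.

```typescript
// Euclidean (L2) distance between two equal-length vectors.
// This is the quantity pgvector's <-> operator returns.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Hypothetical table and column names; $1 is the query embedding.
// Ordering by distance ascending returns the most similar rows first.
const nearestMoviesSql = `
  SELECT id, title, embedding <-> $1 AS distance
  FROM movies
  ORDER BY embedding <-> $1
  LIMIT 10
`;
```

Sorting in ascending order matters here: with L2 distance, the best matches have the smallest values.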
The movie dataset [4] is fetched from Kaggle with the download script from devenv. To download the dataset, you need to set the KAGGLE_USERNAME and KAGGLE_API_KEY environment variables.
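If you want to fail early when the credentials are missing, a check along these lines could run before the download. This is only a sketch: the helper name missingKaggleVars is hypothetical and is not part of the project's download script.

```typescript
// Variable names as given in the text above.
const REQUIRED_KAGGLE_VARS = ["KAGGLE_USERNAME", "KAGGLE_API_KEY"];

// Returns the names of required variables that are unset or blank.
// Hypothetical helper, not part of the project's scripts.
function missingKaggleVars(
  env: Record<string, string | undefined>
): string[] {
  return REQUIRED_KAGGLE_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In Node you would pass process.env as the argument and abort with a message if the returned list is not empty.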
The seed script inserts the data into the database together with generated gte-small [5] embeddings. It takes about 17 minutes on my M2 MacBook Air, and the resulting database is around 200 MB.
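As a rough illustration of the insert step, the sketch below formats an embedding as pgvector's text representation and pairs it with a parameterized INSERT. The table and column names are assumptions, and the actual seed script may work differently; the dimension of 384 is the output size of gte-small.

```typescript
// gte-small produces 384-dimensional embeddings.
const GTE_SMALL_DIMENSIONS = 384;

// Formats an embedding as pgvector's text input, e.g. "[0.1,0.2,...]".
// Hypothetical helper, not taken from the project's seed script.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length !== GTE_SMALL_DIMENSIONS) {
    throw new Error(
      "expected " + GTE_SMALL_DIMENSIONS + " dimensions, got " + embedding.length
    );
  }
  if (!embedding.every((x) => isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return "[" + embedding.join(",") + "]";
}

// Hypothetical table and columns; $2 receives the output of toVectorLiteral.
const insertMovieSql =
  "INSERT INTO movies (title, embedding) VALUES ($1, $2::vector)";
```

Validating the length before inserting gives a clearer error than letting the database reject a vector of the wrong size.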