Commit

Merge branch 'master' into thomasht86/support-document-expiry
thomasht86 committed Sep 25, 2024
2 parents e397868 + 3ed4948 commit 69c5b45
Showing 3 changed files with 217 additions and 78 deletions.
@@ -13,7 +13,7 @@
"</picture>\n",
"\n",
"\n",
"## ColPali Ranking Experiments on DocVQA\n",
"# ColPali Ranking Experiments on DocVQA\n",
"\n",
"This notebook demonstrates how to reproduce the ColPali results on [DocVQA](https://huggingface.co/datasets/vidore/docvqa_test_subsampled) with Vespa. The dataset consists of PDF documents with questions and answers. \n",
"\n",
@@ -65,8 +65,9 @@
"id": "yGfNhRP4RKBJ"
},
"source": [
"## Load the model\n",
"\n"
"### Load the model\n",
"\n",
"Load the model, also choose the correct device and model weights."
]
},
{
@@ -531,7 +532,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure Vespa\n",
"### Configure Vespa\n",
"[PyVespa](https://pyvespa.readthedocs.io/en/latest/) helps us build the [Vespa application package](https://docs.vespa.ai/en/application-packages.html).\n",
"A Vespa application package consists of configuration files, schemas, models, and code (plugins).\n",
"\n",
@@ -720,7 +721,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy to Vespa Cloud\n",
"### Deploy to Vespa Cloud\n",
"\n",
"With the configured application, we can deploy it to [Vespa Cloud](https://cloud.vespa.ai/en/).\n",
"\n",
@@ -748,6 +749,7 @@
"source": [
"from vespa.deployment import VespaCloud\n",
"import os\n",
"os.environ['TOKENIZERS_PARALLELISM'] = \"false\"\n",
"\n",
"# Replace with your tenant name from the Vespa Cloud Console\n",
"tenant_name = \"vespa-team\" \n",
@@ -823,7 +825,7 @@
"id": "j2pUyGjYf4Wv"
},
"source": [
"## Run queries and evaluate effectiveness"
"### Run queries and evaluate effectiveness"
]
},
{
@@ -1038,7 +1040,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"### Conclusion\n",
"The binary representation of the patch embeddings reduces the storage by 32x, and using hamming distance instead of dotproduc saves us about 4x in computation compared to the float-float model or the float-binary model (which only saves storage). Using a re-ranking step with only depth 10, we can improve the effectiveness of the binary-binary model to almost match the float-float MaxSim model. The additional re-ranking step only requires that we pass also the float query embedding version without any additional storage overhead. \n",
" "
]
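
The storage and distance claims in the conclusion above can be illustrated with a small pure-Python sketch. It is not the notebook's code: the helper names (`binarize`, `hamming`, `storage_bytes`) are hypothetical, the 8-dimensional vectors are toy data, and the 128-dimension float32 embedding size is an assumption used only for the arithmetic.

```python
# Illustrative sketch only; helper names and sizes are assumptions, not the notebook's API.

def binarize(vec):
    """Pack the sign of each dimension into one int (1 bit per dimension)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


def storage_bytes(dims, bits_per_dim):
    """Bytes needed to store one embedding vector."""
    return dims * bits_per_dim // 8


# Toy 8-dimensional query and patch embeddings.
query = [0.3, -1.2, 0.7, 0.1, -0.4, 0.9, -0.2, 0.5]
patch = [0.1, -0.8, -0.6, 0.2, -0.1, 0.4, 0.3, 0.6]

# Hamming distance between the binarized vectors (XOR, then count set bits).
distance = hamming(binarize(query), binarize(patch))

# Assumed 128-dim embeddings: float32 (32 bits/dim) versus binary (1 bit/dim).
float_bytes = storage_bytes(128, 32)
binary_bytes = storage_bytes(128, 1)
ratio = float_bytes // binary_bytes

print(distance, float_bytes, binary_bytes, ratio)
```

Under these assumptions the storage ratio is simply 32 bits versus 1 bit per dimension. The sketch does not measure the roughly 4x computation saving quoted above, which depends on the serving engine.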
@@ -14,6 +14,8 @@
"\n",
"# Vespa 🤝 ColPali: Efficient Document Retrieval with Vision Language Models\n",
"\n",
"For a simpler example of using ColPali, where we use one Vespa document = One PDF page, see [simplified-retrieval-with-colpali](https://pyvespa.readthedocs.io/en/latest/examples/simplified-retrieval-with-colpali-vlm_Vespa-cloud.html).\n",
"\n",
"This notebook demonstrates how to represent [ColPali](https://huggingface.co/vidore/colpali) in Vespa. ColPali is a powerful visual language model that can generate embeddings for images and text. \n",
"In this notebook, we will use ColPali to generate embeddings for images of PDF _pages_ and store them in Vespa. \n",
"We will also store the base64 encoded image of the PDF page and some meta data like title and url. We will then demonstrate how to retrieve the pdf pages using the embeddings generated by ColPali.\n",
Expand Down Expand Up @@ -41,9 +43,12 @@
"\n",
"Then we store colbert embeddings in Vespa and use the [long-context variant](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/)\n",
"where we represent the colbert embeddings per document with the tensor `tensor(page{}, patch{}, v[128])`. This enables \n",
"us to use the PDF as the document (retrievable unit), storing the page embeddings in the same document.\n",
"us to use the PDF as the document (retrievable unit), storing the page embeddings in the same document. \n",
"\n",
"The upside of this is that we do not need to duplicate document level meta data like title, url, etc. But, the downside is that \n",
"we cannot retrieve using the ColPali embeddings directly, but need to use the extracted text for retrieval. The ColPali embeddings are only used for reranking the results. \n",
"\n",
"For a simpler example where we use one vespa document = One PDF page, see [this notebook](simplified-retrieval-with-colpali-vlm_Vespa-cloud.ipynb).\n",
"For a simpler example where we use one vespa document = One PDF page, see [simplified-retrieval-with-colpali](https://pyvespa.readthedocs.io/en/latest/examples/simplified-retrieval-with-colpali-vlm_Vespa-cloud.html).\n",
"\n",
"We also store the base64 encoded image, and page meta data like title and url so that we can display it in the result page, but also\n",
"use it for RAG with powerful LLMs with vision capabilities. \n",
Expand Down Expand Up @@ -799,6 +804,7 @@
"source": [
"from vespa.deployment import VespaCloud\n",
"import os\n",
"os.environ['TOKENIZERS_PARALLELISM'] = \"false\"\n",
"\n",
"# Replace with your tenant name from the Vespa Cloud Console\n",
"tenant_name = \"vespa-team\" \n",

Large diffs are not rendered by default.
