Merge branch 'master' into thomasht86/support-document-expiry

vespa-engine · Sep 25, 2024 · 69c5b45
2 parents e397868 + 3ed4948
commit 69c5b45

Showing 3 changed files with 217 additions and 78 deletions.
diff --git a/docs/sphinx/source/examples/colpali-benchmark-vqa-vlm_Vespa-cloud.ipynb b/docs/sphinx/source/examples/colpali-benchmark-vqa-vlm_Vespa-cloud.ipynb
@@ -13,7 +13,7 @@
         "</picture>\n",
         "\n",
         "\n",
-        "## ColPali Ranking Experiments on DocVQA\n",
+        "# ColPali Ranking Experiments on DocVQA\n",
         "\n",
         "This notebook demonstrates how to reproduce the ColPali results on [DocVQA](https://huggingface.co/datasets/vidore/docvqa_test_subsampled) with Vespa. The dataset consists of PDF documents with questions and answers. \n",
         "\n",
@@ -65,8 +65,9 @@
         "id": "yGfNhRP4RKBJ"
       },
       "source": [
-        "## Load the model\n",
-        "\n"
+        "### Load the model\n",
+        "\n",
+        "Load the model, choosing the correct device and model weights."
       ]
     },
     {
@@ -531,7 +532,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Configure Vespa\n",
+        "### Configure Vespa\n",
         "[PyVespa](https://pyvespa.readthedocs.io/en/latest/) helps us build the [Vespa application package](https://docs.vespa.ai/en/application-packages.html).\n",
         "A Vespa application package consists of configuration files, schemas, models, and code (plugins).\n",
         "\n",
@@ -720,7 +721,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Deploy to Vespa Cloud\n",
+        "### Deploy to Vespa Cloud\n",
         "\n",
         "With the configured application, we can deploy it to [Vespa Cloud](https://cloud.vespa.ai/en/).\n",
         "\n",
@@ -748,6 +749,7 @@
       "source": [
         "from vespa.deployment import VespaCloud\n",
         "import os\n",
+        "os.environ['TOKENIZERS_PARALLELISM'] = \"false\"\n",
         "\n",
         "# Replace with your tenant name from the Vespa Cloud Console\n",
         "tenant_name = \"vespa-team\" \n",
@@ -823,7 +825,7 @@
         "id": "j2pUyGjYf4Wv"
       },
       "source": [
-        "## Run queries and evaluate effectiveness"
+        "### Run queries and evaluate effectiveness"
       ]
     },
     {
@@ -1038,7 +1040,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Conclusion\n",
+        "### Conclusion\n",
         "The binary representation of the patch embeddings reduces storage by 32x, and using Hamming distance instead of dot product saves about 4x in computation compared to the float-float model or the float-binary model (which only saves storage). Using a re-ranking step with a depth of only 10, we can improve the effectiveness of the binary-binary model to almost match the float-float MaxSim model. The additional re-ranking step only requires that we also pass the float version of the query embedding, without any additional storage overhead. \n",
         " "
       ]
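
As a rough illustration of the storage and distance claims in the Conclusion cell above, the following Python sketch packs a 128-dimensional float vector into sign bits and compares packed vectors with Hamming distance. It is not code from the notebook: the helper names `binarize` and `hamming`, the `> 0` threshold, and the sample vector are assumptions made for illustration, and it uses only the standard library.

```python
import struct


def binarize(vec):
    """Pack sign bits (1 where the value is > 0) into bytes, MSB first.

    Hypothetical helper; the threshold at 0 is an assumption.
    """
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def hamming(a, b):
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


dim = 128  # patch embedding dimension, as in tensor(page{}, patch{}, v[128])

# Made-up sample vector with alternating signs, for illustration only.
vec = [(-1) ** i * (i + 1) / dim for i in range(dim)]

float_bytes = len(struct.pack(f"{dim}f", *vec))  # 4 bytes per float32
packed = binarize(vec)                           # 1 bit per dimension
ratio = float_bytes // len(packed)

print(float_bytes, len(packed), ratio)
print(hamming(packed, bytes(len(packed))))
```

With 4-byte floats, 128 dimensions occupy 512 bytes while the packed form occupies 16 bytes, which is where the 32x storage figure comes from. The roughly 4x compute saving is an empirical claim of the notebook and is not demonstrated by this sketch.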

diff --git a/docs/sphinx/source/examples/colpali-document-retrieval-vision-language-models-cloud.ipynb b/docs/sphinx/source/examples/colpali-document-retrieval-vision-language-models-cloud.ipynb
@@ -14,6 +14,8 @@
                 "\n",
                 "# Vespa 🤝 ColPali: Efficient Document Retrieval with Vision Language Models\n",
                 "\n",
+                "For a simpler example of using ColPali, where one Vespa document = one PDF page, see [simplified-retrieval-with-colpali](https://pyvespa.readthedocs.io/en/latest/examples/simplified-retrieval-with-colpali-vlm_Vespa-cloud.html).\n",
+                "\n",
                 "This notebook demonstrates how to represent [ColPali](https://huggingface.co/vidore/colpali) in Vespa. ColPali is a powerful visual language model that can generate embeddings for images and text. \n",
                 "In this notebook, we will use ColPali to generate embeddings for images of PDF _pages_ and store them in Vespa. \n",
                 "We will also store the base64 encoded image of the PDF page and some metadata like title and url. We will then demonstrate how to retrieve the PDF pages using the embeddings generated by ColPali.\n",
@@ -41,9 +43,12 @@
                 "\n",
                 "Then we store ColBERT embeddings in Vespa and use the [long-context variant](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/)\n",
                 "where we represent the ColBERT embeddings per document with the tensor `tensor(page{}, patch{}, v[128])`. This enables \n",
-                "us to use the PDF as the document (retrievable unit), storing the page embeddings in the same document.\n",
+                "us to use the PDF as the document (retrievable unit), storing the page embeddings in the same document. \n",
+                "\n",
+                "The upside of this is that we do not need to duplicate document-level metadata like title, url, etc. The downside is that \n",
+                "we cannot retrieve using the ColPali embeddings directly and must use the extracted text for retrieval instead. The ColPali embeddings are only used for reranking the results. \n",
                 "\n",
-                "For a simpler example where we use one vespa document = One PDF page, see [this notebook](simplified-retrieval-with-colpali-vlm_Vespa-cloud.ipynb).\n",
+                "For a simpler example where one Vespa document = one PDF page, see [simplified-retrieval-with-colpali](https://pyvespa.readthedocs.io/en/latest/examples/simplified-retrieval-with-colpali-vlm_Vespa-cloud.html).\n",
                 "\n",
                 "We also store the base64 encoded image, and page metadata like title and url so that we can display it in the result page, but also\n",
                 "use it for RAG with powerful LLMs with vision capabilities. \n",
@@ -799,6 +804,7 @@
             "source": [
                 "from vespa.deployment import VespaCloud\n",
                 "import os\n",
+                "os.environ['TOKENIZERS_PARALLELISM'] = \"false\"\n",
                 "\n",
                 "# Replace with your tenant name from the Vespa Cloud Console\n",
                 "tenant_name = \"vespa-team\" \n",

diff --git a/docs/sphinx/source/examples/simplified-retrieval-with-colpali-vlm_Vespa-cloud.ipynb b/docs/sphinx/source/examples/simplified-retrieval-with-colpali-vlm_Vespa-cloud.ipynb