- Python >=3.10 and <=3.11
This is an implementation of the indexing pipeline, which by default stores indexes locally.
- From a command prompt, create and activate a virtual environment, and install the dependencies using requirements.txt.
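  For example, on Windows (a minimal sketch; adjust the requirements file path to your checkout):
  ```
  python -m venv .venv
  .venv\Scripts\activate
  pip install -r requirements.txt
  ```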
- Manually create the folders below:
  - For file system (see the mkdir sketch after this list):
    ```
    C:\Temp\unittest\infy_dpp_processor\STORAGE
    C:\Temp\unittest\infy_dpp_processor\STORAGE\data\input
    C:\Temp\unittest\infy_dpp_processor\STORAGE\data\config
    ```
    OR
  - For cloud storage:
    Make `input` and `config` folders inside the `data` folder relative to your cloud storage path, i.e. `DPP_STORAGE_ROOT_URI` in the script.
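  For the file-system option, the folders can be created from a Windows command prompt (a sketch; `mkdir` creates the intermediate directories when command extensions are enabled, which is the default):
  ```
  mkdir C:\Temp\unittest\infy_dpp_processor\STORAGE\data\input
  mkdir C:\Temp\unittest\infy_dpp_processor\STORAGE\data\config
  ```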
- Keep the input files and config files in the correct folders (check the script for config file names).
- Based on where you are running the script from, use/modify the config file:
  - Local system (a copy sketch follows this list):
    Take config files from `\config\dev\testing\`
    OR
  - Container image in VM:
    Refer to config files from `\config\dev\`
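  For the local-system case, a copy sketch (`<repo_root>` is a placeholder for your checkout path; the destination is the config folder created earlier):
  ```
  xcopy <repo_root>\config\dev\testing\*.* C:\Temp\unittest\infy_dpp_processor\STORAGE\data\config /Y
  ```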
- In the .env files, provide values against `DPP_STORAGE_ACCESS_KEY=` and `DPP_STORAGE_SECRET_KEY=`.
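  For reference, a minimal `.env` sketch (the values below are placeholders for your storage credentials):
  ```
  DPP_STORAGE_ACCESS_KEY=<your-access-key>
  DPP_STORAGE_SECRET_KEY=<your-secret-key>
  ```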
- If a centralized vector DB is being used to store indexes, then:
  - `infy_db_service` is expected to be running or deployed.
  - Modify the indexing pipeline input config file to enable only `infy_db_service` under `vectordb` and `sparseindex` of the `DbIndexer` processor config, and provide the `db_service_url`.
  - The URLs below are supposed to be added in the config against `db_service_url` (replace the hostname with the hostname where `infy_db_service` is deployed):
    - http://<hostname>:8005/api/v1/sparsedb/saverecords
    - http://<hostname>:8005/api/v1/vectordb/saverecords
- Provide the `index_name` and enable the index under the `DbIndexer` processor config, as shown below:
"DbIndexer": { "embedding": {}, "index": { "enabled": true, "index_name": "", "index_id": "" }, "storage": { "vectordb": { "faiss": {}, "infy_db_service": { "enabled": true, "configuration": { "db_service_url": "http://localhost:8005/api/v1/vectordb/saverecords", "model_name": "all-MiniLM-L6-v2", "collections": [ { "collection_name": "documents", "collection_secret_key": "", "chunk_type": "" } ] } } }, "sparseindex": { "bm25s": {}, "infy_db_service": { "enabled": true, "configuration": { "db_service_url": "http://localhost:8005/api/v1/sparsedb/saverecords", "method_name": "bm25s", "collections": [ { "collection_name": "documents", "collection_secret_key": "", "chunk_type": "" } ] } } } } }
- The indexing pipeline creates an `index_id`.
- Run the provided scripts for testing the indexing pipeline, e.g. `test_indexing_script_local_to_file_sys.ps1`.
  NOTE: While running the indexing script, ignore the "list index out of range" error from the Content Extractor processor for now.
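  For example, from a PowerShell prompt in the folder containing the script (assuming your execution policy permits running local scripts):
  ```
  .\test_indexing_script_local_to_file_sys.ps1
  ```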
- Before building the package, add values for `DPP_STORAGE_ACCESS_KEY` and `DPP_STORAGE_SECRET_KEY` in the `.env.tf` file.
- Run `BuildPackage.bat`.
- The package will be available at `apps\infy_dpp_processor\target`.
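  A typical build-and-verify sequence from the repository root (a sketch; only `BuildPackage.bat` comes from the steps above, the `dir` check is illustrative):
  ```
  BuildPackage.bat
  dir apps\infy_dpp_processor\target
  ```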
- Copy the folders below to the machine where you have access to create a docker image:
  - `apps\infy_dpp_processor\target`
  - `MyProgramFiles` (refer to `docs/notebook/src/use_cases/dpp/installation.ipynb`)

  The folder structure should look as below:
  ```
  <folder_root_path>
      /dpp_processor_app
      /Dockerfile
      /MyProgramFiles
  ```
- Create and activate a virtual environment, and install the packages.
- Create the docker image:
  ```
  docker build -t <ImageURI> .
  ```
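  If the image is built on one machine and deployed from a registry, it can be pushed with the standard docker command (`<ImageURI>` is the same tag used in the build step):
  ```
  docker push <ImageURI>
  ```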
- Create the `MyProgramFiles` folder (refer to `docs/notebook/src/use_cases/dpp/installation.ipynb`).
- Copy the package from `apps\infy_dpp_processor\target` to the target server machine where you want to deploy.
- Create and activate a virtual environment:
  ```
  python -m venv .venv
  source ./.venv/bin/activate
  ```
- Upgrade pip:
  ```
  pip install --upgrade pip
  ```
- Install the required dependencies:
  ```
  pip install -r requirements.txt
  ```