Argus Panoptes, in ancient Greek mythology, was a giant with a hundred eyes and a servant of the goddess Hera. His many eyes made him an excellent watchman, as some of his eyes would always remain open while the others slept, allowing him to be ever-vigilant.
Classic OCR (Optical Character Recognition) models lack the ability to reason about context when extracting information from documents. In this project we demonstrate how a hybrid approach that combines OCR with a multimodal Large Language Model (LLM) produces better results without any pre-training.
This solution uses Azure Document Intelligence combined with GPT-4 Vision. Each tool has its own strengths, and the hybrid approach is better than either of them alone.
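To make the idea concrete, here is a minimal, hypothetical sketch of the hybrid flow; it is not the project's actual backend code. OCR text from Document Intelligence and an image of the page are sent together to a vision-capable GPT-4 deployment, along with an extraction prompt and a target schema. The endpoints, keys, and deployment name are placeholders.

```python
# Hypothetical sketch of the hybrid OCR + vision-LLM flow (not the actual backend code).
import base64
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient
from openai import AzureOpenAI

def extract(page_png_path: str, prompt: str, schema_json: str) -> str:
    with open(page_png_path, "rb") as f:
        image_bytes = f.read()

    # 1) OCR: get the full text layout from Azure Document Intelligence.
    di = DocumentAnalysisClient("<doc-intelligence-endpoint>", AzureKeyCredential("<di-key>"))
    ocr_text = di.begin_analyze_document("prebuilt-layout", image_bytes).result().content

    # 2) Reasoning: give a vision-capable GPT-4 deployment both the OCR text and the page image.
    llm = AzureOpenAI(azure_endpoint="<openai-endpoint>", api_key="<openai-key>",
                      api_version="2024-02-01")
    response = llm.chat.completions.create(
        model="<vision-capable-deployment>",  # e.g. a GPT-4 Turbo or GPT-4 Omni deployment
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"{prompt}\n\nTarget schema:\n{schema_json}\n\nOCR text:\n{ocr_text}"},
                {"type": "image_url", "image_url": {
                    "url": "data:image/png;base64," + base64.b64encode(image_bytes).decode()}},
            ],
        }],
    )
    return response.choices[0].message.content
```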
Notes:
- The Azure OpenAI model needs to be vision capable, e.g. GPT-4T-0125, GPT-4T-0409, or GPT-4 Omni
- Frontend: A Streamlit Python web-app for user interaction. UNDER CONSTRUCTION
- Backend: An Azure Function for core logic, Cosmos DB for auditing, logging, and storing output schemas, Azure Document Intelligence, GPT-4 Vision and a Logic App for integrating with Outlook Inbox.
- Demo: Sample documents, system prompts, and output schemas.
Before deploying the solution, you need to create an OpenAI resource and deploy a model that is vision capable.
- Create an OpenAI Resource:
  - Follow the instructions here to create an OpenAI resource in Azure.
- Deploy a Vision-Capable Model:
  - Ensure the deployed model supports vision, such as GPT-4T-0125, GPT-4T-0409, or GPT-4-Omni.
Click the button below to deploy directly to Azure:

Deploy to Azure

This offers a one-click deployment without cloning the code. Alternatively, follow the instructions below.
- Prerequisites:
  - Install Azure Developer CLI.
  - Ensure you have access to an Azure subscription.
  - Create an OpenAI resource and deploy a vision-capable model.
- Deployment Steps:
  - Run the following command to deploy all resources:
    `azd up`
- Bicep Template Deployment:
  - Use the provided `main.bicep` file to deploy resources manually:

    ```bash
    az deployment group create --resource-group <your-resource-group> --template-file main.bicep
    ```
Note: After deployment, wait about 10 minutes for the Docker images to be pulled. You can check the progress in your Function App > Deployment Center > Logs.
To run the Streamlit app `app.py` located in the `frontend` folder, follow these steps:
- Install the required dependencies by running the following command in your terminal:
  `pip install -r frontend/requirements.txt`
- Rename the `.env.temp` file to `.env`:
  `mv frontend/.env.temp frontend/.env`
- Populate the `.env` file with the necessary environment variables. Open the `.env` file in a text editor and provide the required values for each variable.
- Assign a Cosmos DB and Blob Storage role to your principal ID:
  Get the principal ID of the currently signed-in user:
  `az ad signed-in-user show --query id -o tsv`
  Then, create the Cosmos and Blob role assignments:

  ```bash
  az cosmosdb sql role assignment create \
    --principal-id "<principal-id>" \
    --resource-group "<resource-group-name>" \
    --account-name "<cosmos-account-name>" \
    --role-definition-name "Cosmos DB Built-in Data Contributor" \
    --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.DocumentDB/databaseAccounts/<cosmos-account-name>"

  az role assignment create \
    --assignee "<principal-id>" \
    --role "Storage Blob Data Contributor" \
    --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
  ```
- Start the Streamlit app by running the following command in your terminal:
  `streamlit run frontend/app.py`
You can connect an Outlook inbox to send incoming attachments directly to the blob storage and trigger the extraction process. A Logic App has already been built for this. The only thing you need to do is open the resource "LogicAppName", add a trigger, and connect it to your Outlook inbox. Open this Microsoft Learn page, search for "Add a trigger to check incoming email", follow the described steps, and then activate the Logic App with the "Run" button.
- Upload PDF Files:
  - Navigate to the `sa-uniqueID` storage account and the `datasets` container.
  - Create a new folder called `default-dataset` and upload your PDF files (or upload programmatically, as in the sketch after this list).
- View Results:
  - Processed results will be available in your Cosmos DB database under the `doc-extracts` collection and the `documents` container (see the query sketch after this list).
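If you would rather upload the PDFs programmatically than through the portal, the following is a minimal sketch using the `azure-storage-blob` SDK. The storage account name and the local file name are placeholders, and authentication with `DefaultAzureCredential` assumes your principal holds the Storage Blob Data Contributor role assigned earlier.

```python
# Hedged sketch: upload a local PDF to the datasets container under default-dataset/.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

account_url = "https://<sa-uniqueID>.blob.core.windows.net"  # placeholder storage account
service = BlobServiceClient(account_url, credential=DefaultAzureCredential())
container = service.get_container_client("datasets")

local_pdf = "sample-invoice.pdf"  # hypothetical local file
with open(local_pdf, "rb") as data:
    # Blob "folders" are just name prefixes, so this implicitly creates default-dataset/.
    container.upload_blob(name=f"default-dataset/{local_pdf}", data=data, overwrite=True)
print(f"Uploaded {local_pdf}")
```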
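To inspect the processed results outside the portal, you can query the container with the `azure-cosmos` SDK. This is a hedged sketch: the account name is a placeholder, the database and container names follow the description above, and it assumes the Cosmos DB Built-in Data Contributor role assignment from the setup steps.

```python
# Hedged sketch: read a few processed extraction results from Cosmos DB.
from azure.identity import DefaultAzureCredential
from azure.cosmos import CosmosClient

client = CosmosClient(
    "https://<cosmos-account-name>.documents.azure.com:443/",  # placeholder account
    credential=DefaultAzureCredential(),
)
# Assumed layout: database "doc-extracts", container "documents" (as described above).
container = client.get_database_client("doc-extracts").get_container_client("documents")

items = container.query_items(
    query="SELECT TOP 5 * FROM c",
    enable_cross_partition_query=True,
)
for item in items:
    print(item.get("id"))
```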
The input to the model consists of two main components: a model prompt and a JSON template with the schema of the data to be extracted.
The prompt is a textual instruction explaining what the model should do, including the type of data to extract and how to extract it. Here are a couple of example prompts:
- Default Prompt: Extract all data from the document.
- Example Prompt: Extract all financial data, including transaction amounts, dates, and descriptions, from the document. For date extraction, use American formatting.
The JSON template defines the schema of the data to be extracted. This can be an empty JSON object `{}` if the model is supposed to create its own schema. Alternatively, it can be more specific to guide the model on what data to extract or for further processing in a structured database. Here are some examples:
- Empty JSON Template (default):
{}
- Specific JSON Template Example:
{
"transactionDate": "",
"transactionAmount": "",
"transactionDescription": ""
}
By providing a prompt and a JSON template, users can control the behavior of the model to extract specific data from their documents in a structured manner.
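As a small illustration of how the template can also be used downstream, the snippet below checks that a model response actually contains every field defined in the template. This is just a hedged sketch of one possible post-processing step, not part of the project's pipeline.

```python
import json

# The specific template example from above.
template = {
    "transactionDate": "",
    "transactionAmount": "",
    "transactionDescription": "",
}

def matches_template(model_output: str, template: dict) -> bool:
    """Return True if the model's JSON output contains every key defined in the template."""
    try:
        data = json.loads(model_output)
    except json.JSONDecodeError:
        return False
    return all(key in data for key in template)

example = '{"transactionDate": "2024-02-01", "transactionAmount": "100.00", "transactionDescription": "Coffee"}'
print(matches_template(example, template))  # True
```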
- JSON Schemas created using JSON Schema Builder.
This README file provides an overview and quickstart guide for deploying and using Project ARGUS. For detailed instructions, consult the documentation and code comments in the respective files.