Add Amazon Knowledge Bases retriever support #4035

Merged 6 commits on Jan 24, 2024
@@ -0,0 +1,24 @@
---
hide_table_of_contents: true
---

# Knowledge Bases for Amazon Bedrock

Knowledge Bases for Amazon Bedrock is a fully managed Amazon Web Services (AWS) offering for end-to-end retrieval-augmented generation (RAG) workflows. It handles the full ingestion workflow: converting your documents into embeddings (vectors) and storing those embeddings in a specialized vector database. Knowledge Bases for Amazon Bedrock supports popular vector stores, including the vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon).

## Setup

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm i @aws-sdk/client-bedrock-agent-runtime @langchain/community
```

## Usage

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/retrievers/amazon_knowledge_bases.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>
17 changes: 17 additions & 0 deletions examples/src/retrievers/amazon_knowledge_bases.ts
@@ -0,0 +1,17 @@
import { AmazonKnowledgeBaseRetriever } from "@langchain/community/retrievers/amazon_knowledge_base";

const retriever = new AmazonKnowledgeBaseRetriever({
  topK: 10,
  knowledgeBaseId: "YOUR_KNOWLEDGE_BASE_ID",
  region: "us-east-2",
  clientOptions: {
    credentials: {
      accessKeyId: "YOUR_ACCESS_KEY_ID",
      secretAccessKey: "YOUR_SECRET_ACCESS_KEY",
    },
  },
});

const docs = await retriever.getRelevantDocuments("How are clouds formed?");

console.log(docs);
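
Each document returned by the retriever carries `metadata.source` and `metadata.score` (see `queryKnowledgeBase` in the retriever source in this PR). A minimal sketch of post-filtering results on that score follows. `RetrievedDoc`, `filterByScore`, the sample data, and the `0.5` threshold are all hypothetical names and values used for illustration; they are not part of this PR or of LangChain.

```typescript
// Minimal stand-in for the documents the retriever returns. In real code
// this would be Document from "@langchain/core/documents".
interface RetrievedDoc {
  pageContent: string;
  metadata: { source?: string; score?: number };
}

// Hypothetical helper: keep documents scoring at or above minScore,
// ordered from highest to lowest score. Missing scores count as 0.
function filterByScore(docs: RetrievedDoc[], minScore: number): RetrievedDoc[] {
  return docs
    .filter((doc) => (doc.metadata.score ?? 0) >= minScore)
    .sort((a, b) => (b.metadata.score ?? 0) - (a.metadata.score ?? 0));
}

// Made-up sample data shaped like the retriever output.
const sample: RetrievedDoc[] = [
  {
    pageContent: "Weather fronts move moist air upward.",
    metadata: { source: "s3://example-bucket/weather.txt", score: 0.62 },
  },
  {
    pageContent: "Clouds form when water vapor condenses.",
    metadata: { source: "s3://example-bucket/clouds.txt", score: 0.91 },
  },
  {
    pageContent: "Whisk the eggs before folding in the flour.",
    metadata: { source: "s3://example-bucket/recipes.txt", score: 0.18 },
  },
];

const kept = filterByScore(sample, 0.5);
console.log(kept.map((doc) => doc.metadata.source));
```

A suitable threshold depends on the embedding model and data, so any cutoff like the one above would need tuning against real queries.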
3 changes: 3 additions & 0 deletions libs/langchain-community/.gitignore
@@ -364,6 +364,9 @@ callbacks/handlers/lunary.d.ts
retrievers/amazon_kendra.cjs
retrievers/amazon_kendra.js
retrievers/amazon_kendra.d.ts
retrievers/amazon_knowledge_base.cjs
retrievers/amazon_knowledge_base.js
retrievers/amazon_knowledge_base.d.ts
retrievers/chaindesk.cjs
retrievers/chaindesk.js
retrievers/chaindesk.d.ts
13 changes: 13 additions & 0 deletions libs/langchain-community/package.json
@@ -44,6 +44,7 @@
},
"devDependencies": {
"@aws-crypto/sha256-js": "^5.0.0",
"@aws-sdk/client-bedrock-agent-runtime": "^3.496.0",
"@aws-sdk/client-bedrock-runtime": "^3.422.0",
"@aws-sdk/client-dynamodb": "^3.310.0",
"@aws-sdk/client-kendra": "^3.352.0",
@@ -160,6 +161,7 @@
},
Hey there! 👋 I noticed that a new peer dependency "@aws-sdk/client-bedrock-agent-runtime" has been added in this PR. This comment is to flag the change for maintainers to review. Keep up the great work! 🚀

"peerDependencies": {
"@aws-crypto/sha256-js": "^5.0.0",
"@aws-sdk/client-bedrock-agent-runtime": "^3.485.0",
"@aws-sdk/client-bedrock-runtime": "^3.422.0",
"@aws-sdk/client-dynamodb": "^3.310.0",
"@aws-sdk/client-kendra": "^3.352.0",
@@ -243,6 +245,9 @@
"@aws-crypto/sha256-js": {
"optional": true
},
"@aws-sdk/client-bedrock-agent-runtime": {
"optional": true
},
"@aws-sdk/client-bedrock-runtime": {
"optional": true
},
@@ -1092,6 +1097,11 @@
"import": "./retrievers/amazon_kendra.js",
"require": "./retrievers/amazon_kendra.cjs"
},
"./retrievers/amazon_knowledge_base": {
"types": "./retrievers/amazon_knowledge_base.d.ts",
"import": "./retrievers/amazon_knowledge_base.js",
"require": "./retrievers/amazon_knowledge_base.cjs"
},
"./retrievers/chaindesk": {
"types": "./retrievers/chaindesk.d.ts",
"import": "./retrievers/chaindesk.js",
@@ -1662,6 +1672,9 @@
"retrievers/amazon_kendra.cjs",
"retrievers/amazon_kendra.js",
"retrievers/amazon_kendra.d.ts",
"retrievers/amazon_knowledge_base.cjs",
"retrievers/amazon_knowledge_base.js",
"retrievers/amazon_knowledge_base.d.ts",
"retrievers/chaindesk.cjs",
"retrievers/chaindesk.js",
"retrievers/chaindesk.d.ts",
2 changes: 2 additions & 0 deletions libs/langchain-community/scripts/create-entrypoints.js
@@ -138,6 +138,7 @@ const entrypoints = {
"callbacks/handlers/lunary": "callbacks/handlers/lunary",
// retrievers
"retrievers/amazon_kendra": "retrievers/amazon_kendra",
"retrievers/amazon_knowledge_base": "retrievers/amazon_knowledge_base",
"retrievers/chaindesk": "retrievers/chaindesk",
"retrievers/databerry": "retrievers/databerry",
"retrievers/metal": "retrievers/metal",
@@ -276,6 +277,7 @@ const requiresOptionalDependency = [
"chat_models/iflytek_xinghuo",
"chat_models/iflytek_xinghuo/web",
"retrievers/amazon_kendra",
"retrievers/amazon_knowledge_base",
"retrievers/metal",
"retrievers/supabase",
"retrievers/vectara_summary",
1 change: 1 addition & 0 deletions libs/langchain-community/src/load/import_constants.ts
@@ -81,6 +81,7 @@ export const optionalImportEntrypoints = [
"langchain_community/callbacks/handlers/llmonitor",
"langchain_community/callbacks/handlers/lunary",
"langchain_community/retrievers/amazon_kendra",
"langchain_community/retrievers/amazon_knowledge_base",
"langchain_community/retrievers/metal",
"langchain_community/retrievers/supabase",
"langchain_community/retrievers/vectara_summary",
3 changes: 3 additions & 0 deletions libs/langchain-community/src/load/import_type.d.ts
@@ -241,6 +241,9 @@ export interface OptionalImportMap {
"@langchain/community/retrievers/amazon_kendra"?:
| typeof import("../retrievers/amazon_kendra.js")
| Promise<typeof import("../retrievers/amazon_kendra.js")>;
"@langchain/community/retrievers/amazon_knowledge_base"?:
| typeof import("../retrievers/amazon_knowledge_base.js")
| Promise<typeof import("../retrievers/amazon_knowledge_base.js")>;
"@langchain/community/retrievers/metal"?:
| typeof import("../retrievers/metal.js")
| Promise<typeof import("../retrievers/metal.js")>;
113 changes: 113 additions & 0 deletions libs/langchain-community/src/retrievers/amazon_knowledge_base.ts
@@ -0,0 +1,113 @@
import {
  RetrieveCommand,
  BedrockAgentRuntimeClient,
  BedrockAgentRuntimeClientConfig,
} from "@aws-sdk/client-bedrock-agent-runtime";

import { BaseRetriever } from "@langchain/core/retrievers";
import { Document } from "@langchain/core/documents";

/**
 * Interface for the arguments required to initialize an
 * AmazonKnowledgeBaseRetriever instance.
 */
export interface AmazonKnowledgeBaseRetrieverArgs {
  knowledgeBaseId: string;
  // Optional: the constructor falls back to 10 when this is omitted.
  topK?: number;
  region: string;
  clientOptions?: BedrockAgentRuntimeClientConfig;
}

/**
 * Class for interacting with Amazon Bedrock Knowledge Bases, a RAG workflow oriented service
 * provided by AWS. Extends the BaseRetriever class.
 * @example
 * ```typescript
 * const retriever = new AmazonKnowledgeBaseRetriever({
 *   topK: 10,
 *   knowledgeBaseId: "YOUR_KNOWLEDGE_BASE_ID",
 *   region: "us-east-2",
 *   clientOptions: {
 *     credentials: {
 *       accessKeyId: "YOUR_ACCESS_KEY_ID",
 *       secretAccessKey: "YOUR_SECRET_ACCESS_KEY",
 *     },
 *   },
 * });
 *
 * const docs = await retriever.getRelevantDocuments("How are clouds formed?");
 * ```
 */
export class AmazonKnowledgeBaseRetriever extends BaseRetriever {
  static lc_name() {
    return "AmazonKnowledgeBaseRetriever";
  }

  lc_namespace = ["langchain", "retrievers", "amazon_bedrock_knowledge_base"];

  knowledgeBaseId: string;

  topK: number;

  bedrockAgentRuntimeClient: BedrockAgentRuntimeClient;

  constructor({
    knowledgeBaseId,
    topK = 10,
    clientOptions,
    region,
  }: AmazonKnowledgeBaseRetrieverArgs) {
    super();

    this.topK = topK;
    this.bedrockAgentRuntimeClient = new BedrockAgentRuntimeClient({
      region,
      ...clientOptions,
    });
    this.knowledgeBaseId = knowledgeBaseId;
  }

  /**
   * Cleans the result text by replacing sequences of whitespace with a
   * single space and removing ellipses.
   * @param resText The result text to clean.
   * @returns The cleaned result text.
   */
  cleanResult(resText: string) {
    const res = resText.replace(/\s+/g, " ").replace(/\.\.\./g, "");
    return res;
  }

  async queryKnowledgeBase(query: string, topK: number) {
    const retrieveCommand = new RetrieveCommand({
      knowledgeBaseId: this.knowledgeBaseId,
      retrievalQuery: {
        text: query,
      },
      retrievalConfiguration: {
        vectorSearchConfiguration: {
          numberOfResults: topK,
        },
      },
    });

    const retrieveResponse = await this.bedrockAgentRuntimeClient.send(
      retrieveCommand
    );

    return (
      retrieveResponse.retrievalResults?.map((result) => ({
        pageContent: this.cleanResult(result.content?.text || ""),
        metadata: {
          source: result.location?.s3Location?.uri,
          score: result.score,
        },
      })) ?? ([] as Array<Document>)
    );
  }

  async _getRelevantDocuments(query: string): Promise<Document[]> {
    const docs = await this.queryKnowledgeBase(query, this.topK);
    return docs;
  }
}
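
To make the behavior of `cleanResult` concrete, here is a standalone sketch that copies its two `replace` calls into a free function and runs it on made-up strings. Because whitespace is collapsed before ellipses are removed, an ellipsis surrounded by spaces leaves two adjacent spaces behind. `cleanResultTight` is a hypothetical variant, not part of this PR, that swaps the order and trims the result.

```typescript
// Same logic as the cleanResult method above: collapse whitespace runs,
// then delete every literal "..." sequence.
function cleanResult(resText: string): string {
  return resText.replace(/\s+/g, " ").replace(/\.\.\./g, "");
}

// Hypothetical alternative: strip ellipses first, then collapse whitespace
// and trim, so no doubled spaces remain where an ellipsis used to be.
function cleanResultTight(resText: string): string {
  return resText.replace(/\.\.\./g, "").replace(/\s+/g, " ").trim();
}

const multiline = "Clouds\n\tform...  slowly";
const spacedEllipsis = "Water vapor ... condenses";

// JSON.stringify makes the doubled space visible in the console output.
console.log(JSON.stringify(cleanResult(multiline))); // "Clouds form slowly"
console.log(JSON.stringify(cleanResult(spacedEllipsis))); // "Water vapor  condenses"
console.log(JSON.stringify(cleanResultTight(spacedEllipsis))); // "Water vapor condenses"
```

Whether the doubled space matters depends on how downstream prompts consume `pageContent`; the variant is only one possible way to tidy it.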
@@ -18,5 +18,7 @@ test.skip("AmazonKendraRetriever", async () => {

  const docs = await retriever.getRelevantDocuments("How are clouds formed?");

  expect(docs.length).toBeGreaterThan(0);

  console.log(docs);
});
@@ -0,0 +1,22 @@
/* eslint-disable no-process-env */
/* eslint-disable @typescript-eslint/no-non-null-assertion */
import { test, expect } from "@jest/globals";
import { AmazonKnowledgeBaseRetriever } from "../amazon_knowledge_base.js";

test("AmazonKnowledgeBaseRetriever", async () => {
  const retriever = new AmazonKnowledgeBaseRetriever({
    topK: 10,
    knowledgeBaseId: process.env.AMAZON_KNOWLEDGE_BASE_ID || "",
    region: process.env.AWS_REGION || "us-east-1",
    clientOptions: {
      credentials: {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
        sessionToken: process.env.AWS_SESSION_TOKEN!,
      },
    },
  });

  const docs = await retriever.getRelevantDocuments("How are clouds formed?");
  expect(docs.length).toBeGreaterThan(0);
});
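
The integration test above needs live AWS credentials and a real knowledge base. As a sketch of how the response-to-document mapping could also be checked offline, the snippet below pulls that mapping into a pure function and feeds it a hand-written response. The `RetrievalResultLike` and `RetrieveResponseLike` interfaces are trimmed-down, hypothetical stand-ins that mirror only the fields `queryKnowledgeBase` reads; `toDocuments` and the sample data are likewise illustrative and not part of this PR.

```typescript
// Hypothetical, reduced shapes covering only the fields the retriever reads.
interface RetrievalResultLike {
  content?: { text?: string };
  location?: { s3Location?: { uri?: string } };
  score?: number;
}

interface RetrieveResponseLike {
  retrievalResults?: RetrievalResultLike[];
}

// Pure version of the mapping in queryKnowledgeBase: no network access,
// so it can run against canned data.
function toDocuments(response: RetrieveResponseLike) {
  return (response.retrievalResults ?? []).map((result) => ({
    pageContent: (result.content?.text ?? "")
      .replace(/\s+/g, " ")
      .replace(/\.\.\./g, ""),
    metadata: {
      source: result.location?.s3Location?.uri,
      score: result.score,
    },
  }));
}

// Hand-written response; the second entry has no content or location.
const cannedResponse: RetrieveResponseLike = {
  retrievalResults: [
    {
      content: { text: "Clouds form\nwhen vapor condenses." },
      location: { s3Location: { uri: "s3://example-bucket/clouds.txt" } },
      score: 0.87,
    },
    { score: 0.31 },
  ],
};

const mapped = toDocuments(cannedResponse);
const empty = toDocuments({});
console.log(mapped, empty.length);
```

With a function like this, a Jest test could assert on `mapped` directly, leaving the credentialed test to cover only the call to AWS.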