Adding the multimodal RAG tutorial with Amazon Nova and LangChain #305
@debnsuma
This is a great addition, thanks for submitting this. I added a few suggestions to simplify some of the code and structure.
"source": [
"<h2 style=\"background: linear-gradient(to right, #ff6b6b, #4ecdc4, #1e90ff); \n",
" color: white; \n",
" padding: 15px; \n",
" border-radius: 10px; \n",
" text-align: center; \n",
" font-family: 'Comic Sans MS', cursive, sans-serif; \n",
" text-shadow: 2px 2px 4px rgba(0,0,0,0.5);\">\n",
" Data Loading\n",
"</h2>"
]
I'd suggest using Markdown headers instead of HTML elements, to keep the notebook simple and consistent.
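If it helps, the heading cells could be converted in bulk rather than by hand. Below is a minimal sketch, assuming each styled heading lives alone in its own markdown cell, as in the hunks quoted in this review. The function name `html_headings_to_markdown` and the sample cell are made up for illustration and are not part of the PR.

```python
import re

# Matches a cell whose entire source is one <h2 ...>Title</h2> element.
# [^>]* spans newlines, so a multi-line style attribute is covered.
STYLED_H2 = re.compile(
    r"^\s*<h2\b[^>]*>(?P<title>.*?)</h2>\s*$",
    re.DOTALL | re.IGNORECASE,
)


def html_headings_to_markdown(notebook: dict) -> int:
    """Rewrite styled <h2> markdown cells as '## Title'; return the number changed."""
    changed = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # nbformat allows "source" to be a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        match = STYLED_H2.match(text)
        if match:
            # Collapse the whitespace and newlines around the title text.
            title = " ".join(match.group("title").split())
            cell["source"] = [f"## {title}"]
            changed += 1
    return changed


# Hypothetical in-memory notebook with one styled heading cell.
sample = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<h2 style=\"color: white; \n",
                "           padding: 15px;\">\n",
                "    Data Loading\n",
                "</h2>",
            ],
        }
    ]
}
converted = html_headings_to_markdown(sample)
print(converted, sample["cells"][0]["source"])
```

To apply it to the real notebook, load the `.ipynb` with `json.load`, call the function, and write the result back with `json.dump`.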
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"<h2 style=\"background: linear-gradient(to right, #ff6b6b, #4ecdc4, #1e90ff); \n",
" color: white; \n",
" padding: 15px; \n",
" border-radius: 10px; \n",
" text-align: center; \n",
" font-family: 'Comic Sans MS', cursive, sans-serif; \n",
" text-shadow: 2px 2px 4px rgba(0,0,0,0.5);\">\n",
" Data Extraction\n",
"</h2>"
]
},
Replace with a markdown heading.
"overlap= 200\n",
"\n",
"# Process chunks with LangChain's RecursiveCharacterTextSplitter\n",
"text_splitter = RecursiveCharacterTextSplitter(\n",
nit: Formatting seems a bit off here.
"image_save_dir = \"data/processed_images\"\n",
"text_save_dir = \"data/processed_text\"\n",
"table_save_dir = \"data/processed_tables\"\n",
"page_images_save_dir = \"data/processed_page_images\"\n",
nit: Can probably simplify and shorten the names here.
- "image_save_dir = \"data/processed_images\"\n",
- "text_save_dir = \"data/processed_text\"\n",
- "table_save_dir = \"data/processed_tables\"\n",
- "page_images_save_dir = \"data/processed_page_images\"\n",
+ "images_dir = \"data/images\"\n",
+ "texts_dir = \"data/texts\"\n",
+ "tables_dir = \"data/tables\"\n",
+ "page_images_dir = \"data/page_images\"\n",
" page = doc[page_num]\n",
" text = page.get_text()\n",
"\n",
" # Step 1: Get/extract all TABLES in the curremt page and store \n",
Slight typo here.
- " # Step 1: Get/extract all TABLES in the curremt page and store \n",
+ " # Step 1: Get/extract all TABLES in the current page and store \n",
" response = client.invoke_model(\n",
" modelId=model_id,\n",
" body=json.dumps(body),\n",
" accept=\"application/json\",\n",
" contentType=\"application/json\"\n",
" )\n",
Can BedrockEmbeddings be used here instead of invoking boto3 directly? Also, the embeddings appear to be generated outside the vector store; in a RAG app this is usually handled by passing the documents straight to the vector store:
vector_store.add_documents(all_splits)
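To make the suggestion concrete, here is a rough sketch of what that could look like for the text chunks. Several things in it are assumptions rather than what the notebook does: the helper name `build_text_index`, the model ID `amazon.titan-embed-text-v2:0`, the region, and the use of `InMemoryVectorStore` as a stand-in for whichever store the notebook uses. `BedrockEmbeddings` embeds text only, so image embeddings might still need a separate path.

```python
def build_text_index(all_splits,
                     model_id="amazon.titan-embed-text-v2:0",
                     region_name="us-east-1"):
    """Index text chunks by handing them to the vector store (illustrative helper)."""
    # Imported inside the function so the sketch can be loaded even where
    # langchain-aws / langchain-core are not installed.
    from langchain_aws import BedrockEmbeddings
    from langchain_core.vectorstores import InMemoryVectorStore

    # Assumed model ID and region; substitute the ones the notebook uses.
    embeddings = BedrockEmbeddings(model_id=model_id, region_name=region_name)

    # Stand-in store for illustration; the notebook may use a different one.
    vector_store = InMemoryVectorStore(embeddings)

    # The store calls embeddings.embed_documents() itself, so no manual
    # invoke_model call or JSON body handling is needed for text chunks.
    vector_store.add_documents(all_splits)
    return vector_store
```

With AWS credentials configured, `build_text_index(all_splits).similarity_search("some query", k=3)` would then return the closest chunks as `Document` objects.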
"cell_type": "markdown", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"<h2 style=\"background: linear-gradient(to right, #ff6b6b, #4ecdc4, #1e90ff); \n", | ||
" color: white; \n", | ||
" padding: 15px; \n", | ||
" border-radius: 10px; \n", | ||
" text-align: center; \n", | ||
" font-family: 'Comic Sans MS', cursive, sans-serif; \n", | ||
" text-shadow: 2px 2px 4px rgba(0,0,0,0.5);\">\n", | ||
" Creating Vector Database/Index\n", | ||
"</h2>" | ||
] | ||
}, |
Update to markdown header.
"# Generating RAG response with Amazon Nova\n",
"def invoke_nova_multimodal(prompt, matched_items):\n",
" \"\"\"\n",
" Invoke the Amazon Nova model using langchain-aws.\n",
- " Invoke the Amazon Nova model using langchain-aws.\n",
+ " Invoke the Amazon Nova model.\n",
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"<h2 style=\"background: linear-gradient(to right, #ff6b6b, #4ecdc4, #1e90ff); \n",
" color: white; \n",
" padding: 15px; \n",
" border-radius: 10px; \n",
" text-align: center; \n",
" font-family: 'Comic Sans MS', cursive, sans-serif; \n",
" text-shadow: 2px 2px 4px rgba(0,0,0,0.5);\">\n",
" Test the RAG Pipeline\n",
"</h2>"
]
},
Replace with markdown header.
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2 style=\"background: linear-gradient(to right, #ff6b6b, #4ecdc4, #1e90ff); \n",
" color: white; \n",
" padding: 15px; \n",
" border-radius: 10px; \n",
" text-align: center; \n",
" font-family: 'Comic Sans MS', cursive, sans-serif; \n",
" text-shadow: 2px 2px 4px rgba(0,0,0,0.5);\">\n",
" Thank you!\n",
"</h2>"
]
}
],
Replace with a markdown header.
fixing the notebook based on code review for pr#305
Thanks so much @3coins for all your input. I addressed every comment, refactored the notebook, and pushed the changes.
This notebook demonstrates how to implement a multimodal Retrieval-Augmented Generation (RAG) system using Amazon Bedrock with Amazon Nova and LangChain. Many documents contain a mixture of content types, including text and images. Traditional text-only RAG applications often lose the valuable information captured in images. With the emergence of Multimodal Large Language Models (MLLMs), we can now leverage both text and image data in our RAG systems.