[Docs] HCD and DSE with RAGStack (#582)
* initial-content
* add-langchain-hub
* dse-69-example
* typo
Showing 3 changed files with 149 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
= RAGStack and DataStax Enterprise (DSE) 6.9 example | ||
|
||
. Pull the dse-server 6.9.0-rc.2 Docker image, start a container, and confirm that the container is in a running state. | ||
+ | ||
[source,bash] | ||
---- | ||
docker pull datastax/dse-server:6.9.0-rc.2 | ||
docker run -e DS_LICENSE=accept -p 9042:9042 -d datastax/dse-server:6.9.0-rc.2 | ||
docker ps | ||
---- | ||
+ | ||
. Install dependencies. | ||
+ | ||
[source,bash] | ||
---- | ||
pip install ragstack-ai-langchain python-dotenv langchainhub | ||
---- | ||
+ | ||
. Create a `.env` file in the root directory of the project and add the following environment variables. | ||
+ | ||
[source,bash] | ||
---- | ||
OPENAI_API_KEY="sk-..." | ||
---- | ||
+ | ||
. Create a Python script to embed and generate the results of a query. | ||
+ | ||
include::examples:partial$hcd-quickstart.adoc[] | ||
+ | ||
You should see output like this: | ||
+ | ||
[source,plain] | ||
---- | ||
Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable. Techniques like Chain of Thought and Tree of Thoughts help models decompose hard tasks and enhance performance by thinking step by step. This process allows for a better interpretation of the model's thinking process and can involve various methods such as simple prompting, task-specific instructions, or human inputs. | ||
---- | ||
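
A container can report a running state some time before the database inside it accepts CQL connections, so the script may fail to connect if it is run immediately after `docker run`. The following is a minimal sketch, not part of the original example, of polling the mapped port until it opens. `wait_for_port` is a hypothetical helper name, and the timeout values are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that the port is open,
            # not that the database has finished initializing.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 9042)` before `cassio.init(...)` is one way to use it. An open port is a necessary condition rather than a guarantee of readiness, so a retry around the first query may still be useful.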
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
= RAGStack and Hyper Converged Database (HCD) example | ||
|
||
. Clone the HCD example repository. | ||
+ | ||
[source,bash] | ||
---- | ||
git clone git@github.com:datastax/astra-db-java.git | ||
cd astra-db-java | ||
---- | ||
+ | ||
. Start the containers with Docker Compose and confirm they are in a running state. | ||
+ | ||
[source,bash] | ||
---- | ||
docker compose up -d | ||
docker compose ps | ||
---- | ||
+ | ||
. Install dependencies. | ||
+ | ||
[source,bash] | ||
---- | ||
pip install ragstack-ai-langchain python-dotenv langchainhub | ||
---- | ||
+ | ||
. Create a `.env` file in the root directory of the project and add the following environment variables. | ||
+ | ||
[source,bash] | ||
---- | ||
OPENAI_API_KEY="sk-..." | ||
---- | ||
+ | ||
. Create a Python script to embed and generate the results of a query. | ||
+ | ||
include::examples:partial$hcd-quickstart.adoc[] | ||
+ | ||
You should see output like this: | ||
+ | ||
[source,plain] | ||
---- | ||
Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable. Techniques like Chain of Thought and Tree of Thoughts help models decompose hard tasks and enhance performance by thinking step by step. This process allows for a better interpretation of the model's thinking process and can involve various methods such as simple prompting, task-specific instructions, or human inputs. | ||
---- | ||
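
The script reads `OPENAI_API_KEY` from the `.env` file created in the steps above. If the variable is missing, the failure typically surfaces only when the OpenAI client is first used. A minimal sketch, assuming you want to fail earlier with a clearer message; `require_env` is a hypothetical helper and is not part of the original example.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

In the quickstart script, `require_env("OPENAI_API_KEY")` could replace the plain `os.getenv("OPENAI_API_KEY")` call, placed after `load_dotenv()`.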
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
.Python | ||
[%collapsible%open] | ||
==== | ||
[source,python] | ||
---- | ||
import os | ||
from dotenv import load_dotenv | ||
import bs4 | ||
from langchain import hub | ||
from langchain_openai import ChatOpenAI, OpenAIEmbeddings | ||
from langchain_community.document_loaders import WebBaseLoader | ||
from langchain_core.output_parsers import StrOutputParser | ||
from langchain_core.runnables import RunnablePassthrough | ||
from langchain_text_splitters import RecursiveCharacterTextSplitter | ||
import cassio | ||
from cassio.table import MetadataVectorCassandraTable | ||
from langchain_community.vectorstores import Cassandra | ||
# Load environment variables | ||
load_dotenv() | ||
openai_api_key = os.getenv("OPENAI_API_KEY") | ||
# Initialize Cassandra | ||
cassio.init(contact_points=['localhost'], username='cassandra', password='cassandra') | ||
cassio.config.resolve_session().execute( | ||
    "create keyspace if not exists my_vector_keyspace with replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};" | ||
) | ||
# Create metadata Vector Cassandra Table | ||
mvct = MetadataVectorCassandraTable(table='my_vector_table', vector_dimension=1536, keyspace='my_vector_keyspace') | ||
# Web loader configuration | ||
loader = WebBaseLoader( | ||
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",), | ||
    bs_kwargs=dict( | ||
        parse_only=bs4.SoupStrainer( | ||
            class_=("post-content", "post-title", "post-header") | ||
        ) | ||
    ), | ||
) | ||
docs = loader.load() | ||
# Document splitting | ||
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200) | ||
splits = text_splitter.split_documents(docs) | ||
# Vector store setup | ||
vectorstore = Cassandra.from_documents(documents=splits, embedding=OpenAIEmbeddings(), table_name='my_vector_table', keyspace='my_vector_keyspace', vector_dimension=1536) | ||
retriever = vectorstore.as_retriever() | ||
# Language model setup | ||
llm = ChatOpenAI() | ||
# Chain components | ||
def format_docs(docs): | ||
    return "\n\n".join(doc.page_content for doc in docs) | ||
rag_chain = ( | ||
    {"context": retriever | format_docs, "question": RunnablePassthrough()} | ||
    | hub.pull("rlm/rag-prompt") | ||
    | llm | ||
    | StrOutputParser() | ||
) | ||
# Invocation | ||
result = rag_chain.invoke("What is Task Decomposition?") | ||
print(result) | ||
---- | ||
==== |
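
The script above splits the page with `chunk_size=1000` and `chunk_overlap=200`. To illustrate what those two parameters mean, here is a simplified sliding-window sketch. It is an assumption-laden stand-in, not the algorithm `RecursiveCharacterTextSplitter` uses (that class also tries to split on separators such as paragraph and line breaks). `sliding_chunks` is a hypothetical name.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows where consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is already covered by the previous window's overlap.
    stop = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, stop, step)]


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

With the values from the script, a 2,500-character document would yield three windows starting at offsets 0, 800, and 1,600. The overlap means that a sentence straddling a chunk boundary is still likely to appear whole in at least one chunk, which tends to help retrieval.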