chore: Add initial documentation files and configuration #3126

Merged
merged 8 commits into from
Sep 9, 2024
28 changes: 9 additions & 19 deletions .github/workflows/release-please-core.yml
Original file line number Diff line number Diff line change
Expand Up @@ -43,23 +43,13 @@ jobs:
working-directory: backend/core
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v3
- name: Install Rye
uses: eifinger/setup-rye@v2
with:
python-version: '3.11'
- name: Check working directory
run: pwd
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install poetry
- name: Install project dependencies
run: poetry install
- name: Build package
run: poetry build
- name: Publish package
uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
packages_dir: backend/core/dist
enable-cache: true
- name: Rye Sync
run: UV_INDEX_STRATEGY=unsafe-first-match rye sync --no-lock
- name: Rye Build
run: rye build
- name: Rye Publish
run: rye publish --token ${{ secrets.PYPI_API_TOKEN }} --yes
11 changes: 11 additions & 0 deletions backend/core/quivr_core/chat.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,11 @@


class ChatHistory:
"""
ChatHistory is an ordered list of ChatMessage objects.
It stores the messages exchanged in a single chat.
"""

def __init__(self, chat_id: UUID, brain_id: UUID | None) -> None:
self.id = chat_id
self.brain_id = brain_id
Expand All @@ -30,6 +35,9 @@ def __len__(self):
def append(
self, langchain_msg: AIMessage | HumanMessage, metadata: dict[str, Any] = {}
):
"""
Append a message to the chat history.
"""
chat_msg = ChatMessage(
chat_id=self.id,
message_id=uuid4(),
Expand All @@ -41,6 +49,9 @@ def append(
self._msgs.append(chat_msg)

def iter_pairs(self) -> Generator[Tuple[HumanMessage, AIMessage], None, None]:
"""
Iterate over the chat history as pairs of HumanMessage and AIMessage.
"""
# Reverse the chat_history, newest first
it = iter(self.get_chat_history(newest_first=True))
for ai_message, human_message in zip(it, it):
Expand Down
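The `zip(it, it)` idiom in `iter_pairs` can be illustrated with a minimal, standalone sketch. It uses plain strings in place of `HumanMessage` and `AIMessage`, and it assumes the history strictly alternates human and AI messages. The function below is a hypothetical stand-in and not the quivr_core implementation, whose loop body is elided in this diff.

```python
from typing import Iterator, List, Tuple


def iter_pairs(messages_oldest_first: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (human, ai) pairs, newest pair first (illustrative stand-in)."""
    # Reverse the history so the newest message comes first.
    it = iter(reversed(messages_oldest_first))
    # zip(it, it) draws two consecutive items from the SAME iterator on each
    # step, so every step produces (ai_message, human_message).
    for ai_message, human_message in zip(it, it):
        # Assumption: pairs are handed back in (human, ai) order, matching
        # the Tuple[HumanMessage, AIMessage] return annotation.
        yield human_message, ai_message


history = ["Q1", "A1", "Q2", "A2"]
pairs = list(iter_pairs(history))
print(pairs)
```

Because both arguments to `zip` are the same iterator, no index arithmetic is needed. The pairing is only meaningful when the number of messages is even and the roles alternate.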
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,19 @@


class MegaparseProcessor(ProcessorBase):
"""
Megaparse processor for PDF files.

It can be used to parse PDF files and split them into chunks.

It comes from the megaparse library.

## Installation
```bash
pip install megaparse
```
"""
supported_extensions = [FileExtension.pdf]

def __init__(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,11 @@ def recursive_character_splitter(


class SimpleTxtProcessor(ProcessorBase):
"""
SimpleTxtProcessor implements the ProcessorBase interface.
It processes plain-text (.txt) files with the Simple Txt parser.
"""

supported_extensions = [FileExtension.txt]

def __init__(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,16 @@


class TikaProcessor(ProcessorBase):
"""
TikaProcessor implements the ProcessorBase interface.
It processes PDF files by sending them to a Tika server.

To run a Tika server locally with Docker:
```bash
docker run -d -p 9998:9998 apache/tika
```
"""

supported_extensions = [FileExtension.pdf]

def __init__(
Expand Down
7 changes: 7 additions & 0 deletions backend/core/quivr_core/processor/splitter.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,5 +2,12 @@


class SplitterConfig(BaseModel):
"""
This class is used to configure the chunking of the documents.

Chunk size is the number of characters in the chunk.
Chunk overlap is the number of characters that the chunk will overlap with the previous chunk.
"""

chunk_size: int = 400
chunk_overlap: int = 100
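To make the relationship between `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window chunking with overlap. The helper `split_with_overlap` is hypothetical and is not part of quivr_core; the real processors appear to rely on a recursive character splitter, which also takes separators into account. Only the defaults (400 and 100) come from `SplitterConfig`.

```python
from typing import List


def split_with_overlap(
    text: str, chunk_size: int = 400, chunk_overlap: int = 100
) -> List[str]:
    """Split text into windows of chunk_size characters that overlap
    the previous window by chunk_overlap characters (naive sketch)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


text = "".join(str(i % 10) for i in range(1000))
chunks = split_with_overlap(text)
print(len(chunks), [len(c) for c in chunks])
```

With the defaults, consecutive chunks start 300 characters apart, so the last 100 characters of one chunk are repeated as the first 100 characters of the next.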
16 changes: 16 additions & 0 deletions backend/core/quivr_core/quivr_rag.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,10 @@ def compress_documents(


class QuivrQARAG:
"""
QuivrQA RAG is a class that provides a RAG interface to the QuivrQA system.
"""

def __init__(
self,
*,
Expand All @@ -60,6 +64,9 @@ def __init__(

@property
def retriever(self):
"""
Returns a retriever that fetches documents from the vector store.
"""
return self.vector_store.as_retriever()

def filter_history(
Expand Down Expand Up @@ -92,6 +99,9 @@ def filter_history(
return filtered_chat_history[::-1]

def build_chain(self, files: str):
"""
Builds the chain for the QuivrQA RAG.
"""
compression_retriever = ContextualCompressionRetriever(
base_compressor=self.reranker, base_retriever=self.retriever
)
Expand Down Expand Up @@ -149,6 +159,9 @@ def answer(
list_files: list[QuivrKnowledge],
metadata: dict[str, str] = {},
) -> ParsedRAGResponse:
"""
Answers a question using the QuivrQA RAG synchronously.
"""
concat_list_files = format_file_list(list_files, self.rag_config.max_files)
conversational_qa_chain = self.build_chain(concat_list_files)
raw_llm_response = conversational_qa_chain.invoke(
Expand All @@ -169,6 +182,9 @@ async def answer_astream(
list_files: list[QuivrKnowledge],
metadata: dict[str, str] = {},
) -> AsyncGenerator[ParsedRAGChunkResponse, ParsedRAGChunkResponse]:
"""
Answers a question using the QuivrQA RAG asynchronously, streaming the response in chunks.
"""
concat_list_files = format_file_list(list_files, self.rag_config.max_files)
conversational_qa_chain = self.build_chain(concat_list_files)

Expand Down
10 changes: 10 additions & 0 deletions backend/docs/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# python generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# venv
.venv
1 change: 1 addition & 0 deletions backend/docs/.python-version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
3.11.9
3 changes: 3 additions & 0 deletions backend/docs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# docs

Describe your project here.
50 changes: 50 additions & 0 deletions backend/docs/docs/css/style.css
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
.md-container .jp-Cell-outputWrapper .jp-OutputPrompt.jp-OutputArea-prompt,
.md-container .jp-Cell-inputWrapper .jp-InputPrompt.jp-InputArea-prompt {
display: none !important;
}

/* CSS styles for side-by-side layout */
.container {
display: flex;
flex-direction: column;
justify-content: space-between;
margin-bottom: 20px;
/* Adjust spacing between sections */
position: sticky;
top: 2.4rem;
z-index: 1000;
/* Ensure it's above other content */
background-color: white;
/* Match your page background */
padding: 0.2rem;
}

.example-heading {
margin: 0.2rem !important;
}

.usage-examples {
width: 100%;
/* Adjust the width as needed */
border: 1px solid var(--md-default-fg-color--light);
border-radius: 2px;
padding: 0.2rem;
}

/* Additional styling for the toggle */
.toggle-example {
cursor: pointer;
color: white;
text-decoration: underline;
background-color: var(--md-primary-fg-color);
padding: 0.2rem;
border-radius: 2px;
}

.hidden {
display: none;
}

/* mendable search styling */
#my-component-root>div {
bottom: 100px;
}
41 changes: 41 additions & 0 deletions backend/docs/docs/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# Welcome to Quivr Documentation

Welcome to the documentation of Quivr! This is the place where you'll find help, guidance and support for collaborative software development. Whether you're involved in an open-source community or a large software team, these resources should get you up and running quickly!

[Quivr](https://quivr.app) is your **Second Brain** that can act as your **personal assistant**. Quivr is a platform for creating AI assistants, referred to as "Brains". These assistants are designed with specialized capabilities. Some connect to specific data sources, allowing users to interact directly with the data. Others serve as specialized tools for particular use cases, powered by RAG technology. These tools process specific inputs to generate practical outputs, such as summaries, translations, and more.

## Quick Links

- [Video Installation](https://dub.sh/quivr-demo)

!!! note
**Our goal** is to make Quivr the **best personal assistant** that is powered by your knowledge and your applications 🔥

## What does it do?

<div style="text-align: center;">
<video width="640" height="480" controls>
<source src="https://quivr-cms.s3.eu-west-3.amazonaws.com/singlestore_demo_quivr_232893659c.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</div>

## How to get started? 👀

!!! tip
It takes less than **5 seconds** to get started with Quivr. You can even use your Google account to sign up.

1. **Create an account**: Go to [Quivr](https://quivr.app) and create an account.
2. **Create a Brain**: Let us guide you to create your first brain!
3. **Feed this Brain**: Add documentation and/or URLs to feed your brain.
4. **Ask Questions to your Brain**: Ask your Brain questions about the knowledge that you provide.

## Empowering Innovation with Foundation Models & Generative AI

As a Leader in AI, Quivr leverages Foundation Models and Generative AI to empower businesses to achieve gains through Innovation.

- 50k+ users
- 6k+ companies
- 35k+ GitHub stars
- Top 100 open-source

5 changes: 5 additions & 0 deletions backend/docs/docs/installation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
## Installation

```bash
pip install quivr-core
```
3 changes: 3 additions & 0 deletions backend/docs/docs/parsers/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@

Quivr provides a suite of parsers to extract structured data from various sources.

5 changes: 5 additions & 0 deletions backend/docs/docs/parsers/megaparse.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
## Megaparse

::: quivr_core.processor.implementations.megaparse_processor
options:
heading_level: 2
5 changes: 5 additions & 0 deletions backend/docs/docs/parsers/simple.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
## Simple Txt

::: quivr_core.processor.implementations.simple_txt_processor
options:
heading_level: 2
62 changes: 62 additions & 0 deletions backend/docs/mkdocs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
site_name: Quivr
extra_css:
- css/style.css

markdown_extensions:
- attr_list
- admonition
- pymdownx.details
- pymdownx.superfences
- md_in_html
- toc:
permalink: "#"

theme:
custom_dir: overrides
features:
- navigation.instant
- navigation.tabs
- navigation.indexes
- navigation.top
- navigation.footer
- toc.follow
- content.code.copy
- search.suggest
- search.highlight
name: material
palette:
- media: (prefers-color-scheme)
toggle:
icon: material/brightness-auto
name: Switch to light mode
- accent: purple
media: "(prefers-color-scheme: light)"
primary: white
scheme: default
toggle:
icon: material/brightness-7
name: Switch to dark mode
- accent: purple
media: "(prefers-color-scheme: dark)"
primary: black
scheme: slate
toggle:
icon: material/brightness-4
name: Switch to system preference

plugins:
- mkdocstrings:
default_handler: python


nav:
- Home:
- index.md
- installation.md
- Features:
- Parsers:
- parsers/index.md
- parsers/megaparse.md
- parsers/simple.md
- Enterprise: https://docs.quivr.app/
