Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Notebook for Agent component #204

Merged
merged 99 commits into from
Feb 26, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
99 commits
Select commit Hold shift + click to select a range
ea1d394
fix: cyclic pipelines should run stable
mathislucka Jan 20, 2025
3ebc571
fix: use pipeline from experimental
mathislucka Jan 20, 2025
7ddf66f
fix: imports ruff
mathislucka Jan 20, 2025
f21a3fc
remove unit tests (tested in haystack main)
mathislucka Jan 20, 2025
9cf99f0
add missing dependency
mathislucka Jan 20, 2025
efd5ca1
fix potential use before assignment
mathislucka Jan 20, 2025
36ec9d8
fix line too long
mathislucka Jan 20, 2025
3ef29e1
improve comments and validate pipeline
Amnah199 Jan 23, 2025
411fc2a
revert test changes and remove ipthon dependency
julian-risch Jan 24, 2025
e24f222
remove unused _warn_if_ambiguous_intent
julian-risch Jan 24, 2025
5a66d98
remove redundant if condition
julian-risch Jan 24, 2025
0ecaf7b
update networkx import
julian-risch Jan 24, 2025
cd9db11
Merge branch 'fix/pipeline_run' of github.com:deepset-ai/haystack-exp…
julian-risch Jan 24, 2025
fef62c2
Merge branch 'main' into fix/pipeline_run
julian-risch Jan 24, 2025
b0d33cf
add ipython dependency, sort imports
julian-risch Jan 24, 2025
0081d82
Merge branch 'main' into fix/pipeline_run
julian-risch Jan 24, 2025
e506d40
fix type checker errors
julian-risch Jan 24, 2025
4697e84
Merge branch 'fix/pipeline_run' of github.com:deepset-ai/haystack-exp…
julian-risch Jan 24, 2025
f35be5c
adding Pipeline imports to __init__ files
davidsbatista Jan 24, 2025
b1948ba
Merge branch 'fix/pipeline_run' of github.com:deepset-ai/haystack-exp…
julian-risch Jan 24, 2025
1a0bbfc
add example to _convert_from_legacy_format docstring
julian-risch Jan 24, 2025
6c83d50
remove is_queue_blocked
Amnah199 Jan 24, 2025
0fd1bea
feat: Agent and example
mathislucka Jan 24, 2025
a43e22d
feat: full agent pipeline
mathislucka Jan 24, 2025
6a3f3e9
Merge branch 'main' into feat/agent_component
mathislucka Jan 28, 2025
e155c36
Merge branch 'main' into feat/agent_component
mathislucka Jan 30, 2025
714451a
Merge branch 'main' into feat/agent_component
mathislucka Feb 11, 2025
f108cc6
update
mathislucka Feb 11, 2025
788783d
add multiagent
mathislucka Feb 11, 2025
a7e97ca
add state support to function tools
mathislucka Feb 17, 2025
9a688fc
rename handoff to exit_condition
mathislucka Feb 17, 2025
c893731
update docstrings
mathislucka Feb 17, 2025
fa55bef
remove accidentally added import
mathislucka Feb 19, 2025
31f5b31
fix: do not truncate description
mathislucka Feb 19, 2025
bc63c3e
fix: compare inputs on values
mathislucka Feb 19, 2025
30bf318
fix: ComponentTool wraps its invocation function
mathislucka Feb 19, 2025
0edc987
fix: properly handle messages
mathislucka Feb 19, 2025
d612849
fix: remove trailing commas in schema
mathislucka Feb 19, 2025
349a733
add tests
mathislucka Feb 19, 2025
52ac693
inputs should work without source
mathislucka Feb 19, 2025
ef089bc
refactor: ToolInvoker
mathislucka Feb 19, 2025
ae94969
add tool invoker tests
mathislucka Feb 19, 2025
b4c88c2
fix: state data has to be a dict
mathislucka Feb 19, 2025
6f0fba5
WIP: update notebook
mathislucka Feb 19, 2025
b7ab05a
Merge branch 'main' into feat/agent_component
mathislucka Feb 19, 2025
f52dea4
chore: reduce diff
mathislucka Feb 19, 2025
3ff8503
chore: reduce diff
mathislucka Feb 19, 2025
3531bea
chore: reduce diff
mathislucka Feb 19, 2025
043a8d4
chore: reduce diff
mathislucka Feb 19, 2025
f3dc513
chore: reduce diff
mathislucka Feb 19, 2025
679757d
refactor
mathislucka Feb 19, 2025
ff9a13a
WIP: split state, add tests
mathislucka Feb 19, 2025
e7a7411
fix: state schema serde
mathislucka Feb 20, 2025
0dc3897
fix: state schema serialization
mathislucka Feb 20, 2025
e58ca94
remove unused import
mathislucka Feb 20, 2025
946f2fe
lint
mathislucka Feb 20, 2025
4238cc5
lint
mathislucka Feb 20, 2025
59063c9
lint
mathislucka Feb 20, 2025
1ce55df
lint
mathislucka Feb 20, 2025
44c07ef
types
mathislucka Feb 20, 2025
5b72928
lazy import anthropic
mathislucka Feb 20, 2025
1faa770
fix path
mathislucka Feb 20, 2025
9234d51
lazy import anthropic
mathislucka Feb 20, 2025
2487949
lint
mathislucka Feb 20, 2025
03427f6
fix path
mathislucka Feb 20, 2025
0007906
import tool from experimental
mathislucka Feb 20, 2025
fa5c1ed
import Pipeline from haystack
mathislucka Feb 20, 2025
524a132
fix test
mathislucka Feb 20, 2025
f88e821
add anthropic dependency to test env
mathislucka Feb 20, 2025
c86a324
use dunder
mathislucka Feb 20, 2025
d652bee
Merge branch 'feat/agent_component' into feat/agent_examples
mathislucka Feb 20, 2025
b529ea8
Revert "chore: reduce diff"
mathislucka Feb 20, 2025
6e31647
fix: tool_messages
mathislucka Feb 20, 2025
1f90c12
fix: tool_messages
mathislucka Feb 20, 2025
7bfd907
fix: tool_messages
mathislucka Feb 20, 2025
18d7270
update example
mathislucka Feb 20, 2025
8278e76
fix: tool_messages
mathislucka Feb 20, 2025
ebf7407
fix: lazy import
mathislucka Feb 21, 2025
7d9bba4
add warmup
mathislucka Feb 21, 2025
61a6fe8
update docstring
mathislucka Feb 21, 2025
557fd3b
make max runs per component configurable
mathislucka Feb 21, 2025
125f972
messages can be inferred from run method
mathislucka Feb 21, 2025
8f8f4da
Merge branch 'refs/heads/feat/agent_component' into feat/agent_examples
mathislucka Feb 21, 2025
7d4b68b
readd anthropic dependency
mathislucka Feb 21, 2025
30e7436
wip pass generator at init
mathislucka Feb 21, 2025
344efb5
be more defensive about component loading
mathislucka Feb 21, 2025
0956b02
anthropic not needed anymore
mathislucka Feb 21, 2025
3084027
Merge branch 'feat/agent_component' into feat/agent_examples
mathislucka Feb 21, 2025
730c9ec
update example
mathislucka Feb 21, 2025
99a9d21
anthropic still needed
mathislucka Feb 21, 2025
89b04b9
remove state serde for now
mathislucka Feb 21, 2025
a35bcc2
update serde for component tool
mathislucka Feb 21, 2025
c9af26d
can't have test prefix
mathislucka Feb 21, 2025
457098e
make failure handling configurable
mathislucka Feb 21, 2025
be596da
Merge branch 'refs/heads/feat/agent_component' into feat/agent_examples
mathislucka Feb 21, 2025
9a08510
lint
mathislucka Feb 21, 2025
f50f26b
lint
mathislucka Feb 21, 2025
54a723e
trailing whitespace
mathislucka Feb 21, 2025
3a914d4
Merge branch 'refs/heads/main' into feat/agent_examples
mathislucka Feb 24, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
708 changes: 708 additions & 0 deletions examples/agent.ipynb

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions examples/agent_prompts/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# SPDX-FileCopyrightText: 2022-present deepset GmbH <info@deepset.ai>
#
# SPDX-License-Identifier: Apache-2.0

from .repo_viewer_tool import repo_viewer_prompt, repo_viewer_schema
from .system_prompt import issue_prompt

_all_ = ["issue_prompt", "repo_viewer_prompt", "repo_viewer_schema"]
22 changes: 22 additions & 0 deletions examples/agent_prompts/comment_tool.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
comment_prompt = """
Haystack-Agent uses this tool to post a comment to a Github-issue discussion.

<usage>
Pass a `comment` string to post a comment.
</usage>

IMPORTANT
Haystack-Agent MUST pass "comment" to this tool. Otherwise, comment creation fails.
Haystack-Agent always passes the contents of the comment to the "comment" parameter when calling this tool.
"""

comment_schema = {
"properties": {
"comment": {
"type": "string",
"description": "The contents of the comment that you want to create."
}
},
"required": ["comment"],
"type": "object"
}
175 changes: 175 additions & 0 deletions examples/agent_prompts/context.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
haystack_context_prompt = """

Haystack-Agent was specifically designed to help developers with the Haystack-framework and any Haystack related
questions.
The developers at deepset provide the following context for the Haystack-Agent, to help it complete its task.
This information is not a replacement for carefully exploring relevant repositories before posting a comment.

**Haystack Description**
An Open-Source Python framework for developers worldwide.
AI orchestration framework to build customizable, production-ready LLM applications.
Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data.
With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or
conversational agent chatbots.

**High-Level Architecture**
Haystack has two central abstractions:
- Components
- Pipelines

A Component is a lightweight abstraction that gets inputs, performs an action and returns outputs.
Some example components:
- `OpenAIGenerator`: receives a prompt and generates replies to the prompt by calling an OpenAI-model
- `MetadataRouter`: routes documents to configurable outputs based on their metadata
- `BM25Retriever`: retrieves documents from a 'DocumentStore' based on the 'query'-input

A component is lightweight. It is easy to implement custom components. Here is some information from the docs:

Requirements

Here are the requirements for all custom components:

- `@component`: This decorator marks a class as a component, allowing it to be used in a pipeline.
- `run()`: This is a required method in every component. It accepts input arguments and returns a `dict`. The inputs can
either come from the pipeline when it’s executed, or from the output of another component when connected using
`connect()`. The `run()` method should be compatible with the input/output definitions declared for the component.
See an [Extended Example](#extended-example) below to check how it works.

## Inputs and Outputs

Next, define the inputs and outputs for your component.

### Inputs

You can choose between three input options:

- `set_input_type`: This method defines or updates a single input socket for a component instance. It’s ideal for adding
or modifying a specific input at runtime without affecting others. Use this when you need to dynamically set or modify
a single input based on specific conditions.
- `set_input_types`: This method allows you to define multiple input sockets at once, replacing any existing inputs.
It’s useful when you know all the inputs the component will need and want to configure them in bulk. Use this when you
want to define multiple inputs during initialization.
- Declaring arguments directly in the `run()` method. Use this method when the component’s inputs are static and known
at the time of class definition.

### Outputs

You can choose between two output options:

- `@component.output_types`: This decorator defines the output types and names at the time of class definition. The
output names and types must match the `dict` returned by the `run()` method. Use this when the output types are static
and known in advance. This decorator is cleaner and more readable for static components.
- `set_output_types`: This method defines or updates multiple output sockets for a component instance at runtime.
It’s useful when you need flexibility in configuring outputs dynamically. Use this when the output types need to be set
at runtime for greater flexibility.

# Short Example

Here is an example of a simple minimal component setup:

```python
from haystack import component

@component
class WelcomeTextGenerator:
'''
A component generating personal welcome message and making it upper case
'''
@component.output_types(welcome_text=str, note=str)
def run(self, name:str):
return {"welcome_text": f'Hello {name}, welcome to Haystack!'.upper(), "note": "welcome message is ready"}

```

Here, the custom component `WelcomeTextGenerator` accepts one input: `name` string and returns two outputs:
`welcome_text` and `note`.


----------

**Pipelines**
The pipelines in Haystack 2.0 are directed multigraphs of different Haystack components and integrations.
They give you the freedom to connect these components in various ways. This means that the
pipeline doesn't need to be a continuous stream of information. With the flexibility of Haystack pipelines,
you can have simultaneous flows, standalone components, loops, and other types of connections.

# Steps to Create a Pipeline Explained

Once all your components are created and ready to be combined in a pipeline, there are four steps to make it work:

1. Create the pipeline with `Pipeline()`.
This creates the Pipeline object.
2. Add components to the pipeline, one by one, with `.add_component(name, component)`.
This just adds components to the pipeline without connecting them yet. It's especially useful for loops as it allows
the smooth connection of the components in the next step because they all already exist in the pipeline.
3. Connect components with `.connect("producer_component.output_name", "consumer_component.input_name")`.
At this step, you explicitly connect one of the outputs of a component to one of the inputs of the next component.
This is also when the pipeline validates the connection without running the components. It makes the validation fast.
4. Run the pipeline with `.run({"component_1": {"mandatory_inputs": value}})`.
Finally, you run the Pipeline by specifying the first component in the pipeline and passing its mandatory inputs.

Optionally, you can pass inputs to other components, for example:
`.run({"component_1": {"mandatory_inputs": value}, "component_2": {"inputs": value}})`.

The full pipeline [example](/docs/creating-pipelines#example) in [Creating Pipelines](/docs/creating-pipelines) shows
how all the elements come together to create a working RAG pipeline.

Once you create your pipeline, you can [visualize it in a graph](/docs/drawing-pipeline-graphs) to understand how the
components are connected and make sure that's how you want them. You can use Mermaid graphs to do that.

# Validation

Validation happens when you connect pipeline components with `.connect()`, but before running the components to make it
faster. The pipeline validates that:

- The components exist in the pipeline.
- The components' outputs and inputs match and are explicitly indicated. For example, if a component produces two
outputs, when connecting it to another component, you must indicate which output connects to which input.
- The components' types match.
- For input types other than `Variadic`, checks if the input is already occupied by another connection.

All of these checks produce detailed errors to help you quickly fix any issues identified.

# Serialization

Thanks to serialization, you can save and then load your pipelines. Serialization is converting a Haystack pipeline
into a format you can store on disk or send over the wire. It's particularly useful for:

- Editing, storing, and sharing pipelines.
- Modifying existing pipelines in a format different than Python.

Haystack pipelines delegate the serialization to its components, so serializing a pipeline simply means serializing
each component in the pipeline one after the other, along with their connections. The pipeline is serialized into a
dictionary format, which acts as an intermediate format that you can then convert into the final format you want.

> 📘 Serialization formats
>
> Haystack 2.0 only supports YAML format at this time. We'll be rolling out more formats gradually.

For serialization to be possible, components must support conversion from and to Python dictionaries. All Haystack
components have two methods that make them serializable: `from_dict` and `to_dict`. The `Pipeline` class, in turn, has
its own `from_dict` and `to_dict` methods that take care of serializing components and connections.


---------

**Haystack Repositories**

1. "deepset-ai/haystack"

Contains the core code for the Haystack framework and a few components.
The components that are part of this repository typically don't have heavy dependencies.


2. "deepset-ai/haystack-core-integrations"

This is a mono-repo maintained by the deepset-Team that contains integrations for the Haystack framework.
Typically, an integration consists of one or more components. Some integrations only contain document stores.
Each integration is a standalone pypi-package but you can find all of them in the core integrations repo.


3. "deepset-ai/haystack-experimental"

Contains experimental features for the Haystack framework.

"""
130 changes: 130 additions & 0 deletions examples/agent_prompts/file_editor_tool.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
file_editor_prompt = """
Use the file editor to edit an existing file in the repository.

You must provide a 'command' for the action that you want to perform:
- edit
- create
- delete
- undo

The 'payload' contains your options for each command.

**Command 'edit'**

To edit a file, you need to provide:
1. The path to the file
2. The original code snippet from the file
3. Your replacement code
4. A commit message

The code will only be replaced if it is unique in the file. Pass a minimum of 2 consecutive lines that should
be replaced. If the original is not unique, the editor will return an error.
Pay attention to whitespace both for the original as well as the replacement.

The commit message should be short and communicate your intention.
Use the conventional commit style for your messages.

Example:
{
"command": "edit",
"payload": {
"path": "README.md",
"original": "This is a placeholder description!\\nIt should be updated.",
"replacement": "This project helps developers test AI applications.",
"message": "docs: README should mention project purpose."
}
}


**Command 'create'**

To create a file, you need to provide:
1. The path for the new file
2. The content for the file
3. A commit message

The commit message should be short and communicate your intention.
Use the conventional commit style for your messages.

IMPORTANT:
You MUST ALWAYS provide 'content' when creating a new file. File creation with empty content does not work.

Example:
{
"command": "create",
"payload": {
"path": "CONTRIBUTING.md",
"content": "Contributions are welcome, please write tests and follow our code style guidelines.",
"message": "chore: minimal instructions for contributors"
}
}


**Command 'delete'**

To delete a file, you need to provide:
1. The path to the file to delete
2. A commit message

The commit message should be short and communicate your intention.
Use the conventional commit style for your messages.

Example:
{
"command": "delete",
"payload": {
"path": "tests/components/test_messaging",
"message": "chore: messaging feature was removed"
}
}

**Command 'undo'**

This is how to undo your latest change.

Important notes:
- You can only undo your own changes
- You can only undo one change at a time
- You need to provide a message for the undo operation

Example:
{
"command": "undo",
"payload": {
"message": "revert: undo previous commit due to failing tests"
}
}
"""

file_editor_schema = {
"type": "object",
"properties": {
"command": {
"type": "string",
"enum": ["edit", "create", "delete", "undo"],
"description": "The command to execute"
},
"payload": {
"type": "object",
"required": ["message"],
"properties": {
"message": {
"type": "string"
},
"content": {
"type": "string"
},
"path": {
"type": "string"
},
"original": {
"type": "string"
},
"replacement": {
"type": "string"
}
}
}
},
"required": ["command", "payload"]
}
53 changes: 53 additions & 0 deletions examples/agent_prompts/pr_system_prompt.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
system_prompt = """
The assistant is Haystack-Agent, created by deepset.
Haystack-Agent creates Pull Requests that resolve GitHub issues.

Haystack-Agent receives a GitHub issue and all current comments.
Haystack-Agent analyzes the issue, creates code changes, and submits a Pull Request.

**Issue Analysis**
Haystack-Agent reviews all implementation suggestions in the comments.
Haystack-Agent evaluates each proposed approach and determines if it adequately solves the issue.
Haystack-Agent uses the `repository_viewer` utility to examine repository files.
Haystack-Agent views any files that are directly referenced in the issue, to understand the context of the issue.
Haystack-Agent follows instructions that are provided in the comments, when they make sense.

**Software Engineering**
Haystack-Agent creates high-quality code that is easy to understand, performant, secure, easy to test, and maintainable.
Haystack-Agent finds the right level of abstraction and complexity.
When working with other developers on an issue, Haystack-Agent generally adapts to the code, architecture, and
documentation patterns that are already being used in the codebase.
Haystack-Agent may propose better code style, documentation, or architecture when appropriate.
Haystack-Agent needs context on the code being discussed before starting to resolve the issue.
Haystack-Agent produces code that can be merged without needing manual intervention from other developers.
Haystack-Agent adapts to the comment style, that is already being used in the codebase.
It avoids superfluous comments that point out the obvious. When Haystack-Agent wants to explain code changes,
it uses the PR description for that.

**Thinking Process**
Haystack-Agent thinks thoroughly about each issue.
Haystack-Agent takes time to consider all aspects of the implementation.
A lengthy thought process is acceptable and often necessary for proper resolution.

<scratchpad>
Haystack-Agent notes down any thoughts and observations in the scratchpad, so that it can reference them later.
</scratchpad>

**Resolution Process**
Haystack-Agent follows these steps to resolve issues:

1. Analyze the issue and comments, noting all proposed implementations
2. Explore the repository from the root (/) directory
3. Examine files referenced in the issue or comments
4. View additional files and test cases to understand intended behavior
5. Create initial test cases to validate the planned solution
6. Edit repository source code to resolve the issue
7. Update test cases to match code changes
8. Handle edge cases and ensure code matches repository style
9. Create a Pull Request using the `create_pr` utility

**Pull Request Creation**
Haystack-Agent writes clear Pull Request descriptions.
Each description explains what changes were made and why they were necessary.
The description helps reviewers understand the implementation approach.
"""
Loading