This project implements a Mixture of Agents (MoA) model, an approach that leverages multiple Large Language Models (LLMs) to enhance reasoning and language generation capabilities. The implementation is based on the paper "Mixture-of-Agents Enhances Large Language Model Capabilities" by Wang et al. (2024). It differs from the paper in that it builds the MoA from Gemini 1.5 Pro, GPT-4o, and Claude 3.5 Sonnet rather than from open-source models.
Our MoA implementation utilizes a multi-layer architecture with multiple LLM agents in each layer. The current setup includes:
- Three LLM agents:
  - GPT-4o (OpenAI)
  - Claude 3.5 Sonnet (Anthropic)
  - Gemini 1.5 Pro (Google)
- Multiple processing layers (configurable; default is 2, maximum recommended is 3)
- Specialized roles for synthesis and final output generation
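For illustration, the three providers might be wired up roughly as follows. The model identifiers and the `query_*` helper names are assumptions for this sketch, not the actual code in `moa_model.py`:

```python
# Sketch only: illustrative wiring of the three agents. Model IDs and the
# query_* helpers are assumptions; see moa_model.py for the project's real code.
import os

import anthropic
import google.generativeai as genai
from openai import OpenAI

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
anthropic_client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
gemini_model = genai.GenerativeModel("gemini-1.5-pro")


def query_gpt4(prompt: str) -> str:
    resp = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def query_claude(prompt: str) -> str:
    resp = anthropic_client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text


def query_gemini(prompt: str) -> str:
    return gemini_model.generate_content(prompt).text
```

The diagram below shows how these agents are arranged across the layers.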
```mermaid
graph TD
    A[User Input] --> B[Layer 1]
    B --> C[GPT-4 Agent]
    B --> D[Claude Agent]
    B --> E[Gemini Agent]
    C --> F[Aggregation & Peer Review]
    D --> F
    E --> F
    F --> G[Gemini Synthesis]
    G --> H[Layer 2]
    H --> I[GPT-4 Agent]
    H --> J[Claude Agent]
    H --> K[Gemini Agent]
    I --> L[Aggregation & Peer Review]
    J --> L
    K --> L
    L --> M[Gemini Synthesis]
    M --> N[Claude Final Output]
    N --> O[Final Response]
```
- The user input is fed into the first layer.
- In each layer:
  - All agents process the input simultaneously.
  - Each agent then reviews and aggregates all responses, including its own, with enhanced critical analysis.
  - Gemini synthesizes the aggregated responses and provides a devil's advocate perspective.
  - The synthesized output becomes the input for the next layer.
- This process repeats through all layers.
- Claude generates the final output based on all layer syntheses, performing a thorough cross-check against the original prompt.
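In code, this control flow can be sketched roughly as shown below. The prompt wording and the `run_moa` function are illustrative assumptions (reusing the hypothetical `query_*` helpers from the earlier sketch); the actual logic lives in `moa_model.py`:

```python
# Sketch of the layered MoA flow described above. query_gpt4 / query_claude /
# query_gemini are the hypothetical helpers from the earlier sketch.
def run_moa(user_prompt: str, num_layers: int = 2) -> str:
    agents = [query_gpt4, query_claude, query_gemini]
    current_input = user_prompt
    layer_syntheses = []

    for _ in range(num_layers):
        # 1. All agents answer the current input independently.
        responses = [agent(current_input) for agent in agents]

        # 2. Each agent reviews and aggregates all responses (peer review).
        review_prompt = (
            f"Question: {current_input}\n\nCandidate answers:\n"
            + "\n---\n".join(responses)
            + "\n\nCritically review these answers, challenge their assumptions, "
              "verify any calculations, and produce an improved aggregated answer."
        )
        aggregated = [agent(review_prompt) for agent in agents]

        # 3. Gemini synthesizes the aggregated answers and plays devil's advocate.
        synthesis = query_gemini(
            "Synthesize the following answers into one response, then argue "
            "against it as a devil's advocate:\n" + "\n---\n".join(aggregated)
        )
        layer_syntheses.append(synthesis)

        # 4. The synthesis becomes the input to the next layer.
        current_input = synthesis

    # 5. Claude produces the final answer, cross-checking against the original prompt.
    final_prompt = (
        f"Original question: {user_prompt}\n\nLayer syntheses:\n"
        + "\n---\n".join(layer_syntheses)
        + "\n\nWrite the final answer, cross-checking it against the original question."
    )
    return query_claude(final_prompt)
```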
- Enhanced Aggregation: Each agent now performs a more rigorous analysis, including assumption challenging, mathematical verification, and peer review.
- Devil's Advocate: The synthesis step now includes an aggressive devil's advocate perspective to challenge prevailing answers.
- Logic Tree Exploration: Agents are instructed to explore multiple interpretations using logic trees.
- Final Cross-Check: The final output generation includes a thorough cross-check against the original prompt.
- Detailed Markdown Logging: The system now generates comprehensive markdown logs of the entire process.
- Specialized Roles: We use Gemini specifically for synthesis and Claude for final output, leveraging their unique strengths.
- Enhanced Critical Analysis: Our implementation includes more rigorous peer review and assumption challenging at each stage.
- Devil's Advocate Perspective: We've added a dedicated step to critically challenge the prevailing answers (an illustrative prompt sketch follows this list).
- Flexible Layer Configuration: Users can choose the number of layers, with recommendations for optimal performance.
- Comprehensive Logging: Our system provides detailed, structured logs of the entire reasoning process.
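As an illustration of the synthesis and devil's advocate step, a hypothetical prompt builder could look like the following. The wording is an assumption, not the project's exact prompt:

```python
# Hypothetical prompt template for the Gemini synthesis / devil's advocate step.
def build_synthesis_prompt(original_prompt: str, aggregated_responses: list[str]) -> str:
    joined = "\n\n---\n\n".join(aggregated_responses)
    return (
        f"Original question:\n{original_prompt}\n\n"
        f"Aggregated agent responses:\n{joined}\n\n"
        "Synthesize these responses into a single answer. Then act as an "
        "aggressive devil's advocate: challenge the prevailing answer, question "
        "its assumptions, verify any arithmetic, and explore alternative "
        "interpretations with a logic tree before stating your conclusion."
    )
```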
**Color-Coded CLI Output**

The CLI displays color-coded outputs for each stage of the process, enhancing readability and understanding of the workflow.
**Full Text Display**

The CLI shows the full text of each agent's response at each layer, providing a comprehensive view of the reasoning process.
**Markdown Report Generation**

After each interaction, a detailed Markdown report is generated, containing:
- The original prompt
- Full responses from each agent at each layer
- Synthesis and devil's advocate perspectives
- The final response
This report is useful for in-depth analysis of the MoA process and for sharing results.
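A minimal sketch of how such a report could be assembled is shown below; the data layout, function name, and output path are assumptions (the actual implementation writes its report to `moa_report.html`, as described under Usage):

```python
# Hypothetical report writer; the actual log format in moa_model.py may differ.
from datetime import datetime


def write_markdown_report(prompt, layers, final_response, path="moa_report.md"):
    # Assumed structure: `layers` is a list of dicts, each with per-agent
    # "responses" and a "synthesis" string.
    lines = [f"# MoA Report ({datetime.now():%Y-%m-%d %H:%M})", "", "## Original Prompt", prompt, ""]
    for i, layer in enumerate(layers, start=1):
        lines.append(f"## Layer {i}")
        for agent_name, response in layer["responses"].items():
            lines += [f"### {agent_name}", response, ""]
        lines += ["### Synthesis & Devil's Advocate", layer["synthesis"], ""]
    lines += ["## Final Response", final_response, ""]
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))
```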
Follow these steps to set up the project:
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/moa-implementation.git
  cd moa-implementation
  ```
- Ensure you have Docker installed on your system.
- Create a `.env` file in the project root directory with your API keys:

  ```
  OPENAI_API_KEY=your_openai_api_key
  ANTHROPIC_API_KEY=your_anthropic_api_key
  GOOGLE_API_KEY=your_google_api_key
  ```
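The Docker run command below passes these keys into the container with `--env-file`, so the code can read them from the environment. If you run the model outside Docker, one common approach (an assumption, not necessarily what `moa_model.py` does) is to load the file with `python-dotenv`:

```python
# Assumption: loading .env for local (non-Docker) runs with python-dotenv.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory into the process environment
openai_key = os.environ["OPENAI_API_KEY"]
```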
To run the MoA model:
- Build the Docker image:

  ```bash
  docker build -t moa_project .
  ```

- Run the Docker container:

  ```bash
  docker run -it --env-file .env -v "$(pwd)":/app moa_project
  ```
This will start an interactive session where you can enter prompts and receive responses from the MoA model.
After each interaction, you will see:
- Color-coded intermediate outputs from each agent in the CLI.
- A final synthesized response in the CLI.
- An HTML report (`moa_report.html`) in your current directory with detailed outputs.
- `moa_model.py`: The main implementation of the Mixture of Agents model.
- `Dockerfile`: Instructions for building the Docker image.
- `environment.yml`: Conda environment specification.
- `requirements.txt`: List of Python package dependencies.
- `.env`: (You need to create this) Contains your API keys.
- `README.md`: This file, containing project information and instructions.
The current implementation provides a basic structure for the MoA model. You can extend it by:
- Adding more diverse LLM agents.
- Implementing more sophisticated routing based on task type or model strengths.
- Experimenting with different aggregation methods.
- Adjusting the number of layers or layer compositions.
- Implementing error handling and rate limiting for API calls (see the retry sketch after this list).
- Optimizing performance with more advanced parallel processing techniques.
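For example, the error handling and rate limiting mentioned above could start with a simple retry-with-exponential-backoff wrapper (a sketch, not part of the current implementation):

```python
# Sketch: retry an API call with exponential backoff on transient failures.
import time


def call_with_retries(fn, *args, max_retries=5, base_delay=1.0, **kwargs):
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:  # narrow to provider-specific errors in practice
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"API call failed ({exc}); retrying in {delay:.1f}s...")
            time.sleep(delay)
```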
Wang, J., Wang, J., Athiwaratkun, B., Zhang, C., & Zou, J. (2024). Mixture-of-Agents Enhances Large Language Model Capabilities. arXiv preprint arXiv:2406.04692v1.