LLM Debate Argument Evaluator

This project implements a debate argument evaluation tool using Large Language Models (LLMs).

Project Structure

/llm_debate_argument_evaluator/
│
├── /main/
│   ├── main.py                             # Central orchestrator for user interactions, invoking services
│   ├── controller.py                       # Handles user requests, interacts with service layer
│   ├── dependency_injector.py              # Injects services and models into the system
│   └── user_interactions.py                # Encapsulates user commands like expand node or submit argument
│
├── /services/
│   ├── argument_generation_service.py      # Service layer for argument generation logic, ensuring argument variability across subcategories
│   ├── evaluation_service.py               # Coordinates evaluations across LLMs (e.g., ChatGPT, Claude)
│   ├── memoization_service.py              # Manages memoization and semantic caching
│   ├── priority_queue_service.py           # Manages BFS traversal and priority queue
│   ├── async_processing_service.py         # Handles asynchronous evaluations and processing
│   ├── score_aggregator_service.py         # Aggregates scores from multiple models (e.g., ChatGPT, Claude)
│   └── model_selection_service.py          # Dynamically selects and manages LLMs
│
├── /commands/
│   ├── expand_node_command.py              # Expands debate tree nodes
│   ├── submit_argument_command.py          # Handles user-submitted arguments
│   ├── generate_arguments_command.py       # Triggers argument generation with argument variability to capture diverse perspectives
│   └── evaluate_arguments_command.py       # Initiates argument evaluation
│
├── /config/
│   └── logger_config.py                    # Defines logging variables
│
├── /evaluation/
│   ├── model_factory.py                    # Initializes evaluation models and manages their instantiation and selection
│   ├── score_aggregator.py                 # Aggregates scores from multiple evaluation models
│   ├── /models/
│   │   ├── base_model.py                   # Abstract base class for LLM models
│   │   ├── chatgpt_model.py                # ChatGPT-specific implementation
│   │   ├── claude_model.py                 # Claude-specific implementation
│   │   └── model_injector.py               # Dynamically injects LLM models for evaluations
│   └── /api_clients/
│       ├── base_api_client.py              # Base API client for standardizing API interaction logic
│       ├── chatgpt_api_client.py           # API client handling ChatGPT API requests
│       └── claude_api_client.py            # API client handling Claude API requests
│
├── /memoization/
│   ├── semantic_similarity.py              # Calculates argument similarity using embeddings (e.g., Sentence-BERT)
│   └── cache_manager.py                    # Stores and retrieves cached evaluations
│
├── /debate_traversal/
│   ├── traversal_logic.py                  # Implements BFS traversal with priority queue
│   ├── priority_queue_manager.py           # Manages priority queue
│   └── traversal_injector.py               # Injects traversal services dynamically
│
├── /async_processing/
│   └── async_utils.py                      # Utility functions for async operations
│
├── /visualization/
│   ├── observer.py                         # Observer pattern for real-time updates
│   ├── tree_renderer.py                    # Renders debate tree with node structure representing arguments and branches for rebuttals
│   ├── node_expansion_handler.py           # Manages node expansion in the debate tree
│   ├── node_score_display.py               # Displays score breakdown for each node. Nodes are color-coded based on their evaluation scores
│   └── visualization_injector.py           # Injects visualization services dynamically
│
└── /utils/
    ├── constants.py                        # Stores constants, configuration values, thresholds
    ├── logger.py                           # Central logging mechanism
    └── dependency_registry.py              # Registers and manages dependency injection
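
The layout above separates the per-model wrappers (evaluation/models) from score aggregation and asynchronous processing. A minimal sketch of how those pieces might fit together is shown below. All names here (`BaseModel.evaluate`, `StubModel`, `aggregate_scores`) are hypothetical, and averaging is only an assumed aggregation rule; this README does not show the repository's actual interfaces.

```python
import asyncio
from abc import ABC, abstractmethod
from statistics import mean


class BaseModel(ABC):
    """Hypothetical interface; the real base_model.py may differ."""

    @abstractmethod
    async def evaluate(self, argument: str) -> float:
        """Return a numeric score for the argument."""


class StubModel(BaseModel):
    """Stand-in for a ChatGPT or Claude wrapper; returns a fixed score."""

    def __init__(self, name: str, fixed_score: float) -> None:
        self.name = name
        self.fixed_score = fixed_score

    async def evaluate(self, argument: str) -> float:
        await asyncio.sleep(0)  # placeholder for a real API call
        return self.fixed_score


async def aggregate_scores(models: list[BaseModel], argument: str) -> float:
    """Evaluate with all models concurrently, then average (assumed rule)."""
    scores = await asyncio.gather(*(m.evaluate(argument) for m in models))
    return mean(scores)


if __name__ == "__main__":
    models = [StubModel("chatgpt", 0.6), StubModel("claude", 0.8)]
    print(asyncio.run(aggregate_scores(models, "Example argument")))
```

Keeping the models behind one abstract interface is what would let a factory or injector swap LLMs without touching the aggregation code.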

Features

Argument generation with variability across subcategories
Evaluation of arguments using multiple LLMs (e.g., ChatGPT, Claude)
Memoization and semantic caching for efficient processing
Asynchronous evaluation and processing
Visualization of debate tree with color-coded nodes based on evaluation scores
Dynamic model selection and injection
Breadth-First Search (BFS) traversal with priority queue for debate exploration
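
As an illustration of the last feature, the sketch below shows one way BFS and a priority queue can be combined: nodes are visited level by level, and within a level the highest-scoring node comes first. The `Node` fields and the `traverse` function are assumptions made for this example, not the project's actual traversal code.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical debate-tree node; field names are assumptions."""
    text: str
    score: float
    children: list["Node"] = field(default_factory=list)


def traverse(root: Node, max_nodes: int = 10) -> list[str]:
    """Visit nodes level by level, highest score first within a level."""
    order: list[str] = []
    counter = 0  # tie-breaker so heapq never compares Node objects
    # Heap entries are (depth, negated score, tie-breaker, node).
    frontier = [(0, -root.score, counter, root)]
    while frontier and len(order) < max_nodes:
        depth, _, _, node = heapq.heappop(frontier)
        order.append(node.text)
        for child in node.children:
            counter += 1
            heapq.heappush(frontier, (depth + 1, -child.score, counter, child))
    return order


if __name__ == "__main__":
    a = Node("A", 0.4, [Node("A1", 0.95)])
    b = Node("B", 0.9, [Node("B1", 0.1)])
    root = Node("root", 0.5, [a, b])
    print(traverse(root))
```

Sorting on depth first preserves the breadth-first order; the negated score turns Python's min-heap into a highest-score-first queue inside each level.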

Getting Started

Run 'pip install -r requirements.txt' to install the dependencies.

KEY

Set up the following environment variables for the ChatGPT and Claude APIs:

CHATGPT_API_KEY = {secret}
CHATGPT_API_ENDPOINT = https://api.openai.com/v1/chat/completions
DEBUG_MODE = 1 (enables debugging)
MAX_TOKENS = 10 (a low value, just to check that it functions)

CLAUDE_API_KEY = {secret}
CLAUDE_API_ENDPOINT = https://api.anthropic.com/v1/messages
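
A minimal sketch of how these variables could be read and validated at startup follows. The `load_settings` helper and its defaults are assumptions for illustration; the project may load its configuration differently.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read the variables listed above; raise if a required one is missing."""
    required = [
        "CHATGPT_API_KEY",
        "CHATGPT_API_ENDPOINT",
        "CLAUDE_API_KEY",
        "CLAUDE_API_ENDPOINT",
    ]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    settings = {name: env[name] for name in required}
    # Assumed defaults: debugging off, and the MAX_TOKENS value of 10 shown above.
    settings["DEBUG_MODE"] = env.get("DEBUG_MODE", "0") == "1"
    settings["MAX_TOKENS"] = int(env.get("MAX_TOKENS", "10"))
    return settings
```

Failing fast on a missing key gives a clearer error than a rejected API request later in the run.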

EXAMPLE OUTPUT

Generated supporting argument 1: Access to abortion is essential for supporting women's bodily autonomy as it allows individuals to make decisions about their own bodies, health, and lives. Denying access to safe and legal abortion restricts women's ability to control their reproductive choices, infringing upon their fundamental right to autonomy and self-determination. By ensuring access to abortion, women can assert their bodily autonomy, have the freedom to make decisions regarding their own bodies, and take charge of their reproductive health without external interference.
Generated against argument 1: Unrestricted abortion access could potentially have negative implications for women's mental health by minimizing the decision-making process and overlooking the potential emotional consequences associated with terminating a pregnancy. Without ensuring thorough counseling and support services, women may experience feelings of guilt, regret, or psychological distress post-abortion. This lack of adequate mental health care support could lead to long-term emotional challenges and contribute to an increase in mental health issues among women who undergo abortions.
