Skip to content

Latest commit

 

History

History
103 lines (90 loc) · 6.45 KB

README.md

File metadata and controls

103 lines (90 loc) · 6.45 KB

LLM Debate Argument Evaluator

This project implements a debate argument evaluation tool using Large Language Models (LLMs).

Project Structure

/llm_debate_argument_evaluator/
│
├── /main/
│   ├── main.py                             # Central orchestrator for user interactions, invoking services
│   ├── controller.py                       # Handles user requests, interacts with service layer
│   ├── dependency_injector.py              # Injects services and models into the system
│   └── user_interactions.py                # Encapsulates user commands like expand node or submit argument
│
├── /services/
│   ├── argument_generation_service.py      # Service layer for argument generation logic, ensuring argument variability across subcategories
│   ├── evaluation_service.py               # Coordinates evaluations across LLMs (e.g., ChatGPT, Claude)
│   ├── memoization_service.py              # Manages memoization and semantic caching
│   ├── priority_queue_service.py           # Manages BFS traversal and priority queue
│   ├── async_processing_service.py         # Handles asynchronous evaluations and processing
│   ├── score_aggregator_service.py         # Aggregates scores from multiple models (e.g., ChatGPT, Claude)
│   └── model_selection_service.py          # Dynamically selects and manages LLMs
│
├── /commands/
│   ├── expand_node_command.py              # Expands debate tree nodes
│   ├── submit_argument_command.py          # Handles user-submitted arguments
│   ├── generate_arguments_command.py       # Triggers argument generation with argument variability to capture diverse perspectives
│   └── evaluate_arguments_command.py       # Initiates argument evaluation
│
├── /config/
│   ├── logger_config.py              # Defines logging variables
│
├── /evaluation/
│   ├── model_factory.py                    # Initializes evaluation models and manages the instantiation and selection
│   ├── score_aggregator.py                 # Aggregates scores from multiple evaluation models
│   ├── /models/
│   │   ├── base_model.py                   # Abstract base class for LLM models
│   │   ├── chatgpt_model.py                # ChatGPT-specific implementation
│   │   ├── claude_model.py                 # Claude-specific implementation
│   │   └── model_injector.py               # Dynamically injects LLM models for evaluations
│   └── /api_clients/
│       ├── base_api_client.py              # Base API client for standardizing API interaction logic
│       ├── chatgpt_api_client.py           # API client handling ChatGPT API requests
│       └── claude_api_client.py            # API client handling Claude API requests
│
├── /memoization/
│   ├── semantic_similarity.py              # Calculates argument similarity using embeddings (e.g., Sentence-BERT)
│   └── cache_manager.py                    # Stores and retrieves cached evaluations
│
├── /debate_traversal/
│   ├── traversal_logic.py                  # Implements BFS traversal with priority queue
│   ├── priority_queue_manager.py           # Manages priority queue
│   └── traversal_injector.py               # Injects traversal services dynamically
│
├── /async_processing/
│   └── async_utils.py                      # Utility functions for async operations
│
├── /visualization/
│   ├── observer.py                         # Observer pattern for real-time updates
│   ├── tree_renderer.py                    # Renders debate tree with node structure representing arguments and branches for rebuttals
│   ├── node_expansion_handler.py           # Manages node expansion in the debate tree
│   ├── node_score_display.py               # Displays score breakdown for each node. Nodes are color-coded based on their evaluation scores
│   └── visualization_injector.py           # Injects visualization services dynamically
│
└── /utils/
    ├── constants.py                        # Stores constants, configuration values, thresholds
    ├── logger.py                          # Central logging mechanism
    └── dependency_registry.py              # Registers and manages dependency injection

Features

  • Argument generation with variability across subcategories
  • Evaluation of arguments using multiple LLMs (e.g., ChatGPT, Claude)
  • Memoization and semantic caching for efficient processing
  • Asynchronous evaluation and processing
  • Visualization of debate tree with color-coded nodes based on evaluation scores
  • Dynamic model selection and injection
  • Breadth-First Search (BFS) traversal with priority queue for debate exploration

Getting Started

Run 'pip install -r requirements.txt' to install the dependencies.

KEY

Setup env variables for CHATGPT_API_KEY and CHATGPT_API_ENDPOINT

CHATGPT_API_KEY = {secret} CHATGPT_API_ENDPOINT = https://api.openai.com/v1/chat/completions DEBUG_MODE = 1 for true (for debugging) MAX_TOKENS = 10 (Just seeing it functions)\

CLAUDE_API_KEY = {secret} CLAUDE_API_ENDPOINT = https = https://api.anthropic.com/v1/messages

EXAMPLE OUTPUT

  • Generated supporting argument 1: Access to abortion is essential for supporting women's bodily autonomy as it allows individuals to make decisions about their own bodies, health, and lives. Denying access to safe and legal abortion restricts women's ability to control their reproductive choices, infringing upon their fundamental right to autonomy and self-determination. By ensuring access to abortion, women can assert their bodily autonomy, have the freedom to make decisions regarding their own bodies, and take charge of their reproductive health without external interference.
  • Generated against argument 1: Unrestricted abortion access could potentially have negative implications for women's mental health by minimizing the decision-making process and overlooking the potential emotional consequences associated with terminating a pregnancy. Without ensuring thorough counseling and support services, women may experience feelings of guilt, regret, or psychological distress post-abortion. This lack of adequate mental health care support could lead to long-term emotional challenges and contribute to an increase in mental health issues among women who undergo abortions.