Description:
Implement a new type of judge called Autojudge, which can dynamically generate an evaluation rubric based on a labeled dataset and associated feedback. This feature will automate the creation of task-specific LLM evaluation rubrics. Autojudge will enable users to leverage labeled data and feedback to fine-tune or guide LLM-based evaluators effectively.
Proposed Workflow:
Input:
- A labeled dataset containing the following fields (an illustrative record is sketched after this list):
  - `input_text`: the user prompt or query.
  - `completion`: the AI-generated response.
  - `label`: a binary evaluation (1 for acceptable, 0 for unacceptable).
  - `feedback`: a detailed explanation of why a response is unacceptable (mandatory when `label=0`).
- A `task_description` providing context for rubric generation.
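For illustration, dataset records matching this schema might look like the following. The field names come from the schema above; the list-of-dicts representation and the example values are assumptions, not a required format:

```python
# Illustrative records following the schema above; values are made up for the
# customer-service empathy use case described later in this issue.
dataset = [
    {
        "input_text": "My package arrived broken. What can you do?",
        "completion": "I'm so sorry to hear that. We'll ship a replacement today at no cost.",
        "label": 1,
        "feedback": None,  # feedback is only mandatory when label=0
    },
    {
        "input_text": "My package arrived broken. What can you do?",
        "completion": "Shipping damage is not our responsibility.",
        "label": 0,
        "feedback": "Dismissive tone; does not acknowledge the customer's frustration or offer a remedy.",
    },
]

task_description = "Evaluate customer-service replies for empathy and helpfulness."
```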
Process:
- Use the input data to generate an evaluation rubric, taking into account the feedback attached to negatively labeled examples (one possible implementation is sketched after this list).
- The generated rubric should detail scoring criteria (e.g., factuality, relevance, tone) and decision rules.
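One way the rubric-generation step could work is to collect the feedback attached to rejected completions and ask an LLM to synthesize it into explicit criteria. The sketch below assumes the dataset format shown above and uses the OpenAI chat completions API purely as an example backend; the prompt wording and model choice are assumptions, not part of this proposal:

```python
from openai import OpenAI

client = OpenAI()

def generate_rubric(dataset: list[dict], task_description: str) -> str:
    """Sketch: synthesize grading criteria from labeled examples and their feedback."""
    # The feedback on rejected completions is the signal the rubric should turn
    # into explicit scoring criteria and decision rules.
    negative_feedback = [row["feedback"] for row in dataset if row["label"] == 0]
    prompt = (
        f"Task: {task_description}\n\n"
        "Reviewers rejected some responses for the following reasons:\n"
        + "\n".join(f"- {fb}" for fb in negative_feedback)
        + "\n\nWrite an evaluation rubric with scoring criteria (e.g. factuality, "
        "relevance, tone) and clear accept/reject decision rules."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```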
Output:
- A structured rubric that can be used by other judges in the library (a possible representation is sketched after this list).
- Optionally, structured feedback and evaluation metrics for the dataset (accuracy, precision, recall, etc.).
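The structured rubric and the optional metrics report could be represented roughly as below; the `Rubric` dataclass and the scikit-learn metric helpers are illustrative choices, not a proposed interface:

```python
from dataclasses import dataclass, field
from sklearn.metrics import accuracy_score, precision_score, recall_score

@dataclass
class Rubric:
    """Hypothetical structured rubric consumable by other judges."""
    task_description: str
    criteria: list[str] = field(default_factory=list)        # e.g. ["factuality", "relevance", "tone"]
    decision_rules: list[str] = field(default_factory=list)  # e.g. ["Reject replies that ignore the stated problem"]

def dataset_metrics(labels: list[int], predictions: list[int]) -> dict[str, float]:
    """Optional report comparing Autojudge verdicts against the human labels."""
    return {
        "accuracy": accuracy_score(labels, predictions),
        "precision": precision_score(labels, predictions),
        "recall": recall_score(labels, predictions),
    }
```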
Motivation:
- Automating rubric generation will streamline the development of evaluation workflows.
- It reduces the manual effort of crafting rubrics while keeping them consistent and adaptable to domain-specific tasks.
Example Use Case:
A user wants to evaluate AI-generated responses for empathy in customer service. They provide labeled examples of good and bad responses with feedback for improvement. Autojudge processes the data and generates an evaluation rubric focused on empathy-related criteria.
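For this use case, a hypothetical end-to-end call might look like the snippet below. The `Autojudge` constructor and `judge()` signature are assumptions based on the workflow above (a matching class skeleton is sketched under Tasks), not a confirmed API:

```python
# Hypothetical usage; `dataset` is the labeled empathy data shown earlier and
# Autojudge is the class sketched under Tasks below.
autojudge = Autojudge(
    dataset=dataset,
    task_description="Evaluate customer-service replies for empathy.",
)

judgment = autojudge.judge(
    input="My package arrived broken. What can you do?",
    output="Shipping damage is not our responsibility.",
)
print(judgment.score)      # e.g. 0 (unacceptable)
print(judgment.reasoning)  # explanation grounded in the generated empathy rubric
```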
Tasks:
- Create the `Autojudge` class (a minimal skeleton is sketched after this list) with:
  - methods for processing labeled data and generating rubrics;
  - integration with the existing judge architecture.
- Add unit tests to validate functionality.
- Update the documentation to include `Autojudge` under the Types of Judges section.
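A minimal skeleton of the class, reusing the `generate_rubric` helper and `client` from the sketch above, could look like this. The `Judgment` result type and the verdict parsing are placeholders; the real implementation would plug into whatever base judge class and structured-output handling the library already uses:

```python
from dataclasses import dataclass

@dataclass
class Judgment:
    """Placeholder result type mirroring what other judges return."""
    score: int
    reasoning: str

class Autojudge:  # in practice this would subclass the library's existing base judge
    def __init__(self, dataset: list[dict], task_description: str, model: str = "gpt-4o-mini"):
        self.model = model
        self.task_description = task_description
        # The rubric is generated once from the labeled data, then reused for every judgment.
        self.rubric = generate_rubric(dataset, task_description)

    def judge(self, input: str, output: str) -> Judgment:
        """Grade a single (input, output) pair against the generated rubric."""
        prompt = (
            f"Rubric:\n{self.rubric}\n\n"
            f"User input:\n{input}\n\nModel response:\n{output}\n\n"
            "Apply the rubric. Answer with a verdict on the first line "
            "(1 = acceptable, 0 = unacceptable) followed by a short justification."
        )
        response = client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        text = response.choices[0].message.content
        # Naive parsing for the sketch; a real implementation would use structured output.
        score = 1 if text.strip().startswith("1") else 0
        return Judgment(score=score, reasoning=text)
```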