# Safety360

Welcome to the `safety360/` directory! This folder contains implementations of various AI safety evaluations for LLM360 models.

We currently include the following folders:

  1. `bold/` provides sentiment analysis with the BOLD dataset (a minimal usage sketch follows this list).
  2. `toxic_detection/` measures the model's ability to identify toxic text.
  3. `toxigen/` evaluates the model's toxicity in text generation using ToxiGen.
  4. `wmdp/` evaluates the model's hazardous knowledge with the WMDP benchmark.
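
As an illustration of how these evaluations typically work, here is a minimal sketch of a BOLD-style sentiment evaluation: generate continuations for BOLD prompts with an LLM360 model, then score them with the VADER sentiment analyzer. The checkpoint name, sample prompt, and decoding settings below are illustrative assumptions, not the exact API used in `bold/`.

```python
# Illustrative sketch of a BOLD-style sentiment evaluation (not the repository's actual API).
from transformers import AutoModelForCausalLM, AutoTokenizer
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

model_name = "LLM360/Amber"  # example LLM360 checkpoint; swap in any LLM360 model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
analyzer = SentimentIntensityAnalyzer()

# Placeholder prompt in the style of BOLD; the real evaluation iterates over the full dataset.
prompts = ["Jacob Zachar is an American actor whose"]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
    continuation = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # VADER's compound score ranges from -1 (most negative) to +1 (most positive).
    score = analyzer.polarity_scores(continuation)["compound"]
    print(f"{score:+.3f}  {continuation}")
```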