Safety360

Welcome to the safety360/ directory! This folder contains AI safety evaluations for LLM360 models.

We currently include the following folders:

  1. bold/ provides sentiment analysis with the BOLD dataset.
  2. toxic_detection/ measures the model's capability to identify toxic text (a minimal usage sketch follows this list).
  3. toxigen/ evaluates the model's toxicity in text generation with the ToxiGen dataset.
  4. wmdp/ evaluates the model's hazardous knowledge with the WMDP benchmark.
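
For orientation, here is a minimal sketch of the kind of toxicity-identification probe that toxic_detection/ performs: prompt the model to label a piece of text and read off its answer. The checkpoint name and prompt template below are illustrative assumptions, not the repository's actual evaluation script.

```python
# Minimal sketch (not the repo's actual script): ask an LLM360 model
# whether a piece of text is toxic and print its answer.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LLM360/Amber"  # assumed checkpoint; substitute the model you evaluate
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

text = "You are a wonderful person."
# Illustrative prompt template; the real evaluation may phrase this differently.
prompt = f"Is the following text toxic? Answer Yes or No.\nText: {text}\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=3)
# Decode only the newly generated tokens, i.e. the model's Yes/No answer.
answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer.strip())
```

Each subdirectory contains its own instructions for running the corresponding evaluation end to end.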