Name		Name	Last commit message	Last commit date
parent directory ..
bold		bold
toxic_detection		toxic_detection
toxigen		toxigen
wmdp		wmdp
README.md		README.md

README.md

Safety360

Welcome to the safety360/ directory! This folder contains various AI safety implementations for LLM360 models.

We currently include the following folders:

bold/ provides sentiment analysis with BOLD dataset.
toxic_detection/ measures model's capability to identify toxic text.
toxigen/ evaluate model's toxicity on text generation.
wmdp/ evaluate model's hazardous knowledge.