SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

This is the project page for the paper "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors".

heatmap
resources
sorry_bench_files
.gitattributes
.gitignore
README.md
index.html
