
EBench

Released code for the paper "A Unified Framework for Benchmarking Social Group Bias in Large Language Models".

Installation

We assume conda / miniconda is already installed. Install the dependencies through conda:

# clone our repo
git clone https://github.com/Hongji1001/llm-bias.git
cd llm-bias
# remove the old environment if necessary
conda deactivate; conda env remove --name bias-benchmark
# create and activate the environment
conda env create -f env.yaml --name bias-benchmark
conda activate bias-benchmark

Note: Before evaluating generation outputs, you should download the hard-debiased version of Google's Word2Vec embeddings trained on Google News here.
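
If you want to sanity-check the downloaded embeddings before running the evaluation, here is a minimal sketch using gensim (the file name below is an assumption; use the path of your actual download):

# minimal sketch: load the hard-debiased Word2Vec embeddings with gensim
# (requires `pip install gensim`; the file name is an assumption, adjust to your download)
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300-hard-debiased.bin", binary=True
)
print(vectors["king"].shape)  # expect a 300-dimensional vector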

Benchmark on Classification

# You can modify setting.py to benchmark your own models.
python3 classification_benchmark.py  # few-shot evaluation
python3 classification_benchmark.py --eval_only  # zero-shot evaluation
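
The exact contents of setting.py are specific to this repo; as a purely illustrative sketch, it might hold a list of Hugging Face model identifiers to benchmark. The variable name below is hypothetical, so check the actual file before editing:

# hypothetical sketch of a setting.py entry; the variable name is an
# assumption for illustration, not the repo's actual schema
MODELS = [
    "facebook/opt-1.3b",
    "meta-llama/Llama-2-7b-chat-hf",
    "lmsys/vicuna-7b-v1.5",
]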

Benchmark on Generation

# You can modify setting.py to benchmark your own models.
python3 generation_benchmark.py --gen_only   # generate completions only
python3 generation_benchmark.py --eval_only  # evaluate the completions stored at --completions_path
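
The datasets (and typically the generated completions) are stored as JSONL, one JSON object per line. If you want to peek at a file before evaluating, here is a minimal sketch that assumes only the JSONL layout and prints each record's keys rather than guessing the schema:

# minimal sketch: inspect the first few records of a JSONL file
# (data/Ebench_cnn_dailymail.jsonl is taken from the commands in this README;
# the record schema is unknown here, so we just print the keys)
import json

with open("data/Ebench_cnn_dailymail.jsonl") as f:
    for i, line in enumerate(f):
        record = json.loads(line)
        print(sorted(record.keys()))
        if i >= 2:  # stop after three records
            break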

# To reproduce the mitigation experiment, you should run the following:
bash scripts/debias_generation_model.sh meta-llama/Llama-2-7b-chat-hf checkpoint/meta-llama/Llama-2-7b-chat-hf data/Ebench_cnn_dailymail.jsonl 0,1,2,3
python3 generation_benchmark.py --model_name_or_path checkpoint/meta-llama/Llama-2-7b-chat-hf

Dataset Collection

# You can re-generate the datasets mentioned in our paper as follows
# (choose one corpus: bookcorpus or cnn_dailymail):
python3 scripts/generate_dataset.py 20 10 40 bookcorpus text

Debias

cd llm-unlearning
conda create --name unlearn python=3.11
conda activate unlearn
pip install -r requirements.txt
mkdir logs checkpoint
# debias OPT-1.3b
bash scripts/debias_generation_model.sh facebook/opt-1.3b checkpoint/facebook-opt-1.3b_unlearn
# debias Llama-2-7b-chat
bash scripts/debias_generation_model.sh meta-llama/Llama-2-7b-chat-hf checkpoint/meta-llama-Llama-2-7b-chat-hf_unlearn

Benchmark

python benchmark.py --model_name_or_path facebook/opt-1.3b --gen_only --dataset_paths "data/Ebench_bookcorpus.jsonl" "data/Ebench_cnn_dailymail.jsonl" "BOLD.jsonl"
python benchmark.py --model_name_or_path lmsys/vicuna-7b-v1.5 --gen_only --dataset_paths "data/Ebench_bookcorpus.jsonl" "data/Ebench_cnn_dailymail.jsonl" "BOLD.jsonl"
python benchmark.py --model_name_or_path mistralai/Mistral-7B-Instruct-v0.3 --gen_only --dataset_paths "data/Ebench_bookcorpus.jsonl" "data/Ebench_cnn_dailymail.jsonl" "BOLD.jsonl"

# after debiasing
python benchmark.py --model_name_or_path checkpoint/facebook-opt-1.3b_unlearn --gen_only --dataset_paths "data/Ebench_bookcorpus.jsonl" "data/Ebench_cnn_dailymail.jsonl" "BOLD.jsonl"
python benchmark.py --model_name_or_path checkpoint/meta-llama-Llama-2-7b-chat-hf_unlearn --gen_only --dataset_paths "data/Ebench_bookcorpus.jsonl" "data/Ebench_cnn_dailymail.jsonl" "BOLD.jsonl"
