- Make the raw data directory by running `Data/make_raw_files_dir.sh`.
- Download the raw files and save them to the paths below:
  - `raw/auto_judge/pairwise_traindata.jsonl` from `pairwise_traindata.jsonl`
  - `raw/auto_judge/testdata_pairwise.jsonl` from `testdata_pairwise.jsonl`
  - `raw/llm_bar/GPTInst/dataset.json` from `result.json`
  - `raw/llm_bar/GPTOut/dataset.json` from `result.json`
  - `raw/llm_bar/Manual/dataset.json` from `result.json`
  - `raw/llm_bar/Natural/dataset.json` from `result.json`
  - `raw/llm_bar/Neighbor/dataset.json` from `result.json`
  - `raw/mt_bench/gpt-4_pair.jsonl` from `gpt-4_pair.jsonl`
  - `raw/panda_lm/pandalm_test.json` from `pandalm_test.json`
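After downloading, the expected layout can be sanity-checked with a short script like the following. The paths mirror the list above; the base directory `Data` is an assumption about where `raw/` lives in your checkout.

```python
import os

# Expected raw files, relative to the data directory (paths from the list above).
EXPECTED_RAW_FILES = [
    "raw/auto_judge/pairwise_traindata.jsonl",
    "raw/auto_judge/testdata_pairwise.jsonl",
    "raw/llm_bar/GPTInst/dataset.json",
    "raw/llm_bar/GPTOut/dataset.json",
    "raw/llm_bar/Manual/dataset.json",
    "raw/llm_bar/Natural/dataset.json",
    "raw/llm_bar/Neighbor/dataset.json",
    "raw/mt_bench/gpt-4_pair.jsonl",
    "raw/panda_lm/pandalm_test.json",
]

def check_raw_files(base="Data"):
    """Return the expected raw files that are missing under `base`."""
    return [p for p in EXPECTED_RAW_FILES
            if not os.path.isfile(os.path.join(base, p))]

if __name__ == "__main__":
    for p in check_raw_files():
        print("missing:", p)
```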
- Format the raw data into the unified data format by running the corresponding scripts in `Data/scripts/process`.
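As a rough illustration of what one of these per-dataset processing scripts does, the sketch below converts PandaLM-style records into a hypothetical unified pairwise schema. The input field names (`response1`, `annotator1`) and the unified field names (`instruction`, `response_1`, `response_2`, `label`) are assumptions for illustration; the actual schema is defined by the scripts in `Data/scripts/process`.

```python
import json

def convert_pandalm(in_path, out_path):
    """Convert a PandaLM-style JSON file into a unified pairwise format.

    All field names here are illustrative assumptions, not the project's
    actual schema.
    """
    with open(in_path, encoding="utf-8") as f:
        records = json.load(f)
    unified = []
    for r in records:
        unified.append({
            "instruction": r.get("instruction", ""),
            "response_1": r.get("response1", ""),   # assumed input field name
            "response_2": r.get("response2", ""),   # assumed input field name
            "label": r.get("annotator1", -1),       # assumed preference label
        })
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(unified, f, ensure_ascii=False, indent=2)
```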
- Combine the formatted data into the OpenAI format by running `Data/scripts/prompts/combine.py`, and generate test data in the same format by running `Data/scripts/prompts/combine_test.py`.
- Soft-link the output OpenAI-format training file into `LLaMA-Factory/data` and add an entry to `dataset_info.json` to fit the LLaMA-Factory training pipeline. Here is an example:

  ```json
  "train_openai": {
      "file_name": "train_openai.json",
      "formatting": "sharegpt",
      "columns": {
          "messages": "messages"
      },
      "tags": {
          "role_tag": "role",
          "content_tag": "content",
          "user_tag": "user",
          "assistant_tag": "assistant",
          "system_tag": "system"
      }
  },
  ```
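The soft-linking step can be done with `ln -s` or a few lines of Python like the sketch below. The source path `Data/outputs/train_openai.json` is an assumption; use wherever `combine.py` actually writes its output.

```python
import os

def link_training_file(src="Data/outputs/train_openai.json",      # assumed output path
                       dst="LLaMA-Factory/data/train_openai.json"):
    """Soft-link the combined training file into LLaMA-Factory/data."""
    src = os.path.abspath(src)          # absolute target so the link survives cwd changes
    if os.path.islink(dst) or os.path.exists(dst):
        os.remove(dst)                  # replace a stale link or file if present
    os.symlink(src, dst)
```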
- See LLaMA-Factory for the supported pretrained large language models, create a sub-folder `Models`, and put the pretrained checkpoints in it.
- Reference our training examples in Train scripts; note that the configs for these examples are saved in Configs.
- See the test scripts in Test scripts for more details. Judge models are tested on the PandaLM, LLM-Bar, MT-Bench, and Auto-J test sets against human preference.