This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

Commit 1b65756
update readme for merge model
zhouyu5 committed Oct 10, 2023
1 parent 104c211
Showing 2 changed files with 7 additions and 13 deletions.
19 changes: 6 additions & 13 deletions tests/deltatuner/finetune/merge_model/readme.md
@@ -27,7 +27,6 @@ python instruction_tuning_pipeline/finetune_clm.py \
--output_dir "$DATA_PATH/llama2-7b-ssf-denas-bf16" \
--delta ssf \
--denas True \
--bf16 True \
| tee llama2-7b-ssf-denas-bf16-1epoch.log
```

@@ -39,13 +38,10 @@ python instruction_tuning_pipeline/finetune_clm.py \
--model_name_or_path "$DATA_PATH/Llama-2-7b-hf" \
--train_file "$DATA_PATH/alpaca_data.json" \
--dataset_concatenation \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 8 \
--gradient_accumulation_steps 1 \
--validation_split_percentage 30 \
--do_eval \
--learning_rate 1e-4 \
--num_train_epochs 1 \
--logging_steps 100 \
--save_total_limit 1 \
--log_level info \
@@ -54,29 +50,27 @@ python instruction_tuning_pipeline/finetune_clm.py \
--no_cuda \
--output_dir "$DATA_PATH/llama2-7b-ssf-denas-bf16-merge" \
--delta ssf \
--bf16 True \
--resume_peft "$DATA_PATH/llama2-7b-ssf-denas-bf16" \
--save_merged_model True \
--denas "$DATA_PATH/llama2-7b-ssf-denas-bf16/best_model_structure.txt" \
--merge_model_code_dir "instruction_tuning_pipeline/models/llama2-ssf" \
--debugs
```

### 3. Evaluate merged model
Because SSF enables bias terms while the default Llama2 model disables all biases, the model definition must be changed so the full set of adapter parameters can be loaded.
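This is why the merged checkpoint needs bias support: SSF's per-channel scale folds into the weight matrix, but its shift can only survive as a bias term. A minimal sketch of that idea, with illustrative values and assuming the standard scale-and-shift formulation (not the repo's actual merge code):

```python
# Sketch (illustrative, not the repo's code): SSF computes
# y = gamma * (W x) + beta per output channel. Merging folds gamma
# into W, and beta becomes a real bias, so the merged Llama2 model
# must enable bias even though stock Llama2 has none.
W = [[1.0, 2.0], [3.0, -1.0]]   # original weight, no bias
gamma = [0.5, 2.0]               # SSF per-channel scale
beta = [0.1, -0.2]               # SSF per-channel shift

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

x = [1.0, -1.0]
# Adapter applied at runtime:
ssf_out = [g * y + b for g, y, b in zip(gamma, matvec(W, x), beta)]

# Merged form: scale folded into the weight, shift kept as a bias.
W_merged = [[g * w for w in row] for g, row in zip(gamma, W)]
b_merged = beta
merged_out = [y + b for y, b in zip(matvec(W_merged, x), b_merged)]

print(ssf_out == merged_out)  # prints True
```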

First, specify the `merge_model_code_dir` argument; it copies the updated model code alongside the merged weights.

Then it automatically updates the "best_model_structure" and "target_modules" settings in `config.json`; if DE-NAS is not enabled or "target_modules" keeps its default, the corresponding setting is skipped.
The changed `config.json` looks like this:
```shell
...
"best_model_structure": {"num_hidden_layers": [1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]}, # change to your best structure, skip to keep default
"target_modules": ["q_proj", "v_proj"], # change to your setting, skip to keep default
...
```
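The same two keys can also be written by hand. A hypothetical sketch of patching a `config.json` (the structure list and module names below are placeholders; use the values from your own DE-NAS search and adapter configuration):

```python
import json, os, tempfile

# Hypothetical sketch: write the two settings shown above into config.json.
# Values are placeholders, not tied to the pipeline's actual code.
cfg = {"model_type": "llama"}  # stand-in for the merged model's real config
cfg["best_model_structure"] = {"num_hidden_layers": [1, 1, 0, 1]}
cfg["target_modules"] = ["q_proj", "v_proj"]

path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)

with open(path) as f:
    print(json.load(f)["target_modules"])  # prints ['q_proj', 'v_proj']
```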

Finally, we can directly evaluate the merged model.
```shell
python instruction_tuning_pipeline/finetune_clm.py \
@@ -93,5 +87,4 @@ python instruction_tuning_pipeline/finetune_clm.py \
--trust_remote_code True \
--no_cuda \
--output_dir "$DATA_PATH/llama2-7b-ssf-denas-bf16-merge/eval_merge" \
--bf16 True
```
1 change: 1 addition & 0 deletions tests/deltatuner/finetune/merge_model/ssf-merge-test.sh
@@ -19,6 +19,7 @@ python instruction_tuning_pipeline/finetune_clm.py \
--delta ssf \
--resume_peft "$DATA_PATH/mpt-7b-ssf-allmodules-denas-bf16" \
--save_merged_model True \
--merge_model_code_dir instruction_tuning_pipeline/models/llama2-ssf \
--debugs

