The code in this repository fine-tunes a Llama 2 model on a 1,000-sample subset of the Databricks Dolly 15k instruction dataset using supervised fine-tuning (SFT) with 4-bit QLoRA quantization.
- Clone this repository:

  ```shell
  git clone https://github.com/golkir/llama2-7b-minidatabricks.git
  cd llama2-7b-minidatabricks
  ```
- Install dependencies:

  ```shell
  pip install .
  ```
- Run the dataset subset creation script, which fetches the Dolly 15k dataset and converts it to the Llama 2 instruction format:

  ```shell
  python load-databricks.py
  ```
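Llama 2's instruction format wraps each prompt in `[INST] … [/INST]` markers. A minimal sketch of the kind of conversion `load-databricks.py` performs (the field names `instruction`, `context`, and `response` follow the Dolly 15k schema; the helper function itself is hypothetical, not the repository's actual code):

```python
def format_dolly_record(record: dict) -> str:
    """Turn one Dolly 15k record into a Llama 2 instruction-tuning sample.

    Hypothetical sketch: wraps the prompt in [INST] ... [/INST] markers,
    prepending any supporting context to the instruction body.
    """
    instruction = record["instruction"]
    context = record.get("context", "").strip()
    response = record["response"]
    prompt = f"{instruction}\n\n{context}" if context else instruction
    return f"<s>[INST] {prompt} [/INST] {response} </s>"


sample = {
    "instruction": "What is QLoRA?",
    "context": "",
    "response": "QLoRA fine-tunes a 4-bit quantized model with LoRA adapters.",
}
print(format_dolly_record(sample))
```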
- Run the fine-tuning script:

  ```shell
  python finetuning.py
  ```
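QLoRA means loading the base model's weights in 4-bit precision and training only small LoRA adapters on top of them. A minimal configuration sketch of how a script like `finetuning.py` might set this up with `transformers` and `peft` (all hyperparameter values here are illustrative assumptions, not the repository's actual settings):

```python
# Illustrative QLoRA configuration sketch -- hyperparameter values are
# assumptions, not necessarily those used by finetuning.py.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# Load the frozen base model in 4-bit NF4 precision (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Train only small low-rank adapter matrices on top of the quantized weights.
peft_config = LoraConfig(
    r=64,                  # adapter rank (assumed value)
    lora_alpha=16,         # scaling factor (assumed value)
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)
```

Both config objects would then be passed to the model loader and trainer (e.g. `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)`).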
- The Dolly 15k dataset is originally provided by Databricks: [Databricks Dolly 15k dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k).
- The Llama 2 model can be found on the Hugging Face Hub.
This code is licensed under the Apache 2.0 License.