From 299677f58bbb8886f2744cb37c6973892c105fb3 Mon Sep 17 00:00:00 2001 From: Wonhyeong Seo Date: Wed, 16 Aug 2023 16:17:24 +0900 Subject: [PATCH 1/5] docs: feat: model resources for llama2 Co-authored-by: Woojun Jung --- docs/source/en/model_doc/llama2.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/docs/source/en/model_doc/llama2.md b/docs/source/en/model_doc/llama2.md index 212b8dfad5ef84..7c8e1c6114fb8c 100644 --- a/docs/source/en/model_doc/llama2.md +++ b/docs/source/en/model_doc/llama2.md @@ -55,6 +55,24 @@ come in several checkpoints they each contain a part of each weight of the model This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ) with contributions from [Lysandre Debut](https://huggingface.co/lysandre). The code of the implementation in Hugging Face is based on GPT-NeoX [here](https://github.com/EleutherAI/gpt-neox). The original code of the authors can be found [here](https://github.com/facebookresearch/llama). +## Resources + +A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with LLaMA2. If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource. + + + +- A blog on [Llama 2 is here - get it on Hugging Face](https://huggingface.co/blog/llama2) which introduces Llama 2, a family of state-of-the-art open-access large language models released by Meta. This release is fully supported by Hugging Face with comprehensive integration, including models on the Hub, Transformers integration, and examples for fine-tuning. +- A blog on [DPO and Llama 2](https://huggingface.co/blog/dpo-trl) (Link to be updated after extracting content). + +⚡️ Inference + +- 🌎 A blog on [How to use SageMaker with Llama2 and QLora](https://www.philschmid.de/sagemaker-llama2-qlora) by Phil Schmid. +- 🌎 A blog on [Introduction to Llama 2](https://www.philschmid.de/llama-2) by Phil Schmid. +- 🌎 A blog on [How to instruction-tune Llama 2](https://www.philschmid.de/instruction-tune-llama-2) by Phil Schmid. + +🚀 Deploy + +- 🌎 A blog on [SageMaker, Llama, and LLM](https://www.philschmid.de/sagemaker-llama-llm) by Phil Schmid (Deployment of Llama models using SageMaker). ## LlamaConfig From 92defe8740fd2e4bc2d102ea367e96b9513f5293 Mon Sep 17 00:00:00 2001 From: Wonhyeong Seo Date: Wed, 16 Aug 2023 17:09:52 +0900 Subject: [PATCH 2/5] fix: add description for dpo and rearrange posts --- docs/source/en/model_doc/llama2.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/docs/source/en/model_doc/llama2.md b/docs/source/en/model_doc/llama2.md index 7c8e1c6114fb8c..9d946a538bf4f0 100644 --- a/docs/source/en/model_doc/llama2.md +++ b/docs/source/en/model_doc/llama2.md @@ -62,17 +62,20 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h - A blog on [Llama 2 is here - get it on Hugging Face](https://huggingface.co/blog/llama2) which introduces Llama 2, a family of state-of-the-art open-access large language models released by Meta. This release is fully supported by Hugging Face with comprehensive integration, including models on the Hub, Transformers integration, and examples for fine-tuning. -- A blog on [DPO and Llama 2](https://huggingface.co/blog/dpo-trl) (Link to be updated after extracting content). 
+ +⚗️ Optimization + +- A blog on [Fine-tune Llama 2 with DPO](https://huggingface.co/blog/dpo-trl) which discusses the Direct Preference Optimization (DPO) method, now available in the TRL library. The post demonstrates how to fine-tune the Llama v2 7B-parameter model using the stack-exchange preference dataset. +- A blog on [How to instruction-tune Llama 2](https://www.philschmid.de/instruction-tune-llama-2) by Phil Schmid, detailing the methods and techniques for optimizing the performance of Llama 2. 🌎 ⚡️ Inference -- 🌎 A blog on [How to use SageMaker with Llama2 and QLora](https://www.philschmid.de/sagemaker-llama2-qlora) by Phil Schmid. -- 🌎 A blog on [Introduction to Llama 2](https://www.philschmid.de/llama-2) by Phil Schmid. -- 🌎 A blog on [How to instruction-tune Llama 2](https://www.philschmid.de/instruction-tune-llama-2) by Phil Schmid. +- A blog on [How to use SageMaker with Llama2 and QLora](https://www.philschmid.de/sagemaker-llama2-qlora) by Phil Schmid. 🌎 +- A blog on [Introduction to Llama 2](https://www.philschmid.de/llama-2) by Phil Schmid. 🌎 🚀 Deploy -- 🌎 A blog on [SageMaker, Llama, and LLM](https://www.philschmid.de/sagemaker-llama-llm) by Phil Schmid (Deployment of Llama models using SageMaker). +- A blog on [SageMaker, Llama, and LLM](https://www.philschmid.de/sagemaker-llama-llm) by Phil Schmid (Deployment of Llama models using SageMaker). 🌎 ## LlamaConfig From f9558b009bc9d31f25f90d6c6bc71eaba804180e Mon Sep 17 00:00:00 2001 From: Wonhyeong Seo Date: Thu, 17 Aug 2023 08:05:27 +0900 Subject: [PATCH 3/5] docs: feat: add llama2 notebook resources * style: one liners for each resource Co-Authored-By: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-Authored-By: Kihoon Son <75935546+kihoon71@users.noreply.github.com> --- docs/source/en/model_doc/llama2.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/docs/source/en/model_doc/llama2.md b/docs/source/en/model_doc/llama2.md index 9d946a538bf4f0..a2aadcaf21d2a4 100644 --- a/docs/source/en/model_doc/llama2.md +++ b/docs/source/en/model_doc/llama2.md @@ -59,23 +59,27 @@ This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ) wi A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with LLaMA2. If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource. +- A blog on how to [understand Llama 2's features and its collaboration with Hugging Face](https://huggingface.co/blog/llama2). +- A blog on [an introduction to Llama 2](https://www.philschmid.de/llama-2) by Phil Schmid. + -- A blog on [Llama 2 is here - get it on Hugging Face](https://huggingface.co/blog/llama2) which introduces Llama 2, a family of state-of-the-art open-access large language models released by Meta. This release is fully supported by Hugging Face with comprehensive integration, including models on the Hub, Transformers integration, and examples for fine-tuning. +- A notebook on how to [fine-tune Llama 2 in Google Colab using QLoRA and 4-bit precision](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing). 🌎 +- A notebook on how to [fine-tune the "Llama-v2-7b-guanaco" model with 4-bit qlora and generate Q&A datasets from PDFs](https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharing). 
🌎 ⚗️ Optimization - -- A blog on [Fine-tune Llama 2 with DPO](https://huggingface.co/blog/dpo-trl) which discusses the Direct Preference Optimization (DPO) method, now available in the TRL library. The post demonstrates how to fine-tune the Llama v2 7B-parameter model using the stack-exchange preference dataset. -- A blog on [How to instruction-tune Llama 2](https://www.philschmid.de/instruction-tune-llama-2) by Phil Schmid, detailing the methods and techniques for optimizing the performance of Llama 2. 🌎 +- A blog on how to [fine-tune Llama 2 using the Direct Preference Optimization (DPO) method](https://huggingface.co/blog/dpo-trl). +- A blog on how to [instruction-tune Llama 2 for optimized performance](https://www.philschmid.de/instruction-tune-llama-2) by Phil Schmid. +- A notebook on how to [fine-tune the Llama 2 model on a personal computer using QLoRa and TRL](https://colab.research.google.com/drive/1SYpgFpcmtIUzdE7pxqknrM4ArCASfkFQ?usp=sharing). 🌎 ⚡️ Inference - -- A blog on [How to use SageMaker with Llama2 and QLora](https://www.philschmid.de/sagemaker-llama2-qlora) by Phil Schmid. 🌎 -- A blog on [Introduction to Llama 2](https://www.philschmid.de/llama-2) by Phil Schmid. 🌎 +- A notebook on how to [quantize the Llama 2 model using GPTQ and the AutoGPTQ library](https://colab.research.google.com/drive/1TC56ArKerXUpbgRy5vM3woRsbTEVNq7h?usp=sharing). 🌎 +- A notebook on how to [run the Llama 2 Chat Model with 4-bit quantization on a local computer or Google Colab](https://colab.research.google.com/drive/1X1z9Q6domMKl2CnEM0QGHNwidLfR4dW2?usp=sharing). 🌎 🚀 Deploy +- A blog on how to [use SageMaker with Llama2 and QLora for efficient model deployment](https://www.philschmid.de/sagemaker-llama2-qlora) by Phil Schmid. +- A blog on how to [deploy Llama models using SageMaker for scalable applications](https://www.philschmid.de/sagemaker-llama-llm) by Phil Schmid. -- A blog on [SageMaker, Llama, and LLM](https://www.philschmid.de/sagemaker-llama-llm) by Phil Schmid (Deployment of Llama models using SageMaker). 🌎 ## LlamaConfig From 0feb365bf8655d4237eece75c9a9c654b7af15b6 Mon Sep 17 00:00:00 2001 From: Wonhyeong Seo Date: Tue, 22 Aug 2023 09:00:36 +0900 Subject: [PATCH 4/5] Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --- docs/source/en/model_doc/llama2.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/source/en/model_doc/llama2.md b/docs/source/en/model_doc/llama2.md index a2aadcaf21d2a4..da8e3b0a0250c9 100644 --- a/docs/source/en/model_doc/llama2.md +++ b/docs/source/en/model_doc/llama2.md @@ -59,26 +59,26 @@ This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ) wi A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with LLaMA2. If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource. -- A blog on how to [understand Llama 2's features and its collaboration with Hugging Face](https://huggingface.co/blog/llama2). -- A blog on [an introduction to Llama 2](https://www.philschmid.de/llama-2) by Phil Schmid. +- [Llama 2 is here - get it on Hugging Face](https://huggingface.co/blog/llama2), a blog post about Llama 2 and how to use it with 🤗 Transformers and 🤗 PEFT. 
+- [LLaMA 2 - Every Resource you need](https://www.philschmid.de/llama-2), a compilation of relevant resources to learn about LLaMA 2 and how to get started quickly. -- A notebook on how to [fine-tune Llama 2 in Google Colab using QLoRA and 4-bit precision](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing). 🌎 -- A notebook on how to [fine-tune the "Llama-v2-7b-guanaco" model with 4-bit qlora and generate Q&A datasets from PDFs](https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharing). 🌎 +- A [notebook](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing) on how to fine-tune Llama 2 in Google Colab using QLoRA and 4-bit precision. 🌎 +- A [notebook](https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharing) on how to fine-tune the "Llama-v2-7b-guanaco" model with 4-bit QLoRA and generate Q&A datasets from PDFs. 🌎 ⚗️ Optimization -- A blog on how to [fine-tune Llama 2 using the Direct Preference Optimization (DPO) method](https://huggingface.co/blog/dpo-trl). -- A blog on how to [instruction-tune Llama 2 for optimized performance](https://www.philschmid.de/instruction-tune-llama-2) by Phil Schmid. -- A notebook on how to [fine-tune the Llama 2 model on a personal computer using QLoRa and TRL](https://colab.research.google.com/drive/1SYpgFpcmtIUzdE7pxqknrM4ArCASfkFQ?usp=sharing). 🌎 +- [Fine-tune Llama 2 with DPO](https://huggingface.co/blog/dpo-trl), a guide to using the TRL library's DPO method to fine tune Llama 2 on a specific dataset. +- [Extended Guide: Instruction-tune Llama 2](https://www.philschmid.de/instruction-tune-llama-2), a guide to training Llama 2 to generate instructions from inputs, transforming the model from instruction-following to instruction-giving. +- A [notebook](https://colab.research.google.com/drive/1SYpgFpcmtIUzdE7pxqknrM4ArCASfkFQ?usp=sharing) on how to fine-tune the Llama 2 model on a personal computer using QLoRa and TRL. 🌎 ⚡️ Inference -- A notebook on how to [quantize the Llama 2 model using GPTQ and the AutoGPTQ library](https://colab.research.google.com/drive/1TC56ArKerXUpbgRy5vM3woRsbTEVNq7h?usp=sharing). 🌎 -- A notebook on how to [run the Llama 2 Chat Model with 4-bit quantization on a local computer or Google Colab](https://colab.research.google.com/drive/1X1z9Q6domMKl2CnEM0QGHNwidLfR4dW2?usp=sharing). 🌎 +- A [notebook]((https://colab.research.google.com/drive/1TC56ArKerXUpbgRy5vM3woRsbTEVNq7h?usp=sharing) on how to quantize the Llama 2 model using GPTQ from the AutoGPTQ library. 🌎 +- A [notebook](https://colab.research.google.com/drive/1X1z9Q6domMKl2CnEM0QGHNwidLfR4dW2?usp=sharing) on how to run the Llama 2 Chat Model with 4-bit quantization on a local computer or Google Colab. 🌎 🚀 Deploy -- A blog on how to [use SageMaker with Llama2 and QLora for efficient model deployment](https://www.philschmid.de/sagemaker-llama2-qlora) by Phil Schmid. -- A blog on how to [deploy Llama models using SageMaker for scalable applications](https://www.philschmid.de/sagemaker-llama-llm) by Phil Schmid. +- [Fine-tune LLaMA 2 (7-70B) on Amazon SageMaker](https://www.philschmid.de/sagemaker-llama2-qlora), a complete guide from setup to QLoRA fine-tuning and deployment on Amazon SageMaker. +- [Deploy Llama 2 7B/13B/70B on Amazon SageMaker](https://www.philschmid.de/sagemaker-llama-llm), a guide on using Hugging Face's LLM DLC container for secure and scalable deployment. 
## LlamaConfig From 479affa3358345718448ca00d925e72ba6acaade Mon Sep 17 00:00:00 2001 From: Wonhyeong Seo Date: Wed, 23 Aug 2023 08:13:31 +0900 Subject: [PATCH 5/5] Fix typo Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --- docs/source/en/model_doc/llama2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/en/model_doc/llama2.md b/docs/source/en/model_doc/llama2.md index da8e3b0a0250c9..73ca0dc6e32f85 100644 --- a/docs/source/en/model_doc/llama2.md +++ b/docs/source/en/model_doc/llama2.md @@ -73,7 +73,7 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h - A [notebook](https://colab.research.google.com/drive/1SYpgFpcmtIUzdE7pxqknrM4ArCASfkFQ?usp=sharing) on how to fine-tune the Llama 2 model on a personal computer using QLoRa and TRL. 🌎 ⚡️ Inference -- A [notebook]((https://colab.research.google.com/drive/1TC56ArKerXUpbgRy5vM3woRsbTEVNq7h?usp=sharing) on how to quantize the Llama 2 model using GPTQ from the AutoGPTQ library. 🌎 +- A [notebook](https://colab.research.google.com/drive/1TC56ArKerXUpbgRy5vM3woRsbTEVNq7h?usp=sharing) on how to quantize the Llama 2 model using GPTQ from the AutoGPTQ library. 🌎 - A [notebook](https://colab.research.google.com/drive/1X1z9Q6domMKl2CnEM0QGHNwidLfR4dW2?usp=sharing) on how to run the Llama 2 Chat Model with 4-bit quantization on a local computer or Google Colab. 🌎 🚀 Deploy
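
The resources added in the patches above all build on the same basic recipe: load a Llama 2 checkpoint in 4-bit precision with bitsandbytes, then either run inference directly or attach LoRA adapters with 🤗 PEFT for QLoRA-style fine-tuning. The sketch below is a minimal illustration of that recipe, not code taken from any of the linked notebooks or blog posts; it assumes a CUDA GPU, the `bitsandbytes` and `peft` packages alongside 🤗 Transformers, access to the gated `meta-llama/Llama-2-7b-chat-hf` checkpoint on the Hub, and purely illustrative hyperparameters.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated: accept the license on the Hub first

# 4-bit NF4 quantization, the loading setup used for QLoRA-style workflows
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Inference sanity check: generate a short completion from the quantized chat model
inputs = tokenizer("Explain QLoRA in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# For fine-tuning, keep the 4-bit base model frozen and train only small LoRA adapters
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,                                 # adapter rank (illustrative value)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections commonly targeted for Llama
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the weights are trainable
```

From here, the full training loop, DPO preference tuning with TRL, GPTQ quantization with AutoGPTQ, and SageMaker deployment are covered by the blog posts and notebooks listed in the Resources section.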