diff --git a/3p-integrations/crusoe/vllm-fp8/README.md b/3p-integrations/crusoe/vllm-fp8/README.md index 1c26f9413..8024dfacf 100644 --- a/3p-integrations/crusoe/vllm-fp8/README.md +++ b/3p-integrations/crusoe/vllm-fp8/README.md @@ -23,8 +23,8 @@ source $HOME/.cargo/env Now, clone the recipes and navigate to this tutorial. Initialize the virtual environment and install dependencies: ```bash -git clone https://github.com/meta-llama/llama-recipes.git -cd llama-recipes/recipes/3p_integrations/crusoe/vllm-fp8/ +git clone https://github.com/meta-llama/llama-cookbook.git +cd llama-cookbook/recipes/3p_integrations/crusoe/vllm-fp8/ uv add vllm setuptools ``` diff --git a/3p-integrations/llama_on_prem.md b/3p-integrations/llama_on_prem.md index fea53cf05..e725f5702 100644 --- a/3p-integrations/llama_on_prem.md +++ b/3p-integrations/llama_on_prem.md @@ -1,6 +1,6 @@ # Llama 3 On-Prem Inference Using vLLM and TGI -Enterprise customers may prefer to deploy Llama 3 on-prem and run Llama in their own servers. This tutorial shows how to use Llama 3 with [vLLM](https://github.com/vllm-project/vllm) and Hugging Face [TGI](https://github.com/huggingface/text-generation-inference), two leading open-source tools to deploy and serve LLMs, and how to create vLLM and TGI hosted Llama 3 instances with [LangChain](https://www.langchain.com/), an open-source LLM app development framework which we used for our other demo apps: [Getting to Know Llama](../getting-started/build_with_Llama_3_2.ipynb), Running Llama 3 [locally](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Running_Llama3_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb) and [in the cloud](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/RAG/hello_llama_cloud.ipynb). See [here](https://medium.com/@rohit.k/tgi-vs-vllm-making-informed-choices-for-llm-deployment-37c56d7ff705) for a detailed comparison of vLLM and TGI. 
+Enterprise customers may prefer to deploy Llama 3 on-prem and run Llama in their own servers. This tutorial shows how to use Llama 3 with [vLLM](https://github.com/vllm-project/vllm) and Hugging Face [TGI](https://github.com/huggingface/text-generation-inference), two leading open-source tools to deploy and serve LLMs, and how to create vLLM and TGI hosted Llama 3 instances with [LangChain](https://www.langchain.com/), an open-source LLM app development framework which we used for our other demo apps: [Getting to Know Llama](../getting-started/build_with_Llama_3_2.ipynb), Running Llama 3 [locally](https://github.com/meta-llama/llama-cookbook/blob/main/recipes/quickstart/Running_Llama3_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb) and [in the cloud](https://github.com/meta-llama/llama-cookbook/blob/main/recipes/quickstart/RAG/hello_llama_cloud.ipynb). See [here](https://medium.com/@rohit.k/tgi-vs-vllm-making-informed-choices-for-llm-deployment-37c56d7ff705) for a detailed comparison of vLLM and TGI. For [Ollama](https://ollama.com) based on-prem inference with Llama 3, see the Running Llama 3 locally notebook above. 
diff --git a/3p-integrations/tgi/README.md b/3p-integrations/tgi/README.md index d167bd204..02fadcc5a 100644 --- a/3p-integrations/tgi/README.md +++ b/3p-integrations/tgi/README.md @@ -9,7 +9,7 @@ In case the model was fine tuned with LoRA method we need to merge the weights o The script takes the base model, the peft weight folder as well as an output as arguments: ``` -python -m llama_recipes.recipes.3p_integration.tgi.merge_lora_weights --base_model llama-7B --peft_model ft_output --output_dir data/merged_model_output +python -m llama_cookbook.recipes.3p_integration.tgi.merge_lora_weights --base_model llama-7B --peft_model ft_output --output_dir data/merged_model_output ``` ## Step 1: Serving the model diff --git a/3p-integrations/using_externally_hosted_llms.ipynb b/3p-integrations/using_externally_hosted_llms.ipynb index 1f9e4dd4a..37d628459 100644 --- a/3p-integrations/using_externally_hosted_llms.ipynb +++ b/3p-integrations/using_externally_hosted_llms.ipynb @@ -1,63 +1,66 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![Meta---Logo@1x.jpg]()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# **Using externally-hosted LLMs**\n", - "Use llama_recipes.inference.llm to perform inference using Llama and other models using third party services. At the moment, three services have been incorporated:\n", - "- Together.ai\n", - "- Anyscale\n", - "- OpenAI\n", - "\n", - "An API token for each service must be obtained and provided to the method before running. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from llama_recipes.inference.llm import TOGETHER, OPENAI, ANYSCALE\n", - "\n", - "together_example = TOGETHER(\"togethercomputer/llama-2-7b-chat\",\"09e45...\")\n", - "print( together_example.query(prompt=\"Why is the sky blue?\"))\n", - "\n", - "\n", - "openai_example = OPENAI(\"gpt-3.5-turbo\",\"sk-LIz9zL3cYp...\")\n", - "print( openai_example.query(prompt=\"Why is the sky blue?\"))\n", - "\n", - "\n", - "anyscale_example = ANYSCALE(\"meta-llama/Llama-2-7b-chat-hf\",\"esecret_c3u4x7...\")\n", - "print( anyscale_example.query(prompt=\"Why is the sky blue?\"))" - ] - } - ], - "metadata": { - "custom": { - "cells": [], - "metadata": { - "fileHeader": "", - "fileUid": "9af50647-0f34-423b-936e-6950218a612f", - "isAdHoc": false, - "language_info": { - "name": "plaintext" - }, - "orig_nbformat": 4 - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![Meta---Logo@1x.jpg]()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# **Using externally-hosted LLMs**\n", + "Use llama_cookbook.inference.llm to perform inference using Llama and other models using third party services. At the moment, three services have been incorporated:\n", + "- Together.ai\n", + "- Anyscale\n", + "- OpenAI\n", + "\n", + "An API token for each service must be obtained and provided to the method before running. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from llama_cookbook.inference.llm import TOGETHER, OPENAI, ANYSCALE\n", + "\n", + "together_example = TOGETHER(\"togethercomputer/llama-2-7b-chat\",\"09e45...\")\n", + "print( together_example.query(prompt=\"Why is the sky blue?\"))\n", + "\n", + "\n", + "openai_example = OPENAI(\"gpt-3.5-turbo\",\"sk-LIz9zL3cYp...\")\n", + "print( openai_example.query(prompt=\"Why is the sky blue?\"))\n", + "\n", + "\n", + "anyscale_example = ANYSCALE(\"meta-llama/Llama-2-7b-chat-hf\",\"esecret_c3u4x7...\")\n", + "print( anyscale_example.query(prompt=\"Why is the sky blue?\"))" + ] + } + ], + "metadata": { + "custom": { + "cells": [], + "metadata": { + "fileHeader": "", + "fileUid": "9af50647-0f34-423b-936e-6950218a612f", + "isAdHoc": false, + "language_info": { + "name": "plaintext" }, - "indentAmount": 2 + "orig_nbformat": 4 + }, + "nbformat": 4, + "nbformat_minor": 2 }, - "nbformat": 4, - "nbformat_minor": 2 + "indentAmount": 2, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/end-to-end-use-cases/RAFT-Chatbot/README.md b/end-to-end-use-cases/RAFT-Chatbot/README.md index 2f5160da6..0eb66325f 100644 --- a/end-to-end-use-cases/RAFT-Chatbot/README.md +++ b/end-to-end-use-cases/RAFT-Chatbot/README.md @@ -116,7 +116,7 @@ As shown in the above example, we have a "question" section for the generated qu To create a reliable evaluation set, it's ideal to use human-annotated question and answer pairs. This ensures that the questions are relevant and the answers are accurate. However, human annotation is time-consuming and costly. For demonstration purposes, we'll use a subset of the validation set, which will never be used in the fine-tuning. We only need to keep the "question" section and the final answer section, marked by the `` tag in "cot_answer". We'll manually check each example and select only the good ones. 
We want to ensure that the questions are general enough to be used for web search engine queries and are related to Llama. We'll also use some QA pairs from our FAQ page, with modifications. This will result in 72 question and answer pairs as our evaluation set, saved as `eval_llama.json`. ## Fine-Tuning Steps -Once the RAFT dataset is ready in JSON format, we can start fine-tuning. Unfortunately, the LORA method didn't produce good results, so we'll use the full fine-tuning method. We can use the following commands as an example in the llama-recipes main folder: +Once the RAFT dataset is ready in JSON format, we can start fine-tuning. Unfortunately, the LORA method didn't produce good results, so we'll use the full fine-tuning method. We can use the following commands as an example in the llama-cookbook main folder: ```bash export PATH_TO_ROOT_FOLDER=./raft-8b @@ -129,7 +129,7 @@ For more details on multi-GPU fine-tuning, please refer to the [multigpu_finetun Next, we need to convert the FSDP checkpoint to a HuggingFace checkpoint using the following command: ```bash -python src/llama_recipes/inference/checkpoint_converter_fsdp_hf.py --fsdp_checkpoint_path "$PATH_TO_ROOT_FOLDER/fine-tuned-meta-Llama/Meta-Llama-3-8B-Instruct" --consolidated_model_path "$PATH_TO_ROOT_FOLDER" +python src/llama_cookbook/inference/checkpoint_converter_fsdp_hf.py --fsdp_checkpoint_path "$PATH_TO_ROOT_FOLDER/fine-tuned-meta-Llama/Meta-Llama-3-8B-Instruct" --consolidated_model_path "$PATH_TO_ROOT_FOLDER" ``` For more details on FSDP to HuggingFace checkpoint conversion, please refer to the [readme](../../getting-started/finetuning/multigpu_finetuning.md) in the inference/local_inference recipe. 
diff --git a/end-to-end-use-cases/benchmarks/llm_eval_harness/meta_eval/README.md b/end-to-end-use-cases/benchmarks/llm_eval_harness/meta_eval/README.md index edf27bc6d..7c8fd75fb 100644 --- a/end-to-end-use-cases/benchmarks/llm_eval_harness/meta_eval/README.md +++ b/end-to-end-use-cases/benchmarks/llm_eval_harness/meta_eval/README.md @@ -25,8 +25,8 @@ Given those differences, the numbers from this recipe can not be compared to the Please install lm-evaluation-harness and our llama-recipe repo by following: ``` -git clone git@github.com:meta-llama/llama-recipes.git -cd llama-recipes +git clone git@github.com:meta-llama/llama-cookbook.git +cd llama-cookbook pip install -U pip setuptools pip install -e . pip install lm-eval[math,ifeval,sentencepiece,vllm]==0.4.3 diff --git a/end-to-end-use-cases/coding/text2sql/quickstart.ipynb b/end-to-end-use-cases/coding/text2sql/quickstart.ipynb index 89fe6b796..41f1a6d66 100644 --- a/end-to-end-use-cases/coding/text2sql/quickstart.ipynb +++ b/end-to-end-use-cases/coding/text2sql/quickstart.ipynb @@ -1,5 +1,5 @@ { - "cells": [ + "cells": [ { "cell_type": "markdown", "id": "e8cba0b6", diff --git a/end-to-end-use-cases/customerservice_chatbots/RAG_chatbot/RAG_Chatbot_Example.ipynb b/end-to-end-use-cases/customerservice_chatbots/RAG_chatbot/RAG_Chatbot_Example.ipynb index 81765ec34..07b18e5f1 100644 --- a/end-to-end-use-cases/customerservice_chatbots/RAG_chatbot/RAG_Chatbot_Example.ipynb +++ b/end-to-end-use-cases/customerservice_chatbots/RAG_chatbot/RAG_Chatbot_Example.ipynb @@ -402,7 +402,7 @@ "In this example, we will be deploying a Meta Llama 3 8B chat HuggingFace model with the Text-generation-inference framework on-permises. \n", "This would allow us to directly wire the API server with our chatbot. \n", "There are alternative solutions to deploy Meta Llama 3 models on-permises as your local API server. 
\n", - "You can find our complete guide [here](https://github.com/meta-llama/llama-recipes/blob/main/recipes/inference/model_servers/llama-on-prem.md)." + "You can find our complete guide [here](https://github.com/meta-llama/llama-cookbook/blob/main/recipes/inference/model_servers/llama-on-prem.md)." ] }, { diff --git a/end-to-end-use-cases/github_triage/README.md b/end-to-end-use-cases/github_triage/README.md index 4d003507f..0ce7b1b2a 100644 --- a/end-to-end-use-cases/github_triage/README.md +++ b/end-to-end-use-cases/github_triage/README.md @@ -32,7 +32,7 @@ pip install -r requirements.txt ### Running the Tool ```bash -python triage.py --repo_name='meta-llama/llama-recipes' --start_date='2024-08-14' --end_date='2024-08-27' +python triage.py --repo_name='meta-llama/llama-cookbook' --start_date='2024-08-14' --end_date='2024-08-27' ``` ### Output diff --git a/end-to-end-use-cases/github_triage/walkthrough.ipynb b/end-to-end-use-cases/github_triage/walkthrough.ipynb index f96413739..91dcf6967 100644 --- a/end-to-end-use-cases/github_triage/walkthrough.ipynb +++ b/end-to-end-use-cases/github_triage/walkthrough.ipynb @@ -1,4 +1,4 @@ -{ +{ "cells": [ { "cell_type": "code", diff --git a/end-to-end-use-cases/responsible_ai/code_shield_usage_demo.ipynb b/end-to-end-use-cases/responsible_ai/code_shield_usage_demo.ipynb index 2866bac90..1ef932b49 100644 --- a/end-to-end-use-cases/responsible_ai/code_shield_usage_demo.ipynb +++ b/end-to-end-use-cases/responsible_ai/code_shield_usage_demo.ipynb @@ -151,7 +151,7 @@ "import os\n", "import getpass\n", "\n", - "from llama_recipes.inference.llm import TOGETHER, OPENAI, ANYSCALE\n", + "from llama_cookbook.inference.llm import TOGETHER, OPENAI, ANYSCALE\n", "\n", "if \"EXTERNALLY_HOSTED_LLM_TOKEN\" not in os.environ:\n", " os.environ[\"EXTERNALLY_HOSTED_LLM_TOKEN\"] = getpass.getpass(prompt=\"Provide token for LLM provider\")\n", diff --git a/end-to-end-use-cases/responsible_ai/llama_guard/README.md 
b/end-to-end-use-cases/responsible_ai/llama_guard/README.md index 2e1ca11c7..0600ed5e4 100644 --- a/end-to-end-use-cases/responsible_ai/llama_guard/README.md +++ b/end-to-end-use-cases/responsible_ai/llama_guard/README.md @@ -6,7 +6,7 @@ This [notebook](llama_guard_text_and_vision_inference.ipynb) shows how to load t ## Requirements 1. Access to Llama guard model weights on Hugging Face. To get access, follow the steps described in the top of the model card in [Hugging Face](https://huggingface.co/meta-llama/Llama-Guard-3-1B) -2. Llama recipes package and its dependencies [installed](https://github.com/meta-llama/llama-recipes?tab=readme-ov-file#installing) +2. Llama recipes package and its dependencies [installed](https://github.com/meta-llama/llama-cookbook?tab=readme-ov-file#installing) 3. Pillow package installed ## Inference Safety Checker diff --git a/end-to-end-use-cases/responsible_ai/llama_guard/llama_guard_customization_via_prompting_and_fine_tuning.ipynb b/end-to-end-use-cases/responsible_ai/llama_guard/llama_guard_customization_via_prompting_and_fine_tuning.ipynb index 5e37ad35b..1a7ca795f 100644 --- a/end-to-end-use-cases/responsible_ai/llama_guard/llama_guard_customization_via_prompting_and_fine_tuning.ipynb +++ b/end-to-end-use-cases/responsible_ai/llama_guard/llama_guard_customization_via_prompting_and_fine_tuning.ipynb @@ -33,7 +33,7 @@ "\n", "Llama Guard is provided with a reference taxonomy explained on [this page](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-guard-3), where the prompting format is also explained. \n", "\n", - "The functions below combine already existing [prompt formatting code in llama-recipes](https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/inference/prompt_format_utils.py) with custom code to aid in the custimization of the taxonomy. 
" + "The functions below combine already existing [prompt formatting code in llama-cookbook](https://github.com/meta-llama/llama-cookbook/blob/main/src/llama_cookbook/inference/prompt_format_utils.py) with custom code to aid in the customization of the taxonomy. " ] }, { @@ -80,7 +80,7 @@ ], "source": [ "from enum import Enum\n", - "from llama_recipes.inference.prompt_format_utils import LLAMA_GUARD_3_CATEGORY, SafetyCategory, AgentType\n", + "from llama_cookbook.inference.prompt_format_utils import LLAMA_GUARD_3_CATEGORY, SafetyCategory, AgentType\n", "from typing import List\n", "\n", "class LG3Cat(Enum):\n", @@ -158,7 +158,7 @@ } ], "source": [ - "from llama_recipes.inference.prompt_format_utils import build_custom_prompt, create_conversation, PROMPT_TEMPLATE_3, LLAMA_GUARD_3_CATEGORY_SHORT_NAME_PREFIX\n", + "from llama_cookbook.inference.prompt_format_utils import build_custom_prompt, create_conversation, PROMPT_TEMPLATE_3, LLAMA_GUARD_3_CATEGORY_SHORT_NAME_PREFIX\n", "from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig\n", "from typing import List, Tuple\n", "from enum import Enum\n", @@ -463,13 +463,13 @@ "\n", "To add additional datasets\n", "\n", - "1. Copy llama-recipes/src/llama_recipes/datasets/toxicchat_dataset.py \n", + "1. Copy llama-cookbook/src/llama_cookbook/datasets/toxicchat_dataset.py \n", "2. Modify the file to change the dataset used\n", "3. 
Add references to the new dataset in \n", - " - llama-recipes/src/llama_recipes/configs/datasets.py\n", - " - llama_recipes/datasets/__init__.py\n", - " - llama_recipes/datasets/toxicchat_dataset.py\n", - " - llama_recipes/utils/dataset_utils.py\n", + " - llama-cookbook/src/llama_cookbook/configs/datasets.py\n", + " - llama_cookbook/datasets/__init__.py\n", + " - llama_cookbook/datasets/toxicchat_dataset.py\n", + " - llama_cookbook/utils/dataset_utils.py\n", "\n", "\n", "## Evaluation\n", @@ -484,7 +484,7 @@ "source": [ "from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig\n", "\n", - "from llama_recipes.inference.prompt_format_utils import build_default_prompt, create_conversation, LlamaGuardVersion\n", + "from llama_cookbook.inference.prompt_format_utils import build_default_prompt, create_conversation, LlamaGuardVersion\n", "from llama.llama.generation import Llama\n", "\n", "from typing import List, Optional, Tuple, Dict\n", @@ -726,7 +726,7 @@ "# \"unsafe_content\": [\"O1\"]\n", "# }\n", "# ```\n", - "from llama_recipes.datasets.toxicchat_dataset import get_llamaguard_toxicchat_dataset\n", + "from llama_cookbook.datasets.toxicchat_dataset import get_llamaguard_toxicchat_dataset\n", "validation_data = get_llamaguard_toxicchat_dataset(None, None, \"train\", return_jsonl = True)[0:100]\n", "run_validation(validation_data, AgentType.USER, Type.HF, load_in_8bit = False, load_in_4bit = True)" ] @@ -757,7 +757,7 @@ "outputs": [], "source": [ "model_id = \"meta-llama/Llama-Guard-3-8B\"\n", - "from llama_recipes import finetuning\n", + "from llama_cookbook import finetuning\n", "\n", "finetuning.main(\n", " model_name = model_id,\n", diff --git a/end-to-end-use-cases/responsible_ai/prompt_guard/README.md b/end-to-end-use-cases/responsible_ai/prompt_guard/README.md index f95da8ad7..43b5972f1 100644 --- a/end-to-end-use-cases/responsible_ai/prompt_guard/README.md +++ b/end-to-end-use-cases/responsible_ai/prompt_guard/README.md @@ -8,4 +8,4 @@ 
This is a very small model and inference and fine-tuning are feasible on local C ## Requirements 1. Access to Prompt Guard model weights on Hugging Face. To get access, follow the steps described [here](https://github.com/facebookresearch/PurpleLlama/tree/main/Prompt-Guard#download) -2. Llama recipes package and it's dependencies [installed](https://github.com/meta-llama/llama-recipes?tab=readme-ov-file#installing) +2. Llama recipes package and its dependencies [installed](https://github.com/meta-llama/llama-cookbook?tab=readme-ov-file#installing) diff --git a/getting-started/README.md b/getting-started/README.md index bd294c3c7..c8ee77ded 100644 --- a/getting-started/README.md +++ b/getting-started/README.md @@ -1,4 +1,4 @@ -## Llama-Recipes Getting Started +## Llama-cookbook Getting Started If you are new to developing with Meta Llama models, this is where you should start. This folder contains introductory-level notebooks across different techniques relating to Meta Llama. diff --git a/getting-started/finetuning/finetune_vision_model.md b/getting-started/finetuning/finetune_vision_model.md index 6d0c1e021..c45a97202 100644 --- a/getting-started/finetuning/finetune_vision_model.md +++ b/getting-started/finetuning/finetune_vision_model.md @@ -1,7 +1,7 @@ ## Llama 3.2 Vision Models Fine-Tuning Recipe This recipe steps you through how to finetune a Llama 3.2 vision model on the OCR VQA task using the [OCRVQA](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron/viewer/ocrvqa?row=0) dataset. -**Disclaimer**: As our vision models already have a very good OCR ability, here we use the OCRVQA dataset only for demonstration purposes of the required steps for fine-tuning our vision models with llama-recipes. +**Disclaimer**: As our vision models already have a very good OCR ability, here we use the OCRVQA dataset only for demonstration purposes of the required steps for fine-tuning our vision models with llama-cookbook. 
### Fine-tuning steps diff --git a/getting-started/finetuning/finetuning.py b/getting-started/finetuning/finetuning.py index 3ddbf30de..1f455f124 100644 --- a/getting-started/finetuning/finetuning.py +++ b/getting-started/finetuning/finetuning.py @@ -2,7 +2,7 @@ # This software may be used and distributed according to the terms of the Llama 2 Community License Agreement. import fire -from llama_recipes.finetuning import main +from llama_cookbook.finetuning import main if __name__ == "__main__": fire.Fire(main) \ No newline at end of file diff --git a/getting-started/finetuning/quickstart_peft_finetuning.ipynb b/getting-started/finetuning/quickstart_peft_finetuning.ipynb index 560cc1729..19044bdd3 100644 --- a/getting-started/finetuning/quickstart_peft_finetuning.ipynb +++ b/getting-started/finetuning/quickstart_peft_finetuning.ipynb @@ -31,17 +31,17 @@ "source": [ "### Step 0: Install pre-requirements and convert checkpoint\n", "\n", - "We need to have llama-recipes and its dependencies installed for this notebook. Additionally, we need to log in with the huggingface_cli and make sure that the account is able to to access the Meta Llama weights." + "We need to have llama-cookbook and its dependencies installed for this notebook. Additionally, we need to log in with the huggingface_cli and make sure that the account is able to access the Meta Llama weights." ] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# uncomment if running from Colab T4\n", - "# ! 
pip install llama-cookbook ipywidgets\n", "\n", "# import huggingface_hub\n", "# huggingface_hub.login()" @@ -59,7 +59,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -80,7 +80,7 @@ "source": [ "import torch\n", "from transformers import LlamaForCausalLM, AutoTokenizer\n", - "from llama_recipes.configs import train_config as TRAIN_CONFIG\n", + "from llama_cookbook.configs import train_config as TRAIN_CONFIG\n", "\n", "train_config = TRAIN_CONFIG()\n", "train_config.model_name = \"meta-llama/Meta-Llama-3.1-8B\"\n", @@ -221,8 +221,8 @@ "metadata": {}, "outputs": [], "source": [ - "from llama_recipes.configs.datasets import samsum_dataset\n", - "from llama_recipes.utils.dataset_utils import get_dataloader\n", + "from llama_cookbook.configs.datasets import samsum_dataset\n", + "from llama_cookbook.utils.dataset_utils import get_dataloader\n", "\n", "samsum_dataset.trust_remote_code = True\n", "\n", @@ -248,7 +248,7 @@ "source": [ "from peft import get_peft_model, prepare_model_for_kbit_training, LoraConfig\n", "from dataclasses import asdict\n", - "from llama_recipes.configs import lora_config as LORA_CONFIG\n", + "from llama_cookbook.configs import lora_config as LORA_CONFIG\n", "\n", "lora_config = LORA_CONFIG()\n", "lora_config.r = 8\n", @@ -278,7 +278,7 @@ "outputs": [], "source": [ "import torch.optim as optim\n", - "from llama_recipes.utils.train_utils import train\n", + "from llama_cookbook.utils.train_utils import train\n", "from torch.optim.lr_scheduler import StepLR\n", "\n", "model.train()\n", diff --git a/getting-started/inference/local_inference/inference.py b/getting-started/inference/local_inference/inference.py index 6e73b116a..277b744fa 100644 --- a/getting-started/inference/local_inference/inference.py +++ b/getting-started/inference/local_inference/inference.py @@ -10,9 +10,9 @@ import torch from accelerate.utils import is_xpu_available -from llama_recipes.inference.model_utils 
import load_model, load_peft_model +from llama_cookbook.inference.model_utils import load_model, load_peft_model -from llama_recipes.inference.safety_utils import AgentType, get_safety_checker +from llama_cookbook.inference.safety_utils import AgentType, get_safety_checker from transformers import AutoTokenizer @@ -176,7 +176,7 @@ def inference( ) ], title="Meta Llama3 Playground", - description="https://github.com/meta-llama/llama-recipes", + description="https://github.com/meta-llama/llama-cookbook", ).queue().launch(server_name="0.0.0.0", share=share_gradio) diff --git a/getting-started/inference/mobile_inference/android_inference/README.md b/getting-started/inference/mobile_inference/android_inference/README.md index 50ec467dc..123b882da 100644 --- a/getting-started/inference/mobile_inference/android_inference/README.md +++ b/getting-started/inference/mobile_inference/android_inference/README.md @@ -103,7 +103,7 @@ Connect your phone to your development machine. On OSX, you'll be prompted on th ## Building the Android Package with MLC -First edit the file under `android/MLCChat/mlc-package-config.json` and with the [mlc-package-config.json](./mlc-package-config.json) in llama-recipes. +First edit the file under `android/MLCChat/mlc-package-config.json` and with the [mlc-package-config.json](./mlc-package-config.json) in llama-cookbook. To understand what these JSON fields mean you can refer to this [documentation](https://llm.mlc.ai/docs/deploy/android.html#step-2-build-runtime-and-model-libraries). 
diff --git a/pyproject.toml b/pyproject.toml index de9c88548..f687ac648 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "hatchling.build" [project] name = "llama-cookbook" -version = "0.0.5" +version = "0.0.5.post1" authors = [ { name="Hamid Shojanazeri", email="hamidnazeri@meta.com" }, { name="Matthias Reso", email="mreso@meta.com" }, diff --git a/src/llama_cookbook/configs/datasets.py b/src/llama_cookbook/configs/datasets.py index 478321dde..ff84b331a 100644 --- a/src/llama_cookbook/configs/datasets.py +++ b/src/llama_cookbook/configs/datasets.py @@ -14,8 +14,8 @@ class samsum_dataset: @dataclass class grammar_dataset: dataset: str = "grammar_dataset" - train_split: str = "src/llama_recipes/datasets/grammar_dataset/gtrain_10k.csv" - test_split: str = "src/llama_recipes/datasets/grammar_dataset/grammar_validation.csv" + train_split: str = "src/llama_cookbook/datasets/grammar_dataset/gtrain_10k.csv" + test_split: str = "src/llama_cookbook/datasets/grammar_dataset/grammar_validation.csv" @dataclass @@ -23,7 +23,7 @@ class alpaca_dataset: dataset: str = "alpaca_dataset" train_split: str = "train" test_split: str = "val" - data_path: str = "src/llama_recipes/datasets/alpaca_data.json" + data_path: str = "src/llama_cookbook/datasets/alpaca_data.json" @dataclass class custom_dataset: @@ -32,7 +32,7 @@ class custom_dataset: train_split: str = "train" test_split: str = "validation" data_path: str = "" - + @dataclass class llamaguard_toxicchat_dataset: dataset: str = "llamaguard_toxicchat_dataset" diff --git a/src/llama_cookbook/configs/wandb.py b/src/llama_cookbook/configs/wandb.py index 6a43ffec2..129760580 100644 --- a/src/llama_cookbook/configs/wandb.py +++ b/src/llama_cookbook/configs/wandb.py @@ -6,10 +6,10 @@ @dataclass class wandb_config: - project: str = 'llama_recipes' # wandb project name + project: str = 'llama_cookbook' # wandb project name entity: Optional[str] = None # wandb entity name job_type: Optional[str] = None tags: 
Optional[List[str]] = None group: Optional[str] = None notes: Optional[str] = None - mode: Optional[str] = None \ No newline at end of file + mode: Optional[str] = None diff --git a/src/llama_cookbook/data/llama_guard/README.md b/src/llama_cookbook/data/llama_guard/README.md index 91983da21..65fbcc416 100644 --- a/src/llama_cookbook/data/llama_guard/README.md +++ b/src/llama_cookbook/data/llama_guard/README.md @@ -10,9 +10,9 @@ The finetuning_data_formatter script provides classes and methods for formatting ## Running the script -1. Clone the llama-recipes repo +1. Clone the llama-cookbook repo 2. Install the dependencies -3. Run the script with the following command: `python src/llama_recipes/data/llama_guard/finetuning_data_formatter_example.py > sample.json` +3. Run the script with the following command: `python src/llama_cookbook/data/llama_guard/finetuning_data_formatter_example.py > sample.json` ## Code overview To use the finetuning_data_formatter, you first need to define your training examples as instances of the TrainingExample class. For example: diff --git a/src/llama_cookbook/finetuning.py b/src/llama_cookbook/finetuning.py index 2c95ea0ea..8673f08a0 100644 --- a/src/llama_cookbook/finetuning.py +++ b/src/llama_cookbook/finetuning.py @@ -74,7 +74,7 @@ def setup_wandb(train_config, fsdp_config, **kwargs): "You are trying to use wandb which is not currently installed. 
" "Please install it using pip install wandb" ) - from llama_recipes.configs import wandb_config as WANDB_CONFIG + from llama_cookbook.configs import wandb_config as WANDB_CONFIG wandb_config = WANDB_CONFIG() update_config(wandb_config, **kwargs) @@ -196,7 +196,7 @@ def main(**kwargs): model.resize_token_embeddings(len(tokenizer)) print_model_size(model, train_config, rank if train_config.enable_fsdp else 0) - + # Convert the model to bfloat16 if fsdp and pure_bf16 is enabled if ( train_config.enable_fsdp @@ -239,12 +239,12 @@ def main(**kwargs): freeze_transformer_layers(model, train_config.num_freeze_layers) # print model size and frozen layers after freezing layers print_frozen_model_status(model, train_config, rank if train_config.enable_fsdp else 0) - + if not train_config.use_peft and train_config.freeze_LLM_only and config.model_type == "mllama": freeze_LLM_only(model) # print model size and frozen layers after freezing layers print_frozen_model_status(model, train_config, rank if train_config.enable_fsdp else 0) - + mixed_precision_policy, wrapping_policy = get_policies(fsdp_config, rank) # Create the FSDP wrapper for MllamaSelfAttentionDecoderLayer,MllamaCrossAttentionDecoderLayer,MllamaVisionEncoderLayer in vision models if is_vision: @@ -264,7 +264,7 @@ def main(**kwargs): device_id = torch.xpu.current_device() elif torch.cuda.is_available(): device_id = torch.cuda.current_device() - + if train_config.freeze_LLM_only: use_orig_params = True else: @@ -312,7 +312,7 @@ def main(**kwargs): dataset_processer = processor else: dataset_processer = tokenizer - + # Load and preprocess the dataset for training and validation dataset_train = get_preprocessed_dataset( diff --git a/src/llama_cookbook/inference/safety_utils.py b/src/llama_cookbook/inference/safety_utils.py index fd1f23a67..b22159afd 100644 --- a/src/llama_cookbook/inference/safety_utils.py +++ b/src/llama_cookbook/inference/safety_utils.py @@ -24,7 +24,7 @@ def __call__(self, output_text, **kwargs): 
except ImportError as e: print("Could not import optional dependency: auditnlg\nPlease install manually with:\n pip install auditnlg\nFollowed by:\npip install -r requirements.txt") raise e - + data = [{"output": output_text}] @@ -37,8 +37,8 @@ def __call__(self, output_text, **kwargs): report += "|" + "|".join(f"{n:^10}" for n in [list(k.keys())[0] for k in scores["class_scores"]]) + "|\n" report += "|" + "|".join(f"{n:^10.5}" for n in [list(k.values())[0] for k in scores["class_scores"]]) + "|\n" return "Sensitive Topics", is_safe, report - - + + class SalesforceSafetyChecker(object): def __init__(self): pass @@ -47,13 +47,13 @@ def __call__(self, output_text, **kwargs): from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, AutoConfig config = AutoConfig.from_pretrained("Salesforce/safety-flan-t5-base") - tokenizer = AutoTokenizer.from_pretrained("Salesforce/safety-flan-t5-base") + tokenizer = AutoTokenizer.from_pretrained("Salesforce/safety-flan-t5-base") model = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/safety-flan-t5-base", config=config) model.eval() prefix = "Is the field safe or unsafe?" input_ids = tokenizer(prefix + " " + output_text + " ", return_tensors="pt").input_ids - + if len(input_ids[0]) > 512: warnings.warn( "Input length is > 512 token. Safety check result could be incorrect." 
@@ -66,9 +66,9 @@ def __call__(self, output_text, **kwargs):
                 return_dict_in_generate=True,
                 max_new_tokens=20,
             )
-
-        is_safe = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True).split(" ")[0] == "safe"
-
+
+        is_safe = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True).split(" ")[0] == "safe"
+
         report = ""
         if not is_safe:
             true_false_ids = tokenizer("true false").input_ids[:2]
@@ -76,11 +76,11 @@ def __call__(self, output_text, **kwargs):
             scores = {}
             for k, i in zip(keys, range(3,20,2)):
                 scores[k] = round(outputs.scores[i][0,true_false_ids].softmax(dim=0)[0].item(), 5)
-
+
             report += "|" + "|".join(f"{n:^10}" for n in scores.keys()) + "|\n"
             report += "|" + "|".join(f"{n:^10}" for n in scores.values()) + "|\n"
         return "Salesforce Content Safety Flan T5 Base", is_safe, report
-
+
 
     def get_total_length(self, data):
         prefix = "Is the field safe or unsafe "
@@ -158,7 +158,7 @@ class LlamaGuardSafetyChecker(object):
 
     def __init__(self):
         from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
-        from llama_recipes.inference.prompt_format_utils import build_default_prompt, create_conversation, LlamaGuardVersion
+        from llama_cookbook.inference.prompt_format_utils import build_default_prompt, create_conversation, LlamaGuardVersion
 
         model_id = "meta-llama/Llama-Guard-3-8B"
 
@@ -168,7 +168,7 @@ def __init__(self):
         self.model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config, device_map="auto")
 
     def __call__(self, output_text, **kwargs):
-
+
         agent_type = kwargs.get('agent_type', AgentType.USER)
         user_prompt = kwargs.get('user_prompt', "")
 
@@ -194,14 +194,14 @@ def __call__(self, output_text, **kwargs):
         prompt_len = input_ids.shape[-1]
         output = self.model.generate(input_ids=input_ids, max_new_tokens=100, pad_token_id=0)
         result = self.tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
-
+
         splitted_result = result.split("\n")[0];
-        is_safe = splitted_result == "safe"
+        is_safe =
splitted_result == "safe"
         report = result
-
+
         return "Llama Guard", is_safe, report
-
+
 
 # Function to load the PeftModel for performance optimization
 # Function to determine which safety checker to use based on the options selected
diff --git a/src/llama_cookbook/tools/README.md b/src/llama_cookbook/tools/README.md
index 95525f32c..66b5cbea2 100644
--- a/src/llama_cookbook/tools/README.md
+++ b/src/llama_cookbook/tools/README.md
@@ -7,7 +7,7 @@ This is the reverse conversion for `convert_llama_weights_to_hf.py` script from
 - Copy file params.json from the official llama download into that directory.
 - Run the conversion script. `model-path` can be a Hugging Face hub model or a local hf model directory.
 ```
-python -m llama_recipes.tools.convert_hf_weights_to_llama --model-path meta-llama/Meta-Llama-3.1-70B-Instruct --output-dir test70B --model-size 70B
+python -m llama_cookbook.tools.convert_hf_weights_to_llama --model-path meta-llama/Meta-Llama-3.1-70B-Instruct --output-dir test70B --model-size 70B
 ```
 
 ## Step 1: Run inference
diff --git a/src/llama_cookbook/utils/config_utils.py b/src/llama_cookbook/utils/config_utils.py
index 88a21d414..eb4510bb7 100644
--- a/src/llama_cookbook/utils/config_utils.py
+++ b/src/llama_cookbook/utils/config_utils.py
@@ -49,10 +49,10 @@ def generate_peft_config(train_config, kwargs):
         raise RuntimeError(f"Peft config not found: {train_config.peft_method}")
 
     if train_config.peft_method == "prefix":
-        raise RuntimeError("PrefixTuning is currently not supported (see https://github.com/meta-llama/llama-recipes/issues/359#issuecomment-2089350811)")
+        raise RuntimeError("PrefixTuning is currently not supported (see https://github.com/meta-llama/llama-cookbook/issues/359#issuecomment-2089350811)")
 
     if train_config.enable_fsdp and train_config.peft_method == "llama_adapter":
-        raise RuntimeError("Llama_adapter is currently not supported in combination with FSDP (see
https://github.com/meta-llama/llama-recipes/issues/359#issuecomment-2089274425)")
+        raise RuntimeError("Llama_adapter is currently not supported in combination with FSDP (see https://github.com/meta-llama/llama-cookbook/issues/359#issuecomment-2089274425)")
 
     config = configs[names.index(train_config.peft_method)]()