convert.py : Handle null rope scaling value in HF config.json #2793

pnb · 2023-08-25T17:05:41Z

@TheBloke noted that config.json can have a null value for rope scaling, which was not handled in #2772. My bad! This PR fixes that.

Tested with three possibilities: null rope scaling, rope scaling defined with "type": "linear", "factor": 4.0, and no rope scaling defined at all in the JSON object.
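For illustration, the three cases above can be sketched as follows. This is a minimal sketch, not the actual diff from this PR: the helper name `linear_rope_factor` is hypothetical, and it assumes the key in config.json is `rope_scaling` and that only `"type": "linear"` yields a scale factor.

```python
import json
from typing import Optional


def linear_rope_factor(config: dict) -> Optional[float]:
    """Return the linear rope scaling factor, or None if it is not set.

    Hypothetical helper for illustration; not the code in convert.py.
    """
    # JSON null is parsed as None, and a missing key also yields None here.
    rope_scaling = config.get("rope_scaling")
    # Only index into rope_scaling when it is actually a JSON object.
    if isinstance(rope_scaling, dict) and rope_scaling.get("type") == "linear":
        return rope_scaling.get("factor")
    return None


# The three cases described above.
null_case = json.loads('{"rope_scaling": null}')
linear_case = json.loads('{"rope_scaling": {"type": "linear", "factor": 4.0}}')
missing_case = json.loads('{}')

for name, cfg in [("null", null_case), ("linear", linear_case), ("missing", missing_case)]:
    print(name, linear_rope_factor(cfg))
```

The `isinstance` check is what distinguishes the null case: a key that is present with a null value would otherwise raise an error as soon as the code tried to read `"type"` from `None`.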

TheBloke

Works fine, and it would be great if this could be merged soon, as the bug breaks most model conversions I attempt unless I fix it locally.

* master: (773 commits)
  server : add `/detokenize` endpoint (ggerganov#2802)
  convert.py : advanced option (ggerganov#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggerganov#2814)
  flake.nix : add rocm support and cleanup (ggerganov#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggerganov#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)
  llama : use std::abs in llama_sample_tail_free (ggerganov#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggerganov#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggerganov#2807)
  Fix HellaSwag (ggerganov#2805)
  flake : build llama.cpp on Intel with nix (ggerganov#2795)
  Handle null rope scaling value (ggerganov#2793)
  Fix spm whitespaces (ggerganov#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggerganov#2804)
  llama : fix struct decl (ggerganov#2790)
  Faster perplexity computation (ggerganov#2786)
  llama : add llama_beam_search() (ggerganov#2267)
  convert.py : Get rope scale from HuggingFace models (ggerganov#2772)
  llama-bench : add model sizes (ggerganov#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggerganov#2773)
  ...

Handle null rope scaling value

a14a033

TheBloke reviewed Aug 26, 2023

slaren approved these changes Aug 26, 2023

slaren merged commit a2ca4e9 into ggerganov:master Aug 26, 2023

akawrykow pushed a commit to akawrykow/llama.cpp that referenced this pull request Aug 29, 2023

Handle null rope scaling value (ggerganov#2793)

958e2b4

KerfuffleV2 mentioned this pull request Sep 9, 2023

[Oversight] -> Ideal Rope for CodeLLama 2 based models differs vastly from LLama 2. #3090

Closed
