
Convert fixes #3967

Closed · wants to merge 2 commits

Conversation

lorddoskias

Fixes for two issues in the convert.py script:

  1. Treat a norm_eps value of 1e-06 as also indicating LLaMA-2, so such models are no longer misdetected as LLaMA-1 (a sketch of the resulting heuristic follows this list).
  2. Handle vocab_size: -1 in a LLaMA-2 params.json by dropping the bogus value and falling back to the embedding tensor (see the sketch after the commit messages).
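
A minimal sketch of the context-length heuristic with this change applied; this is not the exact convert.py code, and the function name guess_n_ctx is illustrative only. `config` stands for the dict parsed from params.json.

```python
# Sketch of the "hack to determine LLaMA v1 vs v2" with this PR's change applied.
# `config` is the dict parsed from params.json.
def guess_n_ctx(config: dict) -> int:
    if config.get("rope_theta") == 1000000:
        # CodeLlama
        return 16384
    elif config["norm_eps"] in (1e-05, 1e-06):  # 1e-06 now also counts as LLaMA v2
        # LLaMA v2
        return 4096
    else:
        # LLaMA v1
        return 2048
```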

Nikolay Borisov added 2 commits November 6, 2023 13:49
In recent downloads of the LLaMA-2 model files, norm_eps is set to 1e-06. This leads convert.py to erroneously consider the model to be LLaMA-1 and to set the context to 2k tokens.

Fix it by extending the existing hack to also check for the 1e-06 value.
When vocab_size is detected to be -1, simply remove its value from the parsed params.json and fall back to using tok_embeddings.weight.

Fixes ggerganov#3900
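
A rough sketch of the vocab_size fallback described above; the helper name resolve_vocab_size and the shape argument are illustrative, not the script's actual interface.

```python
# Sketch of the vocab_size fallback: FB writes vocab_size = -1 in some LLaMA-2
# params.json files; drop the bogus value and use the embedding matrix's rows.
def resolve_vocab_size(config: dict, tok_embeddings_shape: tuple) -> int:
    if config.get("vocab_size", -1) == -1:
        config.pop("vocab_size", None)   # discard the bogus -1
        return tok_embeddings_shape[0]   # first dim of tok_embeddings.weight
    return config["vocab_size"]
```
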
@@ -250,9 +250,14 @@ def loadOriginalParamsJson(model: LazyModel, config_path: Path) -> Params:
         if config.get("rope_theta") == 1000000:
             # CodeLlama
             n_ctx = 16384
-        elif config["norm_eps"] == 1e-05:
+        elif config["norm_eps"] in (1e-05, 1e-06):
Collaborator

Then how do we detect LLaMA-1? See #2384

Author

Well, the LLaMA-2 files ship with norm_eps equal to 1e-06.

cebtenzzre (Collaborator), Nov 7, 2023

Could you tell me where to get an original LLaMA-2 model with norm_eps=1e-6?

Author

Well, I got it via the LLaMA-2 repository's download script, using the link that was sent to my email. Since the link is tied to my identity and to accepting the EULA, I'd say it's not prudent to share it. I guess if you create a new registration now and try to download llama 7b you'd see it?

Collaborator

I just downloaded LLaMA-2 7B and got this:

{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-05, "vocab_size": -1}

So you'll have to be more specific.

lorddoskias (Author), Nov 7, 2023

OK, so what I have actually downloaded is llama-2-7b-chat:

cat llama-2-7b-chat/params.json
{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-06, "vocab_size": -1}

cat llama-2-7b-chat/checklist.chk
0c4837f3ef97f648452f91faed308a07 consolidated.00.pth
1c39bc3c6b51079fd807cc105b86c9df params.json

Collaborator

I can confirm this.

@ggerganov Our "hack to determine LLaMA v1 vs v2" is apparently insufficient. Maybe we need a CLI arg to disambiguate?

Owner

Ugh, this is quite messy. I guess we do need to add a CLI arg that would be required in cases where the vocab_size is -1. In such cases we abort and ask the user to specify explicitly LLaMA v1 or LLaMA v2.
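
A hypothetical sketch of what such a flag might look like; the --llama-version name and the argparse wiring are illustrative only and are not part of this PR or of convert.py.

```python
# Hypothetical disambiguation flag; the flag name and wiring are illustrative
# only and not part of convert.py.
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument("--llama-version", choices=["1", "2"],
                    help="required when params.json contains vocab_size == -1")
args = parser.parse_args()

vocab_size = -1  # value read from params.json in the ambiguous case
if vocab_size == -1 and args.llama_version is None:
    sys.exit("vocab_size is -1: pass --llama-version 1 or --llama-version 2")

n_ctx = 2048 if args.llama_version == "1" else 4096
```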

# LLaMA v2
n_ctx = 4096
# For some reason FB writes -1 to vocab size for their LLAMA2 models
# simply remove this bogus value and let the return statement belo
Collaborator

spelling: "belo" should be "below"

cebtenzzre (Collaborator)

Partially obsoleted by #4258

ggerganov closed this Jan 13, 2024