
Convert_checkpoint.py failed with LLAMA 3.1 8B instruct #2105

Closed
1 of 4 tasks
ShuaiShao93 opened this issue Aug 9, 2024 · 6 comments
Labels
not a bug Some known limitation, but not a bug.

Comments


ShuaiShao93 commented Aug 9, 2024

System Info

Debian 11

Who can help?

@byshiue @juney-nvidia

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

$ git clone https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct

$ python3 TensorRT-LLM/examples/llama/convert_checkpoint.py --model_dir ./Meta-Llama-3.1-8B-Instruct --output_dir ./tllm_checkpoint_1gpu_bf16 --dtype bfloat16

Expected behavior

convert_checkpoint.py should work with Llama 3.1.

Actual behavior

[TensorRT-LLM] TensorRT-LLM version: 0.11.0
0.11.0
Traceback (most recent call last):
  File "/home/ss/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 461, in <module>
    main()
  File "/home/ss/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 453, in main
    convert_and_save_hf(args)
  File "/home/ss/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 378, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/home/ss/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 402, in execute
    f(args, rank)
  File "/home/ss/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 367, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/llama/model.py", line 328, in from_hugging_face
    model = LLaMAForCausalLM(config)
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/modeling_utils.py", line 361, in __call__
    obj = type.__call__(cls, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/llama/model.py", line 267, in __init__
    transformer = LLaMAModel(config)
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/llama/model.py", line 211, in __init__
    self.layers = DecoderLayerList(LLaMADecoderLayer, config)
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/modeling_utils.py", line 289, in __init__
    super().__init__([cls(config, idx) for idx in self.layer_list])
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/modeling_utils.py", line 289, in <listcomp>
    super().__init__([cls(config, idx) for idx in self.layer_list])
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/models/llama/model.py", line 51, in __init__
    self.attention = Attention(
  File "/opt/conda/lib/python3.10/site-packages/tensorrt_llm/layers/attention.py", line 347, in __init__
    assert rotary_embedding_scaling["type"] in ["linear", "dynamic"]
KeyError: 'type'

Additional notes

N/A

ShuaiShao93 added the bug (Something isn't working) label on Aug 9, 2024
@KuntaiDu

I am also experiencing this issue when running benchmarks.


daulet commented Aug 14, 2024

Sync up; it's fixed on the main branch, post the v0.11.0 release.
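
Roughly, that means updating both the cloned repo (for the examples) and the installed wheel; a minimal sketch, assuming the wheel comes from NVIDIA's pip index as described in the TensorRT-LLM README:

$ cd TensorRT-LLM && git checkout main && git pull
$ pip3 install --upgrade --pre tensorrt_llm --extra-index-url https://pypi.nvidia.com   # post-0.11 pre-release wheel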

byshiue (Collaborator) commented Aug 19, 2024

Llama 3.1 is not supported on TRT-LLM 0.11. It is only supported on the main branch for now.

The first commit that supports Llama 3.1 is bca9a33b022dc6a924bf7913137feed3d28b602d, which was released on 23 July 2024.
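
Concretely, the assertion that fails on 0.11 expects a "type" key in the RoPE scaling config, while the Llama 3.1 checkpoint's config.json describes scaling with a "rope_type" key (set to "llama3") instead, hence the KeyError: 'type'. A quick way to confirm this on the downloaded checkpoint (a minimal sketch, assuming the model was cloned to ./Meta-Llama-3.1-8B-Instruct as in the reproduction steps):

$ python3 -c "import json; print(json.load(open('./Meta-Llama-3.1-8B-Instruct/config.json'))['rope_scaling'])"

The printed dict contains rope_type, factor, low_freq_factor, high_freq_factor, and original_max_position_embeddings, but no "type" key.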

byshiue closed this as completed on Aug 19, 2024
byshiue added the not a bug (Some known limitation, but not a bug.) label and removed the bug (Something isn't working) label on Aug 19, 2024
@yuhengxnv

Discussed with @byshiue: Llama 3.1 models require transformers >= 4.43.0. A possible workaround is to temporarily run
pip install -U transformers
before running convert_checkpoint.py.
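
In command form, that workaround would look roughly like this (the explicit version pin is an assumption; the requirement above only states >= 4.43.0):

$ pip install -U "transformers>=4.43.0"
$ python3 TensorRT-LLM/examples/llama/convert_checkpoint.py --model_dir ./Meta-Llama-3.1-8B-Instruct --output_dir ./tllm_checkpoint_1gpu_bf16 --dtype bfloat16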

@ivanbaldo

Is this supported on a released version now?

@ivanbaldo

Ouch sorry for the noise, I see that on https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.12.0 it's supported.
