1. HF -> nemo

Convert Qwen2.5-7B to `Qwen2.5-7B.nemo`. After `tar xvf`, the contents of `Qwen2.5-7B.nemo` are as follows:

```
root@inp16075348439349544319-3-1:/mnt/tenant-home_speed/nemo_models# tar xvf Qwen2.5-7B.nemo
./
./model_config.yaml
./model_weights/
./model_weights/.metadata
./model_weights/__0_0.distcp
./model_weights/__0_1.distcp
./model_weights/common.pt
./model_weights/metadata.json
```
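(For reference, step 1 was done with NeMo's HF-to-NeMo checkpoint converter. A minimal sketch of the invocation is below; the script path and flag names are assumptions based on the converters shipped under `scripts/checkpoint_converters/` in recent NeMo releases, so adjust them to your install.)

```bash
# Sketch of the HF -> .nemo conversion step. The script location and input
# path are assumptions; check scripts/checkpoint_converters/ in your NeMo tree.
python3 /opt/NeMo/scripts/checkpoint_converters/convert_qwen2_hf_to_nemo.py \
    --input_name_or_path /path/to/Qwen2.5-7B \
    --output_path /mnt/tenant-home_speed/nemo_models/Qwen2.5-7B.nemo
```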
2. Inference using Qwen2.5-7B.nemo

Then I tried to run inference with `Qwen2.5-7B.nemo`:

```
python3 megatron_gpt_eval.py \
    gpt_model_file=Qwen2.5-7B.nemo \
    inference.greedy=True \
    inference.add_BOS=True \
    trainer.devices=1 \
    trainer.num_nodes=1 \
    tensor_model_parallel_size=1 \
    pipeline_model_parallel_size=1 \
    prompts='["who are you?", "What is the captial of China?"]'
```
The response seems wrong. Part of the output:
```
[NeMo I 2024-12-04 21:57:07 nlp_overrides:1386] Model MegatronGPTModel was successfully restored from Qwen2.5-7B.nemo.
prompt=========:['who are you?', 'What is the captial of China?']
setting number of microbatches to constant 1
***************************
{'sentences': ['who are you?1000000000000000000000000000000000', 'What is the captial of China?100000000000000000000000000000'], 'tokens': [['<|im_start|>', 'who', 'Ġare', 'Ġyou', '?', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'], ['<|im_start|>', 'What', 'Ġis', 'Ġthe', 'Ġcapt', 'ial', 'Ġof', 'ĠChina', '?', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']], 'logprob': None, 'full_logprob': None, 'token_ids': [[151644, 14623, 525, 498, 30, 16, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15], [151644, 3838, 374, 279, 6427, 530, 315, 5616, 30, 16, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]], 'offsets': [[0, 0, 3, 7, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45], [0, 0, 4, 7, 11, 16, 19, 22, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58]]}
***************************
[NeMo I 2024-12-04 21:57:14 megatron_gpt_model:1717] Pipeline model parallel rank: 0, Tensor model parallel rank: 0, Number of model parameters on device: 7.62e+09. Number of precise model parameters on device: 7615616512.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
Predicting DataLoader 0:   0%|          | 0/1 [00:00<?, ?it/s]setting number of microbatches to constant 1
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 0.51it/s]
***************************
[{'sentences': ['who are you?1000000000000000000000000000000000', 'What is the captial of China?100000000000000000000000000000'], 'tokens': [['<|im_start|>', 'who', 'Ġare', 'Ġyou', '?', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'], ['<|im_start|>', 'What', 'Ġis', 'Ġthe', 'Ġcapt', 'ial', 'Ġof', 'ĠChina', '?', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']], 'logprob': None, 'full_logprob': None, 'token_ids': [[151644, 14623, 525, 498, 30, 16, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15], [151644, 3838, 374, 279, 6427, 530, 315, 5616, 30, 16, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]], 'offsets': [[0, 0, 3, 7, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45], [0, 0, 4, 7, 11, 16, 19, 22, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58]]}]
***************************
```
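One detail the token dump makes visible: with `inference.add_BOS=True`, every generation starts from `<|im_start|>` (token id 151644), a chat-template token that the base Qwen2.5 model was presumably never trained to see at position 0. A quick way to check this locally is sketched below; the HF model name is an assumption, so point it at whatever checkpoint you converted.

```python
# Sketch: check whether the Qwen2.5 tokenizer actually defines a BOS token.
# The log above shows '<|im_start|>' (id 151644) prepended to both prompts,
# which is what inference.add_BOS=True injects here.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")  # assumed model name
print(tok.bos_token)                        # Qwen2.5 base: expected None
print(tok.convert_ids_to_tokens([151644]))  # ['<|im_start|>'], matching the log
```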
same problem with llama3.1
same problem here.
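For anyone hitting this across models, a quick way to tell whether the conversion or the eval flags are at fault is to run the same prompts through the original HF checkpoint with greedy decoding and compare. A minimal sketch follows; the model name, dtype, and `max_new_tokens` are illustrative, not taken from the issue.

```python
# Sketch: HF baseline with greedy decoding (do_sample=False), mirroring
# inference.greedy=True above. If this output is sane, suspect the .nemo
# conversion or the eval flags (e.g. add_BOS) rather than the weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2.5-7B"  # assumed; use the checkpoint you converted
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)
for prompt in ["who are you?", "What is the captial of China?"]:
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))
```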