[Bug] Deepseek-R1-Distill-32B evaluation error #1914
Comments
Also, I launched the 32B quantized model via xinference.
The configuration you use requires an LLM as the judge to verify the results.
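For reference, a minimal sketch of wiring a judge model into eval_deepseek.py. The field names mirror the OpenAISDK model config later in this issue; attaching judge_cfg to each dataset's evaluator is an assumption based on how generic_llm_evaluator.py builds its judge in the traceback below, and the endpoint/model name are placeholders for whatever judge you actually serve:

from opencompass.models import OpenAISDK

# Hypothetical judge model config; reuse or replace the endpoint/model name.
judge_cfg = dict(
    abbr='judge-model',
    type=OpenAISDK,
    key='EMPTY',
    openai_api_base='http://localhost:6006/v1',
    path='Deepseek_32_int4',
    query_per_second=16,
    batch_size=8,
    max_out_len=16384,
)

# Attach the judge to every dataset's LLM evaluator; this judge_cfg is what
# the evaluator builds at scoring time (the traceback below shows it got {}).
for d in datasets:
    d['eval_cfg']['evaluator']['judge_cfg'] = judge_cfg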
Okay, I will try again. By the way, how can I speed up the evaluation process? It takes too long for a single eval.
Inferencing: 17%|█▋ | 1/6 [02:51<14:16, 171.22s/it]
Inferencing: 33%|███▎ | 2/6 [32:34<1:14:37, 1119.50s/it]
From the log you provided, it looks to me that each of your questions takes a very long time (i.e. more than 10 minutes). Try requesting your vLLM server independently to see if it also takes that long.
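A quick way to time the server on its own (a sketch using the openai Python client; the base URL and model name are copied from the config in this issue):

import time
from openai import OpenAI

# Endpoint and model name taken from the OpenAISDK config in this issue.
client = OpenAI(base_url='http://localhost:6006/v1', api_key='EMPTY')

start = time.time()
resp = client.chat.completions.create(
    model='Deepseek_32_int4',
    messages=[{'role': 'user', 'content': 'What is 2 + 2?'}],
    max_tokens=256,
)
print(f'{time.time() - start:.1f}s', resp.choices[0].message.content)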
It's fast to request the vLLM server independently.
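On the earlier speed question: assuming the vLLM server handles concurrent requests well, the client-side concurrency knobs already present in the config below are the usual levers (a suggestion, not a confirmed fix):

# Hypothetical tweak to the model config shown below: batch_size is how many
# requests each worker sends in parallel; query_per_second caps the rate.
models[0].update(
    batch_size=32,
    query_per_second=50,
)

Raising --max-num-worker in the launch command spreads datasets across more workers as well.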
Prerequisite
Type
I have modified the code (config is not considered code), or I'm working on my own tasks/models/datasets.
Environment
{'CUDA available': True,
'CUDA_HOME': '/usr/local/cuda',
'GCC': 'gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0',
'GPU 0': 'NVIDIA H20',
'MMEngine': '0.10.6',
'MUSA available': False,
'NVCC': 'Cuda compilation tools, release 11.8, V11.8.89',
'OpenCV': '4.11.0',
'PyTorch': '2.5.1+cu124',
'PyTorch compiling details': 'PyTorch built with:\n'
' - GCC 9.3\n'
' - C++ Version: 201703\n'
' - Intel(R) oneAPI Math Kernel Library Version '
'2024.2-Product Build 20240605 for Intel(R) 64 '
'architecture applications\n'
' - Intel(R) MKL-DNN v3.5.3 (Git Hash '
'66f0cb9eb66affd2da3bf5f8d897376f04aae6af)\n'
' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
' - LAPACK is enabled (usually provided by '
'MKL)\n'
' - NNPACK is enabled\n'
' - CPU capability usage: AVX512\n'
' - CUDA Runtime 12.4\n'
' - NVCC architecture flags: '
'-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n'
' - CuDNN 90.1\n'
' - Magma 2.6.1\n'
' - Build settings: BLAS_INFO=mkl, '
'BUILD_TYPE=Release, CUDA_VERSION=12.4, '
'CUDNN_VERSION=9.1.0, '
'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, '
'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 '
'-fabi-version=11 -fvisibility-inlines-hidden '
'-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO '
'-DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON '
'-DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK '
'-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE '
'-O2 -fPIC -Wall -Wextra -Werror=return-type '
'-Werror=non-virtual-dtor -Werror=bool-operation '
'-Wnarrowing -Wno-missing-field-initializers '
'-Wno-type-limits -Wno-array-bounds '
'-Wno-unknown-pragmas -Wno-unused-parameter '
'-Wno-strict-overflow -Wno-strict-aliasing '
'-Wno-stringop-overflow -Wsuggest-override '
'-Wno-psabi -Wno-error=old-style-cast '
'-Wno-missing-braces -fdiagnostics-color=always '
'-faligned-new -Wno-unused-but-set-variable '
'-Wno-maybe-uninitialized -fno-math-errno '
'-fno-trapping-math -Werror=format '
'-Wno-stringop-overflow, LAPACK_INFO=mkl, '
'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
'TORCH_VERSION=2.5.1, USE_CUDA=ON, USE_CUDNN=ON, '
'USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, '
'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, '
'USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, '
'USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, '
'USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, \n',
'Python': '3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]',
'TorchVision': '0.20.1+cu124',
'lmdeploy': '0.7.1',
'numpy_random_seed': 2147483648,
'opencompass': '0.4.1+',
'sys.platform': 'linux',
'transformers': '4.49.0'}
Reproduces the problem - code/configuration sample
eval_deepseek.py
from mmengine.config import read_base

with read_base():
    # from opencompass.configs.datasets.ceval.ceval_gen import ceval_datasets
    from opencompass.configs.datasets.aime2024.aime2024_llmverify_repeat8_gen_e8fcee import aime2024_datasets  # 8 runs

datasets = aime2024_datasets
vllm_deepseek_r1_distill_qwen_32b_int4.py
from opencompass.models import OpenAISDK

api_meta_template = dict(
    round=[
        dict(role='HUMAN', api_role='HUMAN'),
        dict(role='BOT', api_role='BOT', generate=True),
    ],
    reserved_roles=[dict(role='SYSTEM', api_role='SYSTEM')],
)

models = [
    dict(
        abbr='deepseek-r1-32b-int4',
        type=OpenAISDK,
        key='EMPTY',  # API key
        openai_api_base='http://localhost:6006/v1',  # service address
        path='Deepseek_32_int4',  # model name used when requesting the service
        tokenizer_path='Deepseek_32_int4',  # tokenizer name or path for requests; if None, the default gpt-4 tokenizer is used
        rpm_verbose=True,  # whether to print the request rate
        meta_template=api_meta_template,  # request template
        query_per_second=50,  # request rate limit
        max_out_len=32768,  # maximum output length
        max_seq_len=32768,  # maximum input length
        temperature=0.01,  # generation temperature
        # top_p=0.95,  # top-p sampling
        batch_size=8,  # batch size
        retry=3,  # number of retries
    )
]
Reproduces the problem - command or script
nohup python run.py /root/autodl-tmp/opencompass/examples/eval_deepseek.py --hf-num-gpus 2 --max-num-worker 2 --debug > DeepSeek-R1-Distill-Qwen-32B-bnb-4bit.log 2>&1 &
As the log shows, the evaluation took a very long time: from 2025/03/04 22:05:57 to 2025/03/05 13:34:44.
Reproduces the problem - error message
nohup: ignoring input
03/04 22:05:57 - OpenCompass - INFO - Current exp folder: outputs/default/20250304_220557
03/04 22:05:57 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
03/04 22:05:57 - OpenCompass - INFO - Partitioned into 2 tasks.
03/04 22:05:59 - OpenCompass - INFO - Task [deepseek-r1-32b-int4/aime2024-run0_0,deepseek-r1-32b-int4/aime2024-run1_0,deepseek-r1-32b-int4/aime2024-run2_0,deepseek-r1-32b-int4/aime2024-run3_0,deepseek-r1-32b-int4/aime2024-run4_0,deepseek-r1-32b-int4/aime2024-run5_0,deepseek-r1-32b-int4/aime2024-run6_0,deepseek-r1-32b-int4/aime2024-run7_0]
03/04 22:05:59 - OpenCompass - INFO - Try to load the data from /root/.cache/opencompass/./data/aime.jsonl
03/04 22:05:59 - OpenCompass - INFO - Start inferencing [deepseek-r1-32b-int4/aime2024-run0_0]
03/04 22:05:59 - OpenCompass - WARNING - 'Could not automatically map Deepseek_32_int4 to a tokeniser. Please use tiktoken.get_encoding to explicitly get the tokeniser you expect.', tiktoken encoding cannot load Deepseek_32_int4
03/04 22:06:00 - OpenCompass - WARNING - Can not get tokenizer automatically, will use default tokenizer gpt-4 for length calculation.
[2025-03-04 22:06:04,900] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting build dataloader
[2025-03-04 22:06:04,901] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
....
....
....
0%| | 0/2 [00:00<?, ?it/s]
Inferencing: 0%| | 0/8 [00:00<?, ?it/s]
03/05 12:29:35 - OpenCompass - INFO - Current RPM 1.
03/05 12:29:35 - OpenCompass - INFO - Current RPM 2.
03/05 12:29:35 - OpenCompass - INFO - Current RPM 3.
03/05 12:29:36 - OpenCompass - INFO - Current RPM 4.
03/05 12:29:36 - OpenCompass - INFO - Current RPM 5.
03/05 12:29:36 - OpenCompass - INFO - Current RPM 6.
03/05 12:29:36 - OpenCompass - INFO - Current RPM 7.
03/05 12:29:40 - OpenCompass - INFO - Current RPM 8.
Inferencing: 12%|█▎ | 1/8 [02:37<18:23, 157.68s/it]
Inferencing: 50%|█████ | 4/8 [32:35<34:22, 515.69s/it]
Inferencing: 100%|██████████| 8/8 [32:35<00:00, 244.42s/it]
50%|█████ | 1/2 [32:35<32:35, 1955.40s/it]
Inferencing: 0%| | 0/6 [00:00<?, ?it/s]
03/05 13:02:10 - OpenCompass - INFO - Current RPM 1.
03/05 13:02:10 - OpenCompass - INFO - Current RPM 2.
03/05 13:02:10 - OpenCompass - INFO - Current RPM 3.
03/05 13:02:10 - OpenCompass - INFO - Current RPM 4.
03/05 13:02:10 - OpenCompass - INFO - Current RPM 5.
03/05 13:02:11 - OpenCompass - INFO - Current RPM 6.
Inferencing: 17%|█▋ | 1/6 [02:51<14:16, 171.22s/it]
Inferencing: 33%|███▎ | 2/6 [32:34<1:14:37, 1119.50s/it]
Inferencing: 100%|██████████| 6/6 [32:34<00:00, 325.75s/it]
100%|██████████| 2/2 [1:05:09<00:00, 1954.89s/it]
100%|██████████| 2/2 [1:05:09<00:00, 1954.96s/it]
03/05 13:34:43 - OpenCompass - INFO - Partitioned into 8 tasks.
03/05 13:34:44 - OpenCompass - INFO - Try to load the data from /root/.cache/opencompass/./data/aime.jsonl
03/05 13:34:44 - OpenCompass - INFO - Set self.output_path to outputs/default/20250304_220557/results/deepseek-r1-32b-int4/aime2024-run0.json for current task
Traceback (most recent call last):
File "/root/autodl-tmp/opencompass/run.py", line 4, in
main()
File "/root/autodl-tmp/opencompass/opencompass/cli/main.py", line 349, in main
runner(tasks)
File "/root/autodl-tmp/opencompass/opencompass/runners/base.py", line 38, in call
status = self.launch(tasks)
File "/root/autodl-tmp/opencompass/opencompass/runners/local.py", line 136, in launch
task.run()
File "/root/autodl-tmp/opencompass/opencompass/tasks/openicl_eval.py", line 86, in run
self._score()
File "/root/autodl-tmp/opencompass/opencompass/tasks/openicl_eval.py", line 245, in _score
result = icl_evaluator.evaluate(k, n, copy.deepcopy(test_set),
File "/root/autodl-tmp/opencompass/opencompass/openicl/icl_evaluator/icl_base_evaluator.py", line 99, in evaluate
results = self.score(
File "/root/autodl-tmp/opencompass/opencompass/evaluator/generic_llm_evaluator.py", line 83, in score
self.build_inferencer()
File "/root/autodl-tmp/opencompass/opencompass/evaluator/generic_llm_evaluator.py", line 66, in build_inferencer
model = build_model_from_cfg(model_cfg=self.judge_cfg)
File "/root/autodl-tmp/opencompass/opencompass/utils/build.py", line 24, in build_model_from_cfg
return MODELS.build(model_cfg)
File "/root/autodl-tmp/.conda/envs/opencompass/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/root/autodl-tmp/.conda/envs/opencompass/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 74, in build_from_cfg
raise KeyError(
KeyError: '`cfg` or `default_args` must contain the key "type", but got {}\nNone'
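The {} in the message is the judge config: generic_llm_evaluator.py passed an empty judge_cfg to MODELS.build. A minimal sketch reproducing the same mmengine error (the registry here is illustrative, not OpenCompass's own):

from mmengine.registry import Registry

# An empty config dict has no 'type' key, so the registry cannot tell
# which class to instantiate -- hence the KeyError above.
MODELS = Registry('models')
MODELS.build({})  # KeyError: '`cfg` or `default_args` must contain the key "type", ...'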
Other information
No response