Prerequisite
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
{'CUDA available': True,
'GCC': 'gcc (GCC) 7.3.0',
'MMEngine': '0.10.6',
'MUSA available': False,
'OpenCV': '4.11.0',
'PyTorch': '2.1.0',
'PyTorch compiling details': 'PyTorch built with:\n'
' - GCC 10.2\n'
' - C++ Version: 201703\n'
' - Intel(R) MKL-DNN v3.1.1 (Git Hash '
'64f6bcbcbab628e96f33a62c3e975f8535a7bde4)\n'
' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
' - LAPACK is enabled (usually provided by '
'MKL)\n'
' - NNPACK is enabled\n'
' - CPU capability usage: NO AVX\n'
' - Build settings: BLAS_INFO=open, '
'BUILD_TYPE=Release, '
'CXX_COMPILER=/opt/rh/devtoolset-10/root/usr/bin/c++, '
'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 '
'-fabi-version=11 -fvisibility-inlines-hidden '
'-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO '
'-DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER '
'-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK '
'-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE '
'-O2 -fPIC -Wall -Wextra -Werror=return-type '
'-Werror=non-virtual-dtor -Werror=bool-operation '
'-Wnarrowing -Wno-missing-field-initializers '
'-Wno-type-limits -Wno-array-bounds '
'-Wno-unknown-pragmas -Wno-unused-parameter '
'-Wno-unused-function -Wno-unused-result '
'-Wno-strict-overflow -Wno-strict-aliasing '
'-Wno-stringop-overflow -Wno-psabi '
'-Wno-error=pedantic -Wno-error=old-style-cast '
'-Wno-invalid-partial-specialization '
'-Wno-unused-private-field '
'-Wno-aligned-allocation-unavailable '
'-Wno-missing-braces -fdiagnostics-color=always '
'-faligned-new -Wno-unused-but-set-variable '
'-Wno-maybe-uninitialized -fno-math-errno '
'-fno-trapping-math -Werror=format '
'-Werror=cast-function-type '
'-Wno-stringop-overflow, LAPACK_INFO=open, '
'TORCH_DISABLE_GPU_ASSERTS=ON, '
'TORCH_VERSION=2.1.0, USE_CUDA=OFF, '
'USE_CUDNN=OFF, USE_EIGEN_FOR_BLAS=ON, '
'USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, '
'USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=ON, '
'USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, '
'USE_OPENMP=ON, USE_ROCM=OFF, \n',
'Python': '3.10.16 (main, Dec 11 2024, 16:18:56) [GCC 11.2.0]',
'TorchVision': '0.16.0',
'lmdeploy': "not installed:No module named 'lmdeploy'",
'numpy_random_seed': 2147483648,
'opencompass': '0.3.9+',
'sys.platform': 'linux',
'transformers': '4.48.0'}
Reproduces the problem - code/configuration sample
I ran the evaluation with the following command:
python run.py --models hf_llama3_1_8b --datasets base_Custom --work-dir outputs/Llama3_1-8B-DP/ --summarizer base_Custom --max-num-workers 8
The infer tasks run to completion, but the eval tasks then fail with errors.
With the --debug flag added, evaluation completes correctly and the whole pipeline works.
Note that base_Custom is a collection of datasets for evaluating base models.
Reproduces the problem - command or script
python run.py --models hf_llama3_1_8b --datasets base_Custom --work-dir outputs/Llama3_1-8B-DP/ --summarizer base_Custom --max-num-workers 8
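Since inference already finished, one way to narrow this down might be to rerun only the eval stage on the existing outputs with a single worker. The sketch below just assembles such a command; it assumes the OpenCompass CLI options `--mode eval` and `--reuse <timestamp>` behave as I understand them, and `build_eval_rerun_cmd` is a hypothetical helper name, not part of OpenCompass. The timestamp is the run directory that appears in the error log paths.

```python
import shlex


def build_eval_rerun_cmd(timestamp: str, max_num_workers: int = 1) -> list[str]:
    """Assemble a command that reruns only the eval stage of an earlier run.

    Hypothetical helper: the --mode/--reuse options are assumed from the
    OpenCompass CLI; every other argument is copied from the original command.
    """
    return [
        "python", "run.py",
        "--models", "hf_llama3_1_8b",
        "--datasets", "base_Custom",
        "--work-dir", "outputs/Llama3_1-8B-DP/",
        "--summarizer", "base_Custom",
        # Fewer parallel workers, to check whether the crash depends on concurrency.
        "--max-num-workers", str(max_num_workers),
        # Skip inference and reuse the predictions of the given run directory.
        "--mode", "eval",
        "--reuse", timestamp,
    ]


cmd = build_eval_rerun_cmd("20250207_162403")
print(shlex.join(cmd))
```

If eval succeeds with one worker but fails with eight, that would point at the parallel launch rather than at any particular dataset.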
Reproduces the problem - error message
Here are some examples of errors reported:
02/07 22:29:09 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/local.py - _launch - 250 - task OpenICLEval[llama-3_1-8b-hf/sanitized_mbpp] fail, see
outputs/Llama3_1-8B-DP/20250207_162403/logs/eval/llama-3_1-8b-hf/sanitized_mbpp.out
99%|█▋ | 307/309 [27:20<00:02, 1.43s/it]
02/07 22:29:09 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/local.py - _launch - 250 - task OpenICLEval[llama-3_1-8b-hf/race-high] fail, see
outputs/Llama3_1-8B-DP/20250207_162403/logs/eval/llama-3_1-8b-hf/race-high.out
100%|█▎| 308/309 [27:20<00:01, 1.06s/it]
02/07 22:29:13 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/local.py - _launch - 250 - task OpenICLEval[llama-3_1-8b-hf/GPQA_diamond] fail, see
outputs/Llama3_1-8B-DP/20250207_162403/logs/eval/llama-3_1-8b-hf/GPQA_diamond.out
100%|█| 309/309 [27:24<00:00, 5.32s/it]
02/07 22:29:13 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/base.py - summarize - 64 - OpenICLEval[llama-3_1-8b-hf/agieval-gaokao-chinese] failed with code -11
02/07 22:29:13 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/base.py - summarize - 64 - OpenICLEval[llama-3_1-8b-hf/agieval-gaokao-english] failed with code -11
02/07 22:29:13 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/base.py - summarize - 64 - OpenICLEval[llama-3_1-8b-hf/agieval-gaokao-geography] failed with code -11
02/07 22:29:13 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/base.py - summarize - 64 - OpenICLEval[llama-3_1-8b-hf/agieval-gaokao-history] failed with code -11
02/07 22:29:13 - OpenCompass - ERROR - /opencompass-main/opencompass/runners/base.py - summarize - 64 - OpenICLEval[llama-3_1-8b-hf/agieval-gaokao-biology] failed with code -11
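For reference, "code -11" may be worth decoding. Assuming the reported code is the return code of a subprocess launched from Python (which I believe is how the local runner starts tasks, though I have not confirmed it), a negative value -N means the child was terminated by signal N on POSIX. A minimal sketch, with `describe_returncode` as a hypothetical helper name:

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a return code as Python's subprocess module reports it on POSIX."""
    if returncode < 0:
        # subprocess reports "terminated by signal N" as the negative value -N.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


print(describe_returncode(-11))
```

On Linux, signal 11 is SIGSEGV, so under that assumption the eval subprocesses are crashing with a segmentation fault rather than raising a Python exception. Setting PYTHONFAULTHANDLER=1 in the environment makes Python dump a traceback on such a crash, which might then appear in the .out logs listed above.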
Other information
No response