[Bug] chinese simpleQA dataset is not working #1858

Open
2 tasks done
hailsham opened this issue Feb 7, 2025 · 0 comments

hailsham commented Feb 7, 2025

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': True,
 'CUDA_HOME': None,
 'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0',
 'GPU 0': 'NVIDIA GeForce GTX 1660',
 'MMEngine': '0.10.6',
 'MUSA available': False,
 'OpenCV': '4.11.0',
 'PyTorch': '2.5.1',
 'PyTorch compiling details': 'PyTorch built with:\n'
                              '  - GCC 9.3\n'
                              '  - C++ Version: 201703\n'
                              '  - Intel(R) oneAPI Math Kernel Library Version '
                              '2023.1-Product Build 20230303 for Intel(R) 64 '
                              'architecture applications\n'
                              '  - Intel(R) MKL-DNN v3.5.3 (Git Hash '
                              '66f0cb9eb66affd2da3bf5f8d897376f04aae6af)\n'
                              '  - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
                              '  - LAPACK is enabled (usually provided by '
                              'MKL)\n'
                              '  - NNPACK is enabled\n'
                              '  - CPU capability usage: AVX2\n'
                              '  - CUDA Runtime 12.4\n'
                              '  - NVCC architecture flags: '
                              '-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n'
                              '  - CuDNN 90.1\n'
                              '  - Magma 2.6.1\n'
                              '  - Build settings: BLAS_INFO=mkl, '
                              'BUILD_TYPE=Release, CUDA_VERSION=12.4, '
                              'CUDNN_VERSION=9.1.0, '
                              'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, '
                              'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 '
                              '-fabi-version=11 -fvisibility-inlines-hidden '
                              '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO '
                              '-DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON '
                              '-DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK '
                              '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE '
                              '-O2 -fPIC -Wall -Wextra -Werror=return-type '
                              '-Werror=non-virtual-dtor -Werror=bool-operation '
                              '-Wnarrowing -Wno-missing-field-initializers '
                              '-Wno-type-limits -Wno-array-bounds '
                              '-Wno-unknown-pragmas -Wno-unused-parameter '
                              '-Wno-strict-overflow -Wno-strict-aliasing '
                              '-Wno-stringop-overflow -Wsuggest-override '
                              '-Wno-psabi -Wno-error=old-style-cast '
                              '-Wno-missing-braces -fdiagnostics-color=always '
                              '-faligned-new -Wno-unused-but-set-variable '
                              '-Wno-maybe-uninitialized -fno-math-errno '
                              '-fno-trapping-math -Werror=format '
                              '-Wno-stringop-overflow, LAPACK_INFO=mkl, '
                              'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
                              'TORCH_VERSION=2.5.1, USE_CUDA=ON, USE_CUDNN=ON, '
                              'USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, '
                              'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, '
                              'USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, '
                              'USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, '
                              'USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, \n',
 'Python': '3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]',
 'TorchVision': '0.20.1',
 'lmdeploy': "not installed:No module named 'lmdeploy'",
 'numpy_random_seed': 2147483648,
 'opencompass': '0.4.0+',
 'sys.platform': 'linux',
 'transformers': '4.48.1'}

Reproduces the problem - code/configuration sample

The Chinese SimpleQA dataset (chinese_simpleqa) cannot be loaded, for two reasons:

  • The MD5 checksum does not match: datasets_info.py lists 4bdf854b291fc0ee29da57dc47ac47b5 for chinese_simpleqa, but the downloaded zip has MD5 560ef92adbcd0cf698ac321aa87f73b2.
  • The downloaded zip nests the data file under extra directories, so it is not extracted to the expected location:
    fs-computility/llm/xiaolinchen/opencompass_fork/data/data/chinese_simpleqa/chinese_simpleqa.jsonl
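
Both observations can be checked locally with the standard library alone. The following is a minimal sketch, not OpenCompass code: the helper names `file_md5`, `zip_entries`, and `check_archive` are hypothetical, and the path to the downloaded zip is whatever location the download landed in on your machine.

```python
import hashlib
import zipfile

# MD5 listed for chinese_simpleqa in datasets_info.py, as quoted in this report.
EXPECTED_MD5 = "4bdf854b291fc0ee29da57dc47ac47b5"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def zip_entries(path):
    """Return the member paths stored inside a zip archive."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()


def check_archive(path, expected_md5=EXPECTED_MD5):
    """Hypothetical helper: compare the checksum and list the archive layout."""
    actual = file_md5(path)
    return {
        "md5": actual,
        "md5_matches": actual == expected_md5,
        "entries": zip_entries(path),
    }
```

Calling `check_archive` on the downloaded zip (the path is an assumption about your local setup) would show whether the checksum differs from the expected value and whether the `.jsonl` file sits under extra leading directories.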

Reproduces the problem - command or script

https://github.com/open-compass/opencompass/blob/main/opencompass/configs/datasets/chinese_simpleqa/chinese_simpleqa_gen.py

Reproduces the problem - error message

File not found or corrupted

Other information

No response
