dpctl does not detect gpu devices after latest dppy/label/dev push ? #1006

Closed
fcharras opened this issue Dec 5, 2022 · 14 comments

Comments

@fcharras
Contributor

fcharras commented Dec 5, 2022

I think something broke regarding gpu detection after the push from 2022-12-03.

Here are the installation instructions I'm using (tested in both fresh Ubuntu 20.04 and 22.04 docker containers):

apt-get update --quiet
apt-get install -y wget
apt-get install -y intel-opencl-icd
# or
# apt-get install -y python3
# wget https://raw.githubusercontent.com/intel/llvm/sycl/devops/scripts/get_release.py
# wget https://raw.githubusercontent.com/intel/llvm/sycl/devops/scripts/install_drivers.sh
# compute_runtime_tag=latest igc_tag=latest cm_tag=latest tbb_tag=latest fpgaemu_tag=latest cpu_tag=latest bash install_drivers.sh --all
cd
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash Mambaforge-Linux-x86_64.sh
source .bashrc
mamba create -n my-dpex-env numba-dpex dpcpp_linux-64 -c conda-forge -c dppy/label/dev -c intel
mamba env config vars set OCL_ICD_FILENAMES_RESET=1 OCL_ICD_FILENAMES=libintelocl.so -n my-dpex-env
mamba activate my-dpex-env
python -c "import dpctl; print(dpctl.get_devices())"

I'm almost certain this last python command printed both cpu and gpu devices when tested on 2022-12-02, but since today I have tried several variations of those commands and can't get any gpu device to be detected. On the other hand, it seems that setting OCL_ICD_FILENAMES_RESET=1 OCL_ICD_FILENAMES=libintelocl.so is no longer required for the cpu to be detected.

I think @oleksandr-pavlyk was trying to bundle the drivers in the conda installation tree; maybe this is related? My guess is that dpctl is now correctly using libintelocl.so in the conda prefix, but has stopped using /etc/OpenCL/vendors/intel.icd (or whatever else it was using before) to get to libigdrcl.so?

@oleksandr-pavlyk
Collaborator

@fcharras Could you please specify the output of python -c "import dpctl; dpctl.lsplatform(verbosity=2)"?

The changes to the bundling of drivers in the conda installation tree have not been published yet. Something else is going on here.

@fcharras
Contributor Author

fcharras commented Dec 5, 2022

Platform  0 ::
    Name        Intel(R) OpenCL
    Version     OpenCL 3.0 LINUX
    Vendor      Intel(R) Corporation
    Backend     opencl
    Num Devices 1
      # 0
        Name                11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        Version             2022.14.7.0.30_160000
        Filter string       opencl:cpu:0
Platform  1 ::
    Name        SYCL host platform
    Version     1.2
    Vendor      unknown
    Backend     unknown
    Num Devices 1
      # 0
        Name                SYCL host device
        Version             1.2
        Filter string       host:host:0
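
A listing like the one above can also be checked mechanically. The following is a minimal sketch, assuming the text layout printed by dpctl.lsplatform(verbosity=2) as shown in this thread; the helper name device_types is hypothetical and not part of dpctl.

```python
def device_types(lsplatform_output):
    """Return (backend, device_type) pairs found in 'Filter string' lines.

    Assumes the verbosity=2 text layout quoted in this thread, where each
    device has a line such as 'Filter string       opencl:cpu:0'.
    """
    found = []
    for line in lsplatform_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Filter string"):
            filter_string = stripped.split()[-1]  # e.g. "opencl:cpu:0"
            backend, device_type, _index = filter_string.split(":")
            found.append((backend, device_type))
    return found


def has_gpu(lsplatform_output):
    """True if any listed device has device type 'gpu'."""
    return any(kind == "gpu" for _backend, kind in device_types(lsplatform_output))
```

Applied to the output above, this would report an opencl cpu device and the host device, and no gpu.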

@oleksandr-pavlyk
Collaborator

@fcharras It may be that the DPC++ RT is encountering difficulties with the drivers. This can be diagnosed by setting SYCL_PI_TRACE=1, e.g. SYCL_PI_TRACE=1 python -c "import dpctl; dpctl.lsplatform()".

At the top of the output I am seeing

SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_level_zero.so

Also, it would be helpful to ensure that no inconsistency exists in your environment. Can you please include the output of conda list --explicit, or at least the portions related to DPC++ runtimes and to dpctl?

@fcharras
Contributor Author

fcharras commented Dec 5, 2022

(my-dpex-env) root@5d0bb7588cb5:~# SYCL_PI_TRACE=1 python -c "import dpctl; dpctl.lsplatform()"
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so
Intel(R) OpenCL OpenCL 3.0 LINUX
SYCL host platform 1.2
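
Comparing the two traces by eye shows that libpi_level_zero.so is not reported here. As a minimal sketch, assuming the "Plugin found and successfully loaded" line format quoted in this thread, the comparison could be scripted as follows; the helper names are hypothetical, and the default list of expected plugins is just the two names that appear in the trace quoted earlier.

```python
import re

# Matches lines such as:
#   SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so
_PLUGIN_LINE = re.compile(r"Plugin found and successfully loaded:\s*(\S+)")


def loaded_plugins(trace_output):
    """Return the plugin library names reported as loaded, in order."""
    return _PLUGIN_LINE.findall(trace_output)


def missing_plugins(trace_output,
                    expected=("libpi_opencl.so", "libpi_level_zero.so")):
    """Return the expected plugin names that the trace does not report."""
    loaded = set(loaded_plugins(trace_output))
    return [name for name in expected if name not in loaded]
```

The trace text could be captured, for example, by redirecting the output of the SYCL_PI_TRACE=1 command to a file and reading it back.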

@oleksandr-pavlyk
Collaborator

@fcharras Further diagnostics depend on whether you use conda-forge's OpenCL loader or the one obtained from Intel.

The difference is that conda-forge's loader will only use $PREFIX/etc/OpenCL/vendors, while Intel's will use /etc/OpenCL/vendors. Since GPU drivers are not presently available as packages in a conda environment, please make sure you use intel-opencl-rt from the intel channel, not from the conda-forge channel.
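
To see which ICD files each loader would find, the two directories can be listed side by side. This is a minimal sketch, assuming that $PREFIX corresponds to the CONDA_PREFIX variable of the activated environment and that each .icd file holds the name or path of a vendor library; the helper names are hypothetical.

```python
import os
from pathlib import Path


def icd_entries(vendors_dir):
    """Map each .icd file name in vendors_dir to the library it names.

    Returns an empty dict if the directory does not exist.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    entries = {}
    for icd_file in sorted(directory.glob("*.icd")):
        entries[icd_file.name] = icd_file.read_text().strip()
    return entries


def candidate_vendor_dirs(environ=None):
    """Return the system vendors dir plus the conda one, if a prefix is set.

    Assumption: the conda prefix is taken from CONDA_PREFIX.
    """
    environ = os.environ if environ is None else environ
    dirs = ["/etc/OpenCL/vendors"]
    prefix = environ.get("CONDA_PREFIX")
    if prefix:
        dirs.append(os.path.join(prefix, "etc", "OpenCL", "vendors"))
    return dirs

# Example use inside the activated environment:
#   for d in candidate_vendor_dirs():
#       print(d, icd_entries(d))
```

If the GPU driver's .icd file only appears under /etc/OpenCL/vendors, a loader restricted to the conda prefix would not be expected to see it.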

@fcharras
Contributor Author

fcharras commented Dec 6, 2022

Using mamba create -n my-dpex-env numba-dpex dpcpp_linux-64 -c dppy/label/dev -c intel rather than mamba create -n my-dpex-env numba-dpex dpcpp_linux-64 -c conda-forge -c dppy/label/dev -c intel (i.e. installing everything from the intel channel except for the dppy/label/dev packages) gives the same outcome.

@fcharras
Contributor Author

fcharras commented Dec 6, 2022

Output of conda list --explicit in both cases:

(my-dpex-env) root@0c86bca4508f:~# conda list --explicit
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/intel/linux-64/_libgcc_mutex-0.1-main.tar.bz2
https://conda.anaconda.org/intel/linux-64/ca-certificates-2022.07.19-h06a4308_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/intel-cmplr-lib-rt-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/intel-cmplr-lic-rt-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/intel-openmp-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/intelpython-2022.2.0-0.tar.bz2
https://conda.anaconda.org/intel/linux-64/libstdcxx-ng-11.2.0-h1234567_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/tbb-2021.7.1-intel_15005.tar.bz2
https://conda.anaconda.org/intel/linux-64/intel-opencl-rt-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/libffi-3.3-14.tar.bz2
https://conda.anaconda.org/intel/linux-64/libgomp-11.2.0-h1234567_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl-2022.2.1-intel_16993.tar.bz2
https://conda.anaconda.org/intel/linux-64/_openmp_mutex-4.5-1_gnu.tar.bz2
https://conda.anaconda.org/intel/linux-64/libgcc-ng-11.2.0-h1234567_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/bzip2-1.0.8-hb9a14ef_9.tar.bz2
https://conda.anaconda.org/intel/linux-64/ncurses-6.3-h5eee18b_3.tar.bz2
https://conda.anaconda.org/intel/linux-64/openssl-1.1.1q-h7f8727e_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/spirv-tools-2020.5-h6bb024c_2.tar.bz2
https://conda.anaconda.org/intel/linux-64/xz-5.2.5-h74280d8_2.tar.bz2
https://conda.anaconda.org/intel/linux-64/zlib-1.2.12-h5eee18b_3.tar.bz2
https://conda.anaconda.org/intel/linux-64/libllvm11-11.0.0-h3826bc1_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/readline-8.1.2-h7f8727e_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/tk-8.6.12-h1ccaba5_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/llvm-spirv-11.0.0-h4616538_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/sqlite-3.39.2-h5082296_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/python-3.9.13-hb9903ff_6.tar.bz2
https://conda.anaconda.org/intel/linux-64/certifi-2022.6.15-py39h06a4308_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/llvmlite-0.38.1-py39h0ddac3c_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/pyparsing-3.0.9-py39h06a4308_0.tar.bz2
https://conda.anaconda.org/intel/noarch/six-1.16.0-pyhd3eb1b0_1.tar.bz2
https://conda.anaconda.org/intel/linux-64/tbb4py-2021.7.1-py39_intel_15005.tar.bz2
https://conda.anaconda.org/intel/noarch/wheel-0.37.1-pyhd3eb1b0_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl-service-2.4.0-py39h7634626_12.tar.bz2
https://conda.anaconda.org/intel/noarch/packaging-21.3-pyhd3eb1b0_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/setuptools-58.0.4-py39h06a4308_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/pip-22.1.2-py39h06a4308_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/dpcpp-cpp-rt-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/dpcpp_cpp_rt-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/dpcpp_impl_linux-64-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/icc_rt-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl-dpcpp-2022.2.1-intel_16993.tar.bz2
https://conda.anaconda.org/intel/linux-64/dpcpp_linux-64-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/intel/linux-64/numpy-base-1.21.4-py39h97bc315_16.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl_fft-1.3.1-py39h1909d4f_16.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl_random-1.2.2-py39h94ca54a_16.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl_umath-0.1.1-py39h0348192_26.tar.bz2
https://conda.anaconda.org/intel/linux-64/numpy-1.21.4-py39h8dc10e9_16.tar.bz2
https://conda.anaconda.org/dppy/label/dev/linux-64/dpctl-0.14.0-py39h8c27c75_22.tar.bz2
https://conda.anaconda.org/intel/linux-64/numba-0.55.1-py39h0040107_1.tar.bz2
https://conda.anaconda.org/dppy/label/dev/linux-64/dpnp-0.11.0-py39h2bc3f7f_5.tar.bz2
https://conda.anaconda.org/dppy/label/dev/linux-64/numba-dpex-0.18.1-py39hfc4b9b4_50.tar.bz2
(my-dpex-env) root@0c86bca4508f:~# conda list --explicit
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/_sysroot_linux-64_curr_repodata_hack-3-h5bd9786_13.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2022.9.24-ha878542_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/intel-cmplr-lic-rt-2022.2.0-ha770c72_8734.tar.bz2
https://conda.anaconda.org/intel/linux-64/intel-openmp-2022.2.1-intel_16953.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.39-hcc3a1bd_1.conda
https://conda.anaconda.org/conda-forge/linux-64/libgfortran5-12.2.0-h337968e_19.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-ng-12.2.0-h46fd767_19.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/python_abi-3.9-3_cp39.conda
https://conda.anaconda.org/conda-forge/noarch/tzdata-2022g-h191b570_0.conda
https://conda.anaconda.org/conda-forge/noarch/kernel-headers_linux-64-3.10.0-h4a8ded7_13.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libgfortran-ng-12.2.0-h69a702a_19.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/sysroot_linux-64-2.17-h4a8ded7_13.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_kmp_llvm.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-12.2.0-h65d4601_19.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-h7f98852_4.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/icu-70.1-h27087fc_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libffi-3.4.2-h7f98852_5.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libiconv-1.17-h166bdaf_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libnsl-2.0.0-h7f98852_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libopenblas-0.3.21-pthreads_h78a6416_3.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.32.1-h7f98852_1000.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.2.13-h166bdaf_4.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.3-h27087fc_1.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/ocl-icd-2.3.1-h7f98852_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/openssl-3.0.7-h0b41bf4_1.conda
https://conda.anaconda.org/intel/linux-64/spirv-tools-2020.5-h6bb024c_2.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/tbb-2021.7.0-h924138e_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/xz-5.2.6-h166bdaf_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/intel-cmplr-lib-rt-2022.2.0-h6239696_8734.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libblas-3.9.0-16_linux64_openblas.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.40.0-h753d276_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libxml2-2.10.3-h7463322_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/llvm-openmp-15.0.6-he0ac6c6_0.conda
https://conda.anaconda.org/intel/linux-64/mkl-2022.2.1-intel_16993.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/readline-8.1.2-h0f457ee_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.12-h27826a3_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/zlib-1.2.13-h166bdaf_4.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/intel-opencl-rt-2022.2.0-hce74451_8734.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libcblas-3.9.0-16_linux64_openblas.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/liblapack-3.9.0-16_linux64_openblas.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/libllvm11-11.0.1-hf817b99_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/python-3.9.15-hba424b6_0_cpython.conda
https://conda.anaconda.org/conda-forge/linux-64/dpcpp-cpp-rt-2022.2.0-h27087fc_8734.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/llvm-spirv-11.0.0-h0efe328_0.tar.bz2
https://conda.anaconda.org/intel/linux-64/llvmlite-0.38.1-py39h0ddac3c_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/numpy-1.22.4-py39hc58783e_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/pyparsing-3.0.9-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/setuptools-65.5.1-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/wheel-0.38.4-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/dpcpp_impl_linux-64-2022.2.0-hb01953d_8734.tar.bz2
https://conda.anaconda.org/dppy/label/dev/linux-64/dpctl-0.14.0-py39h8c27c75_22.tar.bz2
https://conda.anaconda.org/intel/linux-64/mkl-dpcpp-2022.2.1-intel_16993.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/numba-0.55.2-py39h66db6d7_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/packaging-21.3-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/pip-22.3.1-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/dpcpp_linux-64-2022.2.0-h27087fc_8734.tar.bz2
https://conda.anaconda.org/dppy/label/dev/linux-64/dpnp-0.11.0-py39h2bc3f7f_5.tar.bz2
https://conda.anaconda.org/dppy/label/dev/linux-64/numba-dpex-0.18.1-py39hfc4b9b4_50.tar.bz2
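
With listings this long it is easy to miss which channel a given runtime package came from. The following is a minimal sketch that groups the URLs of a conda list --explicit listing by channel, assuming the URL layout shown above (channel path, then platform directory, then file name); the helper name is hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def packages_by_channel(explicit_listing):
    """Group package file names by channel.

    Assumes URLs of the form
    https://conda.anaconda.org/<channel...>/<platform>/<file>, so the
    channel is every path segment before the platform directory.
    """
    channels = defaultdict(list)
    for line in explicit_listing.splitlines():
        line = line.strip()
        if not line.startswith("https://"):
            continue  # skip comments, '@EXPLICIT' and shell prompts
        parts = urlparse(line).path.strip("/").split("/")
        if len(parts) < 3:
            continue  # not a channel/platform/file URL
        channel = "/".join(parts[:-2])
        channels[channel].append(parts[-1])
    return dict(channels)
```

For the second listing above, this would place intel-opencl-rt and ocl-icd under conda-forge, and dpctl under dppy/label/dev.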

@fcharras
Contributor Author

fcharras commented Dec 6, 2022

I'm making progress on the issue. I have it in a working state again, with a more complicated procedure for installing gpu drivers and using

mamba create -n my-dpex-env numba-dpex "intel::dpcpp_linux-64" -c dppy/label/dev -c conda-forge -c intel

I'll update the guide when I understand better which driver package is necessary.

@fcharras
Contributor Author

fcharras commented Dec 6, 2022

For the GPU, the driver issue comes down to which version of intel-opencl-icd is installed.

If using the Ubuntu 22.04 repositories, we get intel-opencl-icd (22.14.22890-1) and the gpu is not detected.

Using the latest release:

apt-get remove  intel-opencl-icd
wget https://github.com/intel/compute-runtime/releases/download/22.43.24595.30/intel-opencl-icd_22.43.24595.30_amd64.deb 
dpkg -i intel-opencl-icd_22.43.24595.30_amd64.deb 

it works.

@fcharras
Contributor Author

fcharras commented Dec 6, 2022

Digging deeper, it also works when installing the same version as in the jammy repositories, but from the GitHub repository.

Upon inspection, the two packages contain the same files, but the hashes are different:

  • from github:
b0a947fc745fabdc6fce6e2fc7afaddc  usr/bin/ocloc
f21f6c827d91b71d554237a201e4d944  usr/include/ocloc_api.h
9da139f061468f77082753f7ab2e9aed  usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
93355007cf89208b64541f0ab93bd98b  usr/lib/x86_64-linux-gnu/libocloc.so
c6750f20b956645e48df4c5fa76700a8  usr/share/doc/intel-opencl-icd/changelog.gz
c80a9da2bc90e48c74a0ccf6e7112456  usr/share/doc/intel-opencl-icd/copyright
  • from jammy repositories:
608c6f47383c60c768e96e96769e73a5  usr/bin/ocloc
f21f6c827d91b71d554237a201e4d944  usr/include/ocloc_api.h
853931b595b11e4cb3fdac3d2c0e1071  usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
eca15d887439a9a061071da366746c85  usr/lib/x86_64-linux-gnu/libocloc.so
a384878d401c06b1bbc82943da737160  usr/share/doc/intel-opencl-icd/changelog.Debian.gz
1318fcd717c589e69821ce57e74d44c7  usr/share/doc/intel-opencl-icd/copyright

So apparently those two are not built the same way, and that is causing compatibility issues? @oleksandr-pavlyk do you think this could be an issue in dpctl/sycl or in the Ubuntu distribution? If Ubuntu, I'll forward the bug report.
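
The comparison of the two hash listings can be reproduced with a few lines. This is a minimal sketch, assuming each listing is text in the "digest, whitespace, path" form shown above; the helper names are hypothetical.

```python
def parse_md5_listing(text):
    """Map each path to its digest, from lines of '<digest>  <path>'."""
    hashes = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, path = parts
            hashes[path.strip()] = digest
    return hashes


def differing_files(listing_a, listing_b):
    """Return paths present in both listings whose digests differ."""
    a = parse_md5_listing(listing_a)
    b = parse_md5_listing(listing_b)
    return sorted(path for path in a.keys() & b.keys() if a[path] != b[path])
```

On the listings above, only usr/include/ocloc_api.h has the same digest in both packages; the binaries and shared libraries differ. The changelog files have different names in the two packages, so this comparison does not pair them.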

@oleksandr-pavlyk
Collaborator

Based on your last comment, this has to do with the Intel Compute Runtime component (intel-opencl-icd), which implements the OpenCL driver for Intel(R) UHD Graphics. Also, since changing the libraries without changing the DPC++ installation and the dpctl package fixed the issue, the involvement of the DPC++ runtime and dpctl libraries is unlikely.

There must be an issue with those particular libraries. I'd start with reporting to Ubuntu.

@fcharras
Contributor Author

fcharras commented Dec 6, 2022

Maybe this issue could be closed and a new one opened for the actual issue. I think it should also be tracked as a dpctl issue: many users trying to install this stack are going to run into the same troubles, given that the apt-get based install is actually listed as recommended at https://github.com/intel/compute-runtime#via-system-package-manager.

I now believe my initial report of a behavior change with the releases from 2022-12-03 was wrong, and that I got confused among the possible combinations of instructions (regarding driver installation and channel parameters).

@oleksandr-pavlyk
Collaborator

Thank you for the report and the investigations @fcharras. I am closing this one. Please open a separate issue instead.
