-
Notifications
You must be signed in to change notification settings - Fork 0
License
sfilippone/psblas3-ext
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This is the extenstions plugin for PSBLAS, version 1.3.1. Version 1.3.1 supports versions of CUDA up to 10, or from 11.3. In version 11, support for HYB has been dropped. This package contains: 1. Extended matrix formats: ELLPACK, Hacked ELLPACK, DIAgonals, Hacked DIAgonals. Note: DIA and HDIA have limited support. 2. A GPU plugin: gpu-enabled versions of the above, with the CUDA code from https://github.com/davidebarbieri/spgpu, plus interfaces to CSR and HYB formats available in the NVIDIA CuSPARSE lib. Note: DIAG and HDIAG have limited support. 3. RSB: an interface to http://sourceforge.net/projects/librsb The architectural ideas of the GPU plugin are explained in V. Cardellini, S. Filippone and D. Rouson Design Patterns for Scientific Computations on Sparse Matrices Scientific Computing, 22(2014), pp. 1--19. The architecture of the Fortran 2003 sparse BLAS is described in S. Filippone, A. Buttari: Object-Oriented Techniques for Sparse Matrix Computations in Fortran 2003, ACM Trans. on Math. Software, vol. 38, No. 4, 2012. PREREQUISITES To build this code you need to have PSBLAS 3.3.0 or later, together with its prerequisites. To make use of the NVIDIA GPU you'll need: 1. An installation of the CUDA toolkit (version 4.1 or later); 2. The SPGPU code from http://spgpu.googlecode.com INSTALLING ./configure --prefix=/path/to/install \ --with-psblas=/path/to/PSBLAS/install \ --with-cuda=/CUDA/install \ --with-spgpu=/SPGPU/install \ --with-librsb=/LIBRSB/install make; make install Note: we have only tested with GNU Fortran compiler. Note: CUDA nvcc typically lags behind the latest versions of GCC/GNU Fortran; currently nvcc supports GCC 4.8 so this is the preferred choice. Mixing SPGPU CUDA code compiled with an older version and the rest with e.g. 4.9 has worked fine so far: YMMV. TODO Improve MPI support. WHAT IS NOT HERE Good preconditioners for the GPU. Performance of triangular system solves on the GPU is very bad: we enable it in CSRG and HYBG, we do not even bother to implement it in ELG and HLG. So if you use the GPU, you are limited to no preconditioning, or diagonal scaling. We are working on an independent plugin for mld2p4 (www.mld2p4.it) that will deliver better alternatives based on approximate inverses. Report bugs to: https://github.com/sfilippone/psblas3-ext/issues Contributors Salvatore Filippone Alessandro Fanfarillo
About
No description, website, or topics provided.
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published