Skip to content

Latest commit

 

History

History
259 lines (209 loc) · 6.18 KB

2024-11-20.md

File metadata and controls

259 lines (209 loc) · 6.18 KB
marp theme headingDivider
true
ginkgo
2

Basic Concepts in Ginkgo

Early-Adopter Session 27.11.2024

Topics

  • Concept: Executors
  • Concept: LinOps
  • Hello World!
  • Creating Matrices
  • Concept: Factories

What is Ginkgo?

Numerical Sparse Linear Algebra Library https://github.com/ginkgo-project/ginkgo

Supports GPUs and Manycore architecture

Focus on Portability and Composability

Written in modern C++

Comprehensive test and benchmark framework

Integrations into MFEM, deal.II, OpenFOAM, ...

Available on Spack

Basic Concepts: Executor

Represents a single device to run HPC code on:

  • Singlecore CPU (Reference) auto exec = gko::ReferenceExecutor::create();
  • Multicore CPU (OpenMP) auto exec = gko::OmpExecutor::create();

Basic Concepts: Executor

Represents a single device to run HPC code on:

  • NVIDIA GPU (CUDA)
    auto host_exec = gko::OmpExecutor::create();
    auto exec = gko::CudaExecutor::create(device_id, host_exec);
  • AMD GPU (HIP/ROCm)
    auto host_exec = gko::OmpExecutor::create();
    auto exec = gko::HipExecutor::create(device_id, host_exec);
  • Intel GPU (SYCL)
    auto host_exec = gko::OmpExecutor::create();
    auto exec = gko::DpcppExecutor::create(device_id, host_exec);

Basic Concepts: Executor

Responsibilities:

  • Memory allocation
  • Copying data
  • Executing kernels

Passed to other functions, not used directly!

void gko::func(std::shared_ptr<const gko::Executor> exec, ...);

Basic Concepts: LinOp

<style scoped> section ul, section p { font-size: 27px } </style>

Linear Operator representing:

  • Matrices (Dense, Sparse CSR, COO, ELL, ...) $$y = A\cdot x$$
  • Solvers (Krylov CG, GMRES,..., Direct, Triangular) $$S = A^{-1}, y = S\cdot x$$
  • Preconditioners (Block-Jacobi, Multigrid, ILU(0), ...) $$M \approx A^{-1}, y = M\cdot x$$
  • Compositions and linear combination $$ C = A\cdot B, y = C\cdot x $$ $$ C = \alpha A + \beta B, y = C\cdot x$$
  • Factorizations (ILU, LU, Cholesky, ...) $$ L\cdot U = A, y = LU \cdot x$$

Basic Concepts: LinOp

  • Functionality:
    • Matrix (multi-)vector products
      A->apply(x, y);               // y = A * x
      A->apply(alpha, x, beta, y);  // y = alpha * A * x + beta * y
  • Properties:
    • Executor
      • Where is the data stored?
      • Where are operations executed?
    • Dimensions $n \times m$
  • Compatibility:
    • Ensures all inputs are on the operator's executor

Creating Matrices

Equivalent across all matrix formats:

  • From C++ initializer lists (scalars, tiny matrices) auto mtx = gko::initialize<Mtx>({{0.0, 1.0}, {1.0, 0.0}}, exec);
  • From MatrixMarket files auto mtx = gko::read<Mtx>(std::ifstream{"A.mtx"}, exec);
  • From an existing matrix
    auto mtx = other_mtx->clone();
    auto mtx = gko::clone(exec, mtx);
  • From a matrix in a different format
    auto mtx = Mtx::create(exec);
    other_mtx->convert_to(mtx);

Matrix Assembly

<style scoped> section pre, section ul { margin: 0.5em 0 0; } section pre > code { font-size: 18px; } section p, section li{ font-size: 24px } </style>

Three equivalent ways of specifying matrix values:

  • gko::matrix_data for sequential assembly (no duplicates)
    gko::matrix_data<> data{gko::dim<2>{num_rows, num_cols}};
    data.nonzeros.emplace_back(row, column, value);
    mtx->read(data);
  • gko::matrix_assembly_data for sequential assembly
    gko::matrix_assembly_data<> data{gko::dim<2>{num_rows, num_cols}};
    data.set_value(row, column, value);
    data.add_value(row, column, value);
    mtx->read(data);
  • gko::device_matrix_data for high-performance assembly
    device_matrix_data<> data{exec, dim<2>{nrows, ncols}, nnz};
    run_assembly_kernel(data.get_row_idxs(),
                        data.get_col_idxs(),
                        data.get_values()); // runs on device
    data.sum_duplicates(); // optional, e.g. for FEM edge DoFs
    mtx->read(data);

Hello World

#include <ginkgo/ginkgo.hpp>

int main() {
    using Mtx = gko::matrix::Csr<double, int>;
    using Vec = gko::matrix::Dense<double>;
    auto exec = gko::ReferenceExecutor::create();
    auto mtx = gko::initialize<Mtx>({{0.0, 1.0}, {1.0, 0.0}}, exec);
    auto vec = gko::initialize<Vec>({2.0, 3.0}, exec);
    auto result = vec->clone();
    mtx->apply(vec, result);
    gko::write(std::cout, result);
}

Changing Executors

#include <ginkgo/ginkgo.hpp>

int main() {
    using Mtx = gko::matrix::Csr<double, int>;
    using Vec = gko::matrix::Dense<double>;
    auto exec = gko::CudaExecutor(0, gko::OmpExecutor::create());
    auto mtx = gko::initialize<Mtx>({{0.0, 1.0}, {1.0, 0.0}}, exec);
    auto vec = gko::initialize<Vec>({2.0, 3.0}, exec);
    auto result = vec->clone();
    mtx->apply(vec, result);
    gko::write(std::cout, result);
}

Explicit Types

#include <ginkgo/ginkgo.hpp>

int main() {
    using Mtx = gko::matrix::Csr<double, int>;
    using Vec = gko::matrix::Dense<double>;
    std::shared_ptr<gko::CudaExecutor> exec = 
    	gko::CudaExecutor::create(0, gko::OmpExecutor::create());
    std::unique_ptr<Mtx> mtx = gko::initialize<Mtx>({{0.0, 1.0}, {1.0, 0.0}}, 
                                                    exec);
    std::unique_ptr<Vec> vec = gko::initialize<Vec>({2.0, 3.0}, exec);
    std::unique_ptr<Vec> result = vec->clone();
    mtx->apply(vec, result);
    gko::write(std::cout, result);
}

Matrix Assembly

#include <ginkgo/ginkgo.hpp>

int main() {
    using Mtx = gko::matrix::Csr<double, int>;
    using Vec = gko::matrix::Dense<double>;
    using dim = gko::dim<2>;
    using matrix_data = gko::matrix_data<double, int>;
    auto exec = gko::ReferenceExecutor::create();
    matrix_data data{dim{10, 10}};
    auto mtx = Mtx::create(exec);
    auto x = Vec::create(exec, dim{10, 1});
    auto y = Vec::create(exec, dim{10, 1});
    for (int row = 0; row < 10; row++) {
        data.nonzeros.emplace_back(row, row, 2.0);
        data.nonzeros.emplace_back(row, (row + 1) % 10, -1.0);
        data.nonzeros.emplace_back(row, (row + 9) % 10, -1.0);
        x->at(row, 0) = 1.0; // only works on CPU executors
    }
    mtx->read(data);
    mtx->apply(x, y);
    gko::write(std::cout, y);
}