[lang] Use less gpu memory when building sparse matrix (taichi-dev#6781)
Issue: taichi-dev#2906 

### Brief Summary
The `read_int` function of ndarray consumes more than 100 MB of GPU memory.
It is cheaper to copy the value of `num_triplets_` to the host directly with
`memcpy_device_to_host`.
FantasyVR authored and quadpixels committed May 13, 2023
1 parent 38d86f9 commit 924403d
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion taichi/program/sparse_matrix.cpp
@@ -156,7 +156,8 @@ std::unique_ptr<SparseMatrix> SparseMatrixBuilder::build_cuda() {
   built_ = true;
   auto sm = make_cu_sparse_matrix(rows_, cols_, dtype_);
 #ifdef TI_WITH_CUDA
-  num_triplets_ = ndarray_data_base_ptr_->read_int(std::vector<int>{0});
+  CUDADriver::get_instance().memcpy_device_to_host(
+      &num_triplets_, (void *)get_ndarray_data_ptr(), sizeof(int));
   auto len = 3 * num_triplets_ + 1;
   std::vector<float32> trips(len);
   CUDADriver::get_instance().memcpy_device_to_host(
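
The pattern in this hunk (copy only the leading counter to the host, then size the full copy from it) can be sketched without a GPU as below. This is an illustration, not Taichi code: `copy_device_to_host`, `Triplets`, and `read_triplets` are hypothetical names, the copy helper is a plain `std::memcpy` standing in for `CUDADriver::get_instance().memcpy_device_to_host`, and the buffer layout (one 32-bit counter followed by three 32-bit values per triplet) is an assumption inferred from `len = 3 * num_triplets_ + 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumption: the counter and the payload values share one 32-bit-wide buffer.
static_assert(sizeof(int) == sizeof(float), "sketch assumes 32-bit int and float");

// Hypothetical stand-in for a device-to-host copy so the sketch runs on the CPU.
inline void copy_device_to_host(void *host_dst, const void *device_src,
                                std::size_t bytes) {
  std::memcpy(host_dst, device_src, bytes);
}

// Hypothetical result type: the counter plus the raw host copy of the buffer.
struct Triplets {
  int count = 0;
  std::vector<float> data;  // element 0 holds the counter bits, then 3 * count values
};

inline Triplets read_triplets(const void *device_buffer) {
  Triplets out;
  // Step 1: copy only the leading counter (sizeof(int) bytes), nothing else.
  copy_device_to_host(&out.count, device_buffer, sizeof(int));
  assert(out.count >= 0);
  // Step 2: size the host buffer from the counter and copy the whole payload.
  const std::size_t len = 3 * static_cast<std::size_t>(out.count) + 1;
  out.data.resize(len);
  copy_device_to_host(out.data.data(), device_buffer, len * sizeof(float));
  return out;
}
```

The first copy moves only `sizeof(int)` bytes, so it needs no extra device-side allocation; the second copy reuses the counter to size the host vector, mirroring the `trips(len)` allocation in the diff.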
