Skip to content
This repository has been archived by the owner on Dec 22, 2022. It is now read-only.

Benchmarking, Triangle Counting, Intersection. #145

Merged
merged 69 commits into from
Jun 26, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
69 commits
Select commit Hold shift + click to select a range
499acbf
update CL description
annielytical Apr 10, 2022
a323515
typo fix
annielytical Apr 16, 2022
f648f02
benchmarking
annielytical May 2, 2022
80b0974
Merge branch 'gunrock:master' into master
annielytical May 9, 2022
f1e3423
Merge branch 'gunrock:master' into bench
annielytical May 9, 2022
2a93db0
benchmarking fixes
annielytical May 10, 2022
0ec56de
name updates
annielytical May 10, 2022
2c6f039
more benchmarks
annielytical May 10, 2022
0899edc
Fix cuda namespace collision
jdwapman May 18, 2022
4605cb8
Merge branch 'master' into dev_pull
jdwapman May 18, 2022
9a95991
pass file to nvbench (still has free error)
annielytical May 23, 2022
778517a
minor tweaks
annielytical May 23, 2022
540d159
multiple benchmarks
annielytical May 23, 2022
9002360
Support reading smtx files
jdwapman May 28, 2022
d2cf9c0
Merge branch 'dev_pull' of github.com:jdwapman/essentials into dev_pull
jdwapman May 28, 2022
c3a17ab
Merge pull request #138 from jdwapman/dev_pull
neoblizz May 28, 2022
de8d03c
multiple benchmarks
annielytical May 29, 2022
187f5db
param fixes
annielytical May 29, 2022
22d2f8e
add more algorithm benchmarks
annielytical May 29, 2022
7d527c7
split benchmarks; add geo / hits
annielytical Jun 1, 2022
4e2b399
add spgemm
annielytical Jun 1, 2022
5def279
geo fixes
annielytical Jun 2, 2022
37dc586
cleanup
annielytical Jun 2, 2022
12910c9
Merge pull request #1 from annielytical/bench
annielytical Jun 2, 2022
21e2a6f
cleanup
annielytical Jun 2, 2022
f1851a3
Merge pull request #2 from annielytical/bench
annielytical Jun 2, 2022
dd66976
benchmark testing script
annielytical Jun 2, 2022
0766c56
[skip ci] test_benchmarks script updates
annielytical Jun 3, 2022
d1f668c
Merge pull request #139 from annielytical/master
neoblizz Jun 3, 2022
b7881e4
Add initial TC implementation
maawad Jun 17, 2022
992ab27
Add binary search using lower bound
maawad Jun 18, 2022
c9231d4
Count total number of triangles
maawad Jun 18, 2022
79b0d78
Use lower bound
maawad Jun 18, 2022
19011d1
Add unit test
maawad Jun 18, 2022
166809f
Add switch source and destination optimization
maawad Jun 22, 2022
9fd00ac
Remove comments
maawad Jun 22, 2022
115a1b1
Fix documentation
maawad Jun 22, 2022
2b08742
Add post processing step
maawad Jun 22, 2022
c9ee768
[skip ci] Use thrust algorithm for lower bound
maawad Jun 23, 2022
d89537d
[skip ci] Use entire graph as input frontier
maawad Jun 23, 2022
f514314
[skip ci] Remove dead code
maawad Jun 23, 2022
c1f1b55
Remove edge and vertices count from `graph`
maawad Jun 23, 2022
c53f7d8
Move `syncthreads` to correct location
maawad Jun 23, 2022
3e16375
Correct initializer list order
maawad Jun 23, 2022
197325f
[skip ci] Add intersection optimization back
maawad Jun 23, 2022
17567b8
Fix compile error
maawad Jun 23, 2022
bc7cb06
Remove dead code
maawad Jun 23, 2022
6b8b15f
Intersection + Triangle Counting algorithm and improvements.
neoblizz Jun 24, 2022
0cc81b1
Add TC benchmarking
maawad Jun 24, 2022
f8efd99
Add TC to script
maawad Jun 24, 2022
efd574b
Remove extra comment
maawad Jun 24, 2022
08775e5
Merge pull request #141 from maawad/tc_bench
neoblizz Jun 24, 2022
bd2d44d
Simplifying readme.
neoblizz Jun 25, 2022
51f005b
Add reference CPU TC implementation
maawad Jun 25, 2022
101b630
Use `cxxopts` and add validation option
maawad Jun 25, 2022
914b2a2
Add self loop unit test
maawad Jun 25, 2022
d69f762
Handle self loops in intersection lambda
maawad Jun 25, 2022
10dc2bf
Add self loops not in `get_intersection_count`
maawad Jun 25, 2022
ba0deca
Remove extra comments
maawad Jun 25, 2022
7463dbe
Remove debugging code
maawad Jun 25, 2022
ab7b3ec
Add optimization back
maawad Jun 25, 2022
df0f3c1
Improved dataset collection from gunrock.
neoblizz Jun 26, 2022
36373c0
Points generator imported from gunrock.
neoblizz Jun 26, 2022
ff50e26
(wip) improvements to coloring.
neoblizz Jun 26, 2022
0df4c41
Advance does NOT remove self-loops for the user.
neoblizz Jun 26, 2022
3e1a6bc
simpler logic for bfs.
neoblizz Jun 26, 2022
4401db4
Merge pull request #144 from maawad/tc_validate
neoblizz Jun 26, 2022
2db6b52
or/and fixes.
neoblizz Jun 26, 2022
fa0fa06
[skip ubuntu] minor clarity in code.
neoblizz Jun 26, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 25 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
# **Essentials:** High-Performance C++ GPU Graph Analytics
# [Essentials: High-Performance C++ GPU Graph Analytics](https://github.com/gunrock/essentials/wiki)
[![Ubuntu](https://github.com/gunrock/essentials/actions/workflows/ubuntu.yml/badge.svg)](https://github.com/gunrock/essentials/actions/workflows/ubuntu.yml) [![Windows](https://github.com/gunrock/essentials/actions/workflows/windows.yml/badge.svg)](https://github.com/gunrock/essentials/actions/workflows/windows.yml) [![Code Quality](https://github.com/gunrock/essentials/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/gunrock/essentials/actions/workflows/codeql-analysis.yml) [![Ubuntu: Testing](https://github.com/gunrock/essentials/actions/workflows/ubuntu-tests.yml/badge.svg)](https://github.com/gunrock/essentials/actions/workflows/ubuntu-tests.yml)

**Gunrock/Essentials** is a CUDA library for graph-processing designed specifically for the GPU. It uses a **high-level**, **bulk-synchronous**, **data-centric abstraction** focused on operations on vertex or edge frontiers. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies, particularly in the area of fine-grained load balancing, with a high-level programming model that allows programmers to quickly develop new graph primitives that scale from one to many GPUs on a node with small code size and minimal GPU programming knowledge.

## Quick Start Guide

Before building Gunrock make sure you have **CUDA Toolkit**[^1] installed on your system. Other external dependencies such as `NVIDIA/thrust`, `NVIDIA/cub`, etc. are automatically fetched using `cmake`.
- [Gunrock's Documentation](https://github.com/gunrock/essentials/wiki)

Before building Gunrock make sure you have **CUDA Toolkit**[<sup>[1]</sup>](#footnotes) installed on your system. Other external dependencies such as `NVIDIA/thrust`, `NVIDIA/cub`, etc. are automatically fetched using `cmake`.

```shell
git clone https://github.com/gunrock/essentials.git
Expand All @@ -15,20 +17,8 @@ cmake ..
make sssp # or for all algorithms, use: make -j$(nproc)
bin/sssp ../datasets/chesapeake/chesapeake.mtx
```
[^1]: Preferred **CUDA v11.5.1 or higher** due to support for stream ordered memory allocators (e.g. `cudaFreeAsync()`).

## Getting Started with Gunrock

- [👻 (GitHub Template) `essentials` project example](https://github.com/gunrock/applications)
- [Gunrock's documentation](https://github.com/gunrock/essentials/wiki)
- [Gunrock's overview](https://github.com/gunrock/essentials/wiki/Overview)
- [Gunrock's programming model](https://github.com/gunrock/essentials/wiki/Programming-Model)
- [Publications](https://github.com/gunrock/essentials/wiki/Publications) and [presentations](https://github.com/gunrock/essentials/wiki/Presentations)
- [Essentials](https://github.com/gunrock/essentials) versus [Gunrock](https://github.com/gunrock/gunrock)[^2]

[^2]: Essentials is the future of Gunrock. The idea is to take the lessons learned from Gunrock to a new design, which simplifies the effort it takes to **(1)** implement graph algorithms, **(2)** add internal optimizations, **(3)** conduct future research. One example is Gunrock's SSSP, implemented in 4-5 files with 1000s of lines of code versus in essentials, it is a single file with less than 200 lines of code. Our end goal with essentials is possibly releasing it as a `v2.0.0` for Gunrock.

## How to Cite Gunrock
## How to Cite Gunrock & Essentials
Thank you for citing our work.

```tex
Expand All @@ -53,6 +43,26 @@ Thank you for citing our work.
}
```

```tex
@InProceedings{Osama:2022:EOP,
author = {Muhammad Osama and Serban D. Porumbescu and John D. Owens},
title = {Essentials of Parallel Graph Analytics},
booktitle = {Proceedings of the Workshop on Graphs,
Architectures, Programming, and Learning},
year = 2022,
series = {GrAPL 2022},
month = may,
pages = {314--317},
doi = {10.1109/IPDPSW55747.2022.00061},
url = {https://escholarship.org/uc/item/2p19z28q},
}
```

## Copyright and License

Gunrock is copyright The Regents of the University of California. The library, examples, and all source code are released under [Apache 2.0](https://github.com/gunrock/essentials/blob/master/LICENSE).

<a class="anchor" id="1"></a>
## Footnotes
1. Preferred **CUDA v11.5.1 or higher** due to support for stream ordered memory allocators (e.g. `cudaFreeAsync()`).
2. Essentials is intended as a future release of Gunrock. You can read more about it in our vision paper: [Essentials of Parallel Graph Analytics](https://escholarship.org/content/qt2p19z28q/qt2p19z28q_noSplash_38a658bccc817ba025517311a776840f.pdf).
34 changes: 27 additions & 7 deletions benchmarks/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,19 +1,39 @@
set(BENCHMARK_SOURCES
for.cu
bc_bench.cu
bfs_bench.cu
color_bench.cu
geo_bench.cu
hits_bench.cu
kcore_bench.cu
mst_bench.cu
ppr_bench.cu
pr_bench.cu
spgemm_bench.cu
spmv_bench.cu
sssp_bench.cu
tc_bench.cu
)

foreach(SOURCE IN LISTS BENCHMARK_SOURCES)
get_filename_component(BENCHMARK_NAME ${SOURCE} NAME_WLE)
add_executable(${BENCHMARK_NAME} ${SOURCE})
target_link_libraries(${BENCHMARK_NAME}
PRIVATE essentials
PRIVATE nvbench::main
)
get_target_property(ESSENTIALS_ARCHITECTURES
if(SOURCE MATCHES "for.cu")
target_link_libraries(${BENCHMARK_NAME}
PRIVATE essentials
PRIVATE nvbench::main
)
else()
target_link_libraries(${BENCHMARK_NAME}
PRIVATE essentials
PRIVATE nvbench::nvbench
)
endif()
get_target_property(ESSENTIALS_ARCHITECTURES
essentials CUDA_ARCHITECTURES
)
set_target_properties(${BENCHMARK_NAME}
PROPERTIES
set_target_properties(${BENCHMARK_NAME}
PROPERTIES
CUDA_ARCHITECTURES ${ESSENTIALS_ARCHITECTURES}
)
message(STATUS "Benchmark Added: ${BENCHMARK_NAME}")
Expand Down
130 changes: 130 additions & 0 deletions benchmarks/bc_bench.cu
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
#include <cstring>
#include <vector>

#include <cxxopts.hpp>
#include <nvbench/nvbench.cuh>

#include <gunrock/algorithms/algorithms.hxx>
#include <gunrock/algorithms/bc.hxx>

using namespace gunrock;
using namespace memory;

using vertex_t = int;
using edge_t = int;
using weight_t = float;

std::string filename;

/**
 * @brief Command-line parameters for the BC benchmark.
 *
 * Parses `-m/--market <file>` (required unless `-h/--help` is given) and
 * leaves any unrecognised options alone so they can be forwarded to NVBench.
 */
struct parameters_t {
  std::string filename;  // Path to the .mtx matrix-market input file.
  bool help = false;     // True when the user asked for help.
  cxxopts::Options options;

  /**
   * @brief Construct a new parameters object and parse command line arguments.
   *
   * @param argc Number of command line arguments.
   * @param argv Command line arguments.
   */
  parameters_t(int argc, char** argv) : options(argv[0], "BC Benchmarking") {
    options.allow_unrecognised_options();
    // Add command line options.
    options.add_options()("h,help", "Print help")  // help
        ("m,market", "Matrix file",
         cxxopts::value<std::string>());  // mtx

    // Parse command line arguments.
    auto result = options.parse(argc, argv);

    if (result.count("help")) {
      help = true;
      print_usage();
      // Do not exit so we also print NVBench help.
    } else if (result.count("market") == 1) {
      filename = result["market"].as<std::string>();
      if (!util::is_market(filename)) {
        // Provided file is not a matrix-market file: show usage and stop.
        // NOTE(review): exits with status 0 on error — confirm whether the
        // benchmark scripts rely on this before changing it to non-zero.
        print_usage();
        std::exit(0);
      }
    } else {
      // The matrix file is required when not asking for help.
      print_usage();
      std::exit(0);
    }
  }

 private:
  /// @brief Print cxxopts usage plus a hint about NVBench's own arguments
  /// (previously copy-pasted in three places).
  void print_usage() {
    std::cout << options.help({""});
    std::cout << " [optional nvbench args]" << std::endl << std::endl;
  }
};

/**
 * @brief NVBench entry point: builds a graph from the global `filename`
 * matrix-market file and measures gunrock::bc::run on it.
 *
 * @param state NVBench benchmark state (collects metrics, runs timing).
 */
void bc_bench(nvbench::state& state) {
  // --
  // Add metrics
  state.collect_dram_throughput();
  state.collect_l1_hit_rates();
  state.collect_l2_hit_rates();
  state.collect_loads_efficiency();
  state.collect_stores_efficiency();

  // --
  // Define types
  using csr_t =
      format::csr_t<memory_space_t::device, vertex_t, edge_t, weight_t>;

  // --
  // Build graph + metadata
  csr_t csr;
  io::matrix_market_t<vertex_t, edge_t, weight_t> mm;
  csr.from_coo(mm.load(filename));

  // Scratch storage handed to from_csr; the unused `column_indices`
  // device vector the original allocated has been removed (from_csr only
  // takes row_indices and column_offsets — column indices come from `csr`).
  thrust::device_vector<vertex_t> row_indices(csr.number_of_nonzeros);
  thrust::device_vector<edge_t> column_offsets(csr.number_of_columns + 1);

  auto G = graph::build::from_csr<memory_space_t::device,
                                  graph::view_t::csr>(
      csr.number_of_rows,               // rows
      csr.number_of_columns,            // columns
      csr.number_of_nonzeros,           // nonzeros
      csr.row_offsets.data().get(),     // row_offsets
      csr.column_indices.data().get(),  // column_indices
      csr.nonzero_values.data().get(),  // values
      row_indices.data().get(),         // row_indices
      column_offsets.data().get()       // column_offsets
  );

  // --
  // Params and memory allocation
  vertex_t n_vertices = G.get_number_of_vertices();
  thrust::device_vector<weight_t> bc_values(n_vertices);

  // --
  // Run BC with NVBench
  state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
    gunrock::bc::run(G, bc_values.data().get());
  });
}

/**
 * @brief Parse our options, then hand the remaining arguments to NVBench.
 */
int main(int argc, char** argv) {
  parameters_t params(argc, argv);
  filename = params.filename;

  if (params.help) {
    // Print NVBench help.
    const char* args[1] = {"-h"};
    NVBENCH_MAIN_BODY(1, args);
  } else {
    // Forward every argument except the matrix option to NVBench.
    // The original used a VLA (`char* args[argc - 2]`), which is not
    // standard C++, and hard-coded `argc - 2` as the forwarded count —
    // wrong (and an out-of-bounds write) for the single-token
    // `--market=<file>` form that cxxopts also accepts.
    std::vector<char*> args;
    args.reserve(argc);
    for (int i = 0; i < argc; i++) {
      if (strcmp(argv[i], "--market") == 0 || strcmp(argv[i], "-m") == 0) {
        i++;  // Skip the flag and its separate value token.
        continue;
      }
      if (strncmp(argv[i], "--market=", 9) == 0) {
        continue;  // Skip the single-token `--market=<file>` form.
      }
      args.push_back(argv[i]);
    }

    NVBENCH_BENCH(bc_bench);
    NVBENCH_MAIN_BODY(static_cast<int>(args.size()), args.data());
  }
}
137 changes: 137 additions & 0 deletions benchmarks/bfs_bench.cu
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
#include <cstring>
#include <vector>

#include <cxxopts.hpp>
#include <nvbench/nvbench.cuh>

#include <gunrock/algorithms/algorithms.hxx>
#include <gunrock/algorithms/bfs.hxx>

using namespace gunrock;
using namespace memory;

using vertex_t = int;
using edge_t = int;
using weight_t = float;

std::string filename;

/**
 * @brief Command-line parameters for the BFS benchmark.
 *
 * Parses `-m/--market <file>` (required unless `-h/--help` is given) and
 * leaves any unrecognised options alone so they can be forwarded to NVBench.
 */
struct parameters_t {
  std::string filename;  // Path to the .mtx matrix-market input file.
  bool help = false;     // True when the user asked for help.
  cxxopts::Options options;

  /**
   * @brief Construct a new parameters object and parse command line arguments.
   *
   * @param argc Number of command line arguments.
   * @param argv Command line arguments.
   */
  parameters_t(int argc, char** argv) : options(argv[0], "BFS Benchmarking") {
    options.allow_unrecognised_options();
    // Add command line options.
    options.add_options()("h,help", "Print help")  // help
        ("m,market", "Matrix file",
         cxxopts::value<std::string>());  // mtx

    // Parse command line arguments.
    auto result = options.parse(argc, argv);

    if (result.count("help")) {
      help = true;
      print_usage();
      // Do not exit so we also print NVBench help.
    } else if (result.count("market") == 1) {
      filename = result["market"].as<std::string>();
      if (!util::is_market(filename)) {
        // Provided file is not a matrix-market file: show usage and stop.
        // NOTE(review): exits with status 0 on error — confirm whether the
        // benchmark scripts rely on this before changing it to non-zero.
        print_usage();
        std::exit(0);
      }
    } else {
      // The matrix file is required when not asking for help.
      print_usage();
      std::exit(0);
    }
  }

 private:
  /// @brief Print cxxopts usage plus a hint about NVBench's own arguments
  /// (previously copy-pasted in three places).
  void print_usage() {
    std::cout << options.help({""});
    std::cout << " [optional nvbench args]" << std::endl << std::endl;
  }
};

/**
 * @brief NVBench entry point: builds a graph from the global `filename`
 * matrix-market file and measures gunrock::bfs::run from vertex 0.
 *
 * @param state NVBench benchmark state (collects metrics, runs timing).
 */
void bfs_bench(nvbench::state& state) {
  // --
  // Add metrics
  state.collect_dram_throughput();
  state.collect_l1_hit_rates();
  state.collect_l2_hit_rates();
  state.collect_loads_efficiency();
  state.collect_stores_efficiency();

  // --
  // Define types
  using csr_t =
      format::csr_t<memory_space_t::device, vertex_t, edge_t, weight_t>;

  // --
  // IO
  csr_t csr;

  io::matrix_market_t<vertex_t, edge_t, weight_t> mm;
  csr.from_coo(mm.load(filename));

  // Scratch storage handed to from_csr; the unused `column_indices`
  // device vector the original allocated has been removed (from_csr only
  // takes row_indices and column_offsets — column indices come from `csr`).
  thrust::device_vector<vertex_t> row_indices(csr.number_of_nonzeros);
  thrust::device_vector<edge_t> column_offsets(csr.number_of_columns + 1);

  // --
  // Build graph + metadata
  auto G = graph::build::from_csr<memory_space_t::device,
                                  graph::view_t::csr>(
      csr.number_of_rows,               // rows
      csr.number_of_columns,            // columns
      csr.number_of_nonzeros,           // nonzeros
      csr.row_offsets.data().get(),     // row_offsets
      csr.column_indices.data().get(),  // column_indices
      csr.nonzero_values.data().get(),  // values
      row_indices.data().get(),         // row_indices
      column_offsets.data().get()       // column_offsets
  );

  // --
  // Params and memory allocation
  vertex_t single_source = 0;  // BFS always starts from vertex 0 here.

  vertex_t n_vertices = G.get_number_of_vertices();
  thrust::device_vector<vertex_t> distances(n_vertices);
  thrust::device_vector<vertex_t> predecessors(n_vertices);

  // --
  // Run BFS with NVBench
  state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
    gunrock::bfs::run(G, single_source, distances.data().get(),
                      predecessors.data().get());
  });
}

/**
 * @brief Parse our options, then hand the remaining arguments to NVBench.
 */
int main(int argc, char** argv) {
  parameters_t params(argc, argv);
  filename = params.filename;

  if (params.help) {
    // Print NVBench help.
    const char* args[1] = {"-h"};
    NVBENCH_MAIN_BODY(1, args);
  } else {
    // Forward every argument except the matrix option to NVBench.
    // The original used a VLA (`char* args[argc - 2]`), which is not
    // standard C++, and hard-coded `argc - 2` as the forwarded count —
    // wrong (and an out-of-bounds write) for the single-token
    // `--market=<file>` form that cxxopts also accepts.
    std::vector<char*> args;
    args.reserve(argc);
    for (int i = 0; i < argc; i++) {
      if (strcmp(argv[i], "--market") == 0 || strcmp(argv[i], "-m") == 0) {
        i++;  // Skip the flag and its separate value token.
        continue;
      }
      if (strncmp(argv[i], "--market=", 9) == 0) {
        continue;  // Skip the single-token `--market=<file>` form.
      }
      args.push_back(argv[i]);
    }

    NVBENCH_BENCH(bfs_bench);
    NVBENCH_MAIN_BODY(static_cast<int>(args.size()), args.data());
  }
}
Loading