Merge branch 'dev-v0.6.0' into ahehn/approximate_alignments_bugfix

NVIDIA-Genomics-Research · Dec 4, 2020 · 8ccae4b
2 parents 7bab9e5 + fe07172
commit 8ccae4b

Showing 16 changed files with 210 additions and 136 deletions.
diff --git a/README.md b/README.md
@@ -6,86 +6,17 @@ GenomeWorks is a GPU-accelerated library for biological sequence analysis. This
 For more detailed API documentation please refer to the [documentation](#enable-doc-generation).
 
 * Modules
-    * [cudamapper](#cudamapper) - CUDA-accelerated sequence to sequence mapping
-    * [cudapoa](#cudapoa) - CUDA-accelerated partial order alignment
-    * [cudaaligner](#cudaaligner) - CUDA-accelerated pairwise sequence alignment
-    * [cudaextender](#cudaextender) - CUDA-accelerated seed extension
+    * [cudamapper](cudamapper/README.md) - CUDA-accelerated sequence to sequence mapping
+    * [cudapoa](cudapoa/README.md) - CUDA-accelerated partial order alignment
+    * [cudaaligner](cudaaligner/README.md) - CUDA-accelerated pairwise sequence alignment
+    * [cudaextender](cudaextender/README.md) - CUDA-accelerated seed extension
 * Setup GenomeWorks
     * [Clone GenomeWorks](#clone-genomeworks)
     * [System Requirements](#system-requirements)
     * [GenomeWorks Installation](#genomeworks-setup)
 * [Python API](#genomeworks-python-api)
 * [Development Support](#development-support)
 
-### cudamapper
-
-The `cudamapper` package provides minimizer-based GPU-accelerated approximate mapping.
-
-#### Tool - *cudamapper*
-
-`cudamapper` is an end-to-end command line to for sequence to sequence mapping. `cudamapper` outputs
-mappings in the PAF format and is currently optimised for all-vs-all long read (ONT, Pacific Biosciences) sequences.
-
-To run all-vs all overlaps use the following command:
-
-`cudamapper in.fasta in.fasta`
-
-A query fasta can be mapped to a reference as follows:
-
-`cudamapper query.fasta target.fasta`
-
-To access more information about running cudamapper, run `cudamapper --help`.
-
-#### Library - *libcudamapper.so*
-
-* `Indexer` module to generate an index of minimizers from a list of sequences.
-* `Matcher` module to find locations of matching pairs of minimizers between sequences using minimizer indices.
-* `Overlapper` module to generate overlaps from sequence of minimizer matches generated by matcher.
-
-#### Sample - *sample_cudamapper*
-
-A prototypical binary highlighting the usage of `libcudamapper.so` APIs (indexer, matcher and overlapper) and
-techniques to tie them into an application.
-
-### cudapoa
-
-The `cudapoa` package provides a GPU-accelerated implementation of the [Partial Order Alignment](https://simpsonlab.github.io/2015/05/01/understanding-poa/)
-algorithm. It is heavily influenced by [SPOA](https://github.com/rvaser/spoa) and in many cases can be considered a GPU-accelerated replacement. Features include:
-
-#### Tool - *cudapoa*
-
-A command line tool for generating consensus and MSA from a list of `fasta`/`fastq` files. The tool
-is built on top of `libcudapoa.so` and showcases optimization strategies for writing high performance
-applications with `libcudapoa.so`.
-
-#### Library - *libcudapoa.so*
-
-* Generation of consensus sequences
-* Generation of multi-sequence alignments (MSAs)
-* Custom adaptive band implementation of POA
-* Support for long and short read sequences
-
-#### Sample - *sample_cudapoa*
-
-A prototypical binary to showcase the use of `libcudapoa.so` APIs.
-
-### cudaaligner
-
-The `cudaaligner` package provides GPU-accelerated global alignment. Features include:
-
-#### Library - *libcudaaligner.so*
-
-* Short and long read support
-* Banded implementation with configurable band width for flexible performance and accuracy trade-off
-
-#### Sample - *sample_cudaaligner*
-
-A prototypical binary to showcase the use of `libcudaaligner.so` APIs.
-
-### cudaextender
-The `cudaextender` package provides GPU-accelerated seed-extension. Details can be found in
-the package's readme.
-
 ## Clone GenomeWorks 
 
 ### Latest released version

diff --git a/cudaaligner/CMakeLists.txt b/cudaaligner/CMakeLists.txt
@@ -65,6 +65,7 @@ target_include_directories(${MODULE_NAME}
 target_compile_options(${MODULE_NAME} PRIVATE -Werror)
 
 add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/include)
+add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/README.md)
 
 # Add tests folder
 add_subdirectory(tests)

diff --git a/cudaaligner/README.md b/cudaaligner/README.md
@@ -0,0 +1,15 @@
+# cudaaligner
+
+The `cudaaligner` package provides GPU-accelerated global alignment.
+
+## Library
+Built as `libcudaaligner.[so|a]`.
+
+* Short and long read support
+* Banded implementation with configurable band width for flexible performance and accuracy trade-off
+
+APIs documented in [include](include/claraparabricks/genomeworks/cudaaligner) folder.
+
+## Sample
+[sample_cudaaligner](samples/sample_cudaaligner.cpp) - A prototypical binary to showcase the use of `libcudaaligner` APIs.
+
diff --git a/cudaextender/CMakeLists.txt b/cudaextender/CMakeLists.txt
@@ -59,6 +59,7 @@ target_include_directories(${MODULE_NAME}
         )
 
 add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/include)
+add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/README.md)
 
 install(TARGETS ${MODULE_NAME}
         COMPONENT gwlogging

diff --git a/cudaextender/README.md b/cudaextender/README.md
@@ -1,27 +1,26 @@
 # cudaextender
 
-## Overview
 This package implements CUDA-accelerated seed-extension algorithms that use seed positions in 
 encoded input strands to extend and compute the alignment between the strands. 
 Currently this module implements the ungapped X-drop algorithm, adapted from 
 [SegAlign's](https://github.com/gsneha26/SegAlign) Ungapped Extender authored by 
 Sneha Goenka (gsneha@stanford.edu) and Yatish Turakhia (yturakhi@uscs.edu).
 
-### Encoded Input
-`cudaextender` expects the input strands to be encoded as integer sequences. 
-This encoding scheme is documented here: [utils.hpp](include/claraparabricks/genomeworks/cudaextender/utils.hpp)
-file. The provided `encode_sequence()` helper function will encode the input strands on CPU with
-the expected scheme. 
+## Library
+Built as `libcudaextender.[so|a]`
+
+* Ungapped X-Drop extension
 
-### API
 `cudaextender` provides host and device pointer APIs to enable ease of integration with other
 producer/consumer modules. The user is expected to handle all memory transactions and device
 sychronizations for the device pointer API. The host pointer API abstracts those operations away.
 Both APIs are documented here: [extender.hpp](include/claraparabricks/genomeworks/cudaextender/extender.hpp)
 
-### Library - *libcudaextender.so*
-Features:
-* Ungapped X-Drop extension
+### Encoded Input
+`cudaextender` expects the input strands to be encoded as integer sequences. 
+This encoding scheme is documented here: [utils.hpp](include/claraparabricks/genomeworks/cudaextender/utils.hpp)
+file. The provided `encode_sequence()` helper function will encode the input strands on CPU with
+the expected scheme. 
 
-### Sample - *[sample_cudaextender.cpp](samples/sample_cudaextender.cpp)*
-Protoype to show the usage of host and device pointer APIs on FASTA sequences.
+## Sample
+[sample_cudaextender](samples/sample_cudaextender.cpp) - Protoype to show the usage of host and device pointer APIs on FASTA sequences.
diff --git a/cudamapper/CMakeLists.txt b/cudamapper/CMakeLists.txt
@@ -83,6 +83,7 @@ if (gw_optimize_for_native_cpu)
 endif()
 
 add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/include)
+add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/README.md)
 
 cuda_add_executable(${MODULE_NAME}-bin
         src/main.cu

diff --git a/cudamapper/README.md b/cudamapper/README.md
@@ -0,0 +1,31 @@
+# cudamapper
+
+The `cudamapper` package provides minimizer-based GPU-accelerated approximate mapping.
+
+## Library
+Built as `libcudamapper.[so|a]`
+
+* `Indexer` module to generate an index of minimizers from a list of sequences.
+* `Matcher` module to find locations of matching pairs of minimizers between sequences using minimizer indices.
+* `Overlapper` module to generate overlaps from sequence of minimizer matches generated by matcher.
+
+APIs documented in [include](include/claraparabricks/genomeworks/cudamapper) folder.
+
+## Sample
+[sample_cudamapper](samples/sample_cudamapper.cpp) - A prototypical binary highlighting the usage of `libcudamapper` APIs (indexer, matcher and overlapper) and
+techniques to tie them into an application.
+
+## Tool
+
+`cudamapper` is an end-to-end command line to for sequence to sequence mapping. `cudamapper` outputs
+mappings in the PAF format and is currently optimised for all-vs-all long read (ONT, Pacific Biosciences) sequences.
+
+To run all-vs all overlaps use the following command:
+
+`cudamapper in.fasta in.fasta`
+
+A query fasta can be mapped to a reference as follows:
+
+`cudamapper query.fasta target.fasta`
+
+To access more information about running cudamapper, run `cudamapper --help`.
diff --git a/cudapoa/CMakeLists.txt b/cudapoa/CMakeLists.txt
@@ -74,6 +74,7 @@ target_include_directories(${MODULE_NAME}
 )
 
 add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/include)
+add_doxygen_source_dir(${CMAKE_CURRENT_SOURCE_DIR}/README.md)
 
 add_executable(${MODULE_NAME}-bin
         src/main.cpp

diff --git a/cudapoa/README.md b/cudapoa/README.md
@@ -0,0 +1,24 @@
+# CUDAPOA
+
+The `cudapoa` package provides a GPU-accelerated implementation of the [Partial Order Alignment](https://simpsonlab.github.io/2015/05/01/understanding-poa/)
+algorithm. It is heavily influenced by [SPOA](https://github.com/rvaser/spoa) and in many cases can be considered a GPU-accelerated replacement.
+
+## Library
+Built as `libcudapoa.[so|a]`
+
+* Generation of consensus sequences
+* Generation of multi-sequence alignments (MSAs)
+* Custom adaptive band implementation of POA
+* Support for long and short read sequences
+
+APIs documented in [include](include/claraparabricks/genomeworks/cudapoa) folder.
+
+## Sample
+[sample_cudapoa](samples/sample_cudapoa.cpp) - A prototypical binary to showcase the use of `libcudapoa` APIs.
+
+## Tool
+
+A command line tool for generating consensus and MSA from a list of `fasta`/`fastq` files. The tool
+is built on top of `libcudapoa` and showcases optimization strategies for writing high performance
+applications with `libcudapoa`.
+
diff --git a/cudapoa/include/claraparabricks/genomeworks/cudapoa/cudapoa.hpp b/cudapoa/include/claraparabricks/genomeworks/cudapoa/cudapoa.hpp
@@ -16,6 +16,8 @@
 
 #pragma once
 
+#include <string>
+
 namespace claraparabricks
 {
 
@@ -38,13 +40,18 @@ enum StatusType
     node_count_exceeded_maximum_graph_size,
     edge_count_exceeded_maximum_graph_size,
     exceeded_adaptive_banded_matrix_size,
-    seq_len_exceeded_maximum_nodes_per_window,
+    exceeded_maximum_predecessor_distance,
     loop_count_exceeded_upper_bound,
     output_type_unavailable,
-    generic_error,
-    exceeded_maximum_predecessor_distance
+    generic_error
 };
 
+/// Generate corresponding error message for a given error type
+/// \param [in] error_type input error code
+/// \param [out] error_message corresponding error message
+/// \param [out] error_hint possible hint to resolve the error
+void decode_error(StatusType error_type, std::string& error_message, std::string& error_hint);
+
 /// Banding mode used in Needleman-Wunsch algorithm
 /// - full_band performs computations on full scores matrix, highest accuracy
 /// - static_band performs computations on a fixed band along scores matrix diagonal, fastest implementation

diff --git a/cudapoa/samples/sample_cudapoa.cpp b/cudapoa/samples/sample_cudapoa.cpp
@@ -64,6 +64,7 @@ std::unique_ptr<Batch> initialize_batch(bool msa, const BatchConfig& batch_size)
 void process_batch(Batch* batch, bool msa_flag, bool print, std::vector<int32_t>& list_of_group_ids, int id_offset)
 {
     batch->generate_poa();
+    std::string error_message, error_hint;
 
     StatusType status = StatusType::success;
     if (msa_flag)
@@ -75,14 +76,20 @@ void process_batch(Batch* batch, bool msa_flag, bool print, std::vector<int32_t>
         status = batch->get_msa(msa, output_status);
         if (status != StatusType::success)
         {
-            std::cerr << "Could not generate MSA for batch : " << status << std::endl;
+            decode_error(status, error_message, error_hint);
+            std::cerr << "Could not generate MSA for batch : " << std::endl;
+            std::cerr << error_message << std::endl
+                      << error_hint << std::endl;
         }
 
         for (int32_t g = 0; g < get_size(msa); g++)
         {
             if (output_status[g] != StatusType::success)
             {
-                std::cerr << "Error generating  MSA for POA group " << list_of_group_ids[g + id_offset] << ". Error type " << output_status[g] << std::endl;
+                decode_error(output_status[g], error_message, error_hint);
+                std::cerr << "Error generating  MSA for POA group " << list_of_group_ids[g + id_offset] << std::endl;
+                std::cerr << error_message << std::endl
+                          << error_hint << std::endl;
             }
             else
             {
@@ -106,14 +113,20 @@ void process_batch(Batch* batch, bool msa_flag, bool print, std::vector<int32_t>
         status = batch->get_consensus(consensus, coverage, output_status);
         if (status != StatusType::success)
         {
-            std::cerr << "Could not generate consensus for batch : " << status << std::endl;
+            decode_error(status, error_message, error_hint);
+            std::cerr << "Could not generate consensus for batch : " << std::endl;
+            std::cerr << error_message << std::endl
+                      << error_hint << std::endl;
         }
 
         for (int32_t g = 0; g < get_size(consensus); g++)
         {
             if (output_status[g] != StatusType::success)
             {
-                std::cerr << "Error generating consensus for POA group " << list_of_group_ids[g + id_offset] << ". Error type " << output_status[g] << std::endl;
+                decode_error(output_status[g], error_message, error_hint);
+                std::cerr << "Error generating  consensus for POA group " << list_of_group_ids[g + id_offset] << std::endl;
+                std::cerr << error_message << std::endl
+                          << error_hint << std::endl;
             }
             else
             {
@@ -213,6 +226,9 @@ int main(int argc, char** argv)
         }
     }
 
+    // for error code message
+    std::string error_message, error_hint;
+
     // analyze the POA groups and create a minimal set of batches to process them all
     std::vector<BatchConfig> list_of_batch_sizes;
     std::vector<std::vector<int32_t>> list_of_groups_per_batch;
@@ -304,7 +320,10 @@ int main(int argc, char** argv)
 
             if (status != StatusType::exceeded_maximum_poas && status != StatusType::success)
             {
-                std::cout << "Could not add POA group " << batch_group_ids[i] << " to batch " << b << ". Error code " << status << std::endl;
+                decode_error(status, error_message, error_hint);
+                std::cerr << "Could not add POA group " << batch_group_ids[i] << " to batch " << b << std::endl;
+                std::cerr << error_message << std::endl
+                          << error_hint << std::endl;
                 i++;
             }
         }