This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Overhaul CUB test suite.
Issue #399 reported that we were missing several test cases in CUB
that were ifdef'd out. This patch enables most of those tests, though
CDP tests are not added here.

Some other deficiencies were addressed as they were noticed, for instance,
adding value_types other than unsigned char to test_block_histogram.

The way we split up tests into "BENCHMARK", "MINIMAL", and "THOROUGH"
variants wasn't well suited for regression testing, as a lot of
redundant code paths were generated between the various test
executables. These have been removed, leaving only the "THOROUGH" tests,
which should capture all test cases.

Benchmarks should go into the new `thrust_benchmark` project.

Some tests also took an excessively long time to build, especially after
enabling the missing test cases from #399. This patch adds a new
mechanism that allows a test to include a comment such as:

```
// %PARAM% TEST_FOO foo 0:1:2
// %PARAM% TEST_BAR bar 4:8
```

CMake will parse these out and generate one test executable for
each combination of parameter values, e.g.:

```
cub.test.baz.foo_0.bar_4 -DTEST_FOO=0 -DTEST_BAR=4
cub.test.baz.foo_0.bar_8 -DTEST_FOO=0 -DTEST_BAR=8
cub.test.baz.foo_1.bar_4 -DTEST_FOO=1 -DTEST_BAR=4
cub.test.baz.foo_1.bar_8 -DTEST_FOO=1 -DTEST_BAR=8
cub.test.baz.foo_2.bar_4 -DTEST_FOO=2 -DTEST_BAR=4
cub.test.baz.foo_2.bar_8 -DTEST_FOO=2 -DTEST_BAR=8
```

This can be used to quickly split up problematically large tests. See
the note at the top of cub/test/CMakeLists.txt for more details.
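
The expansion described above can be sketched in C++ roughly as follows. This is only an illustration of the cartesian-product step, not the CMake implementation itself; `Param`, `Variant`, and `expand_variants` are hypothetical names introduced for the example, and the `:`-joined definition string is assumed to mirror how the CMake helper stores its intermediate results.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one parsed `%PARAM%` comment.
struct Param
{
  std::string definition;          // e.g. "TEST_FOO"
  std::string label;               // e.g. "foo"
  std::vector<std::string> values; // e.g. {"0", "1", "2"}
};

// One generated variant: an executable-name suffix plus its definitions.
struct Variant
{
  std::string label; // e.g. "foo_0.bar_4"
  std::string defs;  // e.g. "TEST_FOO=0:TEST_BAR=4"
};

// Builds the cartesian product one parameter at a time: the first parameter
// seeds the list, and each later parameter multiplies the existing entries.
std::vector<Variant> expand_variants(const std::vector<Param>& params)
{
  std::vector<Variant> variants;
  for (const Param& p : params)
  {
    std::vector<Variant> next;
    if (variants.empty())
    {
      for (const std::string& v : p.values)
      {
        next.push_back({p.label + "_" + v, p.definition + "=" + v});
      }
    }
    else
    {
      for (const Variant& old : variants)
      {
        for (const std::string& v : p.values)
        {
          next.push_back({old.label + "." + p.label + "_" + v,
                          old.defs + ":" + p.definition + "=" + v});
        }
      }
    }
    variants = std::move(next);
  }
  return variants;
}
```

Feeding it the two `%PARAM%` lines from the example yields six variants in the same order as the listing above, with the first parameter varying slowest.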

The PrintNinjaBuildTimes.cmake file from Thrust was used to identify
tests that needed to be split.

Several tests were testing Thrust APIs. This isn't necessary, as Thrust
has its own test suite. These tests have been removed.

This isn't needed for regression testing and has been removed. Some
of the other command line options could also be removed now that
benchmarking isn't handled by these regression tests, but this is a start.

Extended testing revealed that the cub::BlockHistogram algorithm's
behavior is undefined when input values are outside of [0, BINS). Added
this info to the algorithm docs.
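
A test harness can guard against this precondition on the host before launching a kernel. The following is a minimal sketch under stated assumptions: `all_samples_in_range` and `reference_histogram` are hypothetical helpers written for illustration and are not part of CUB or its test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side check: true only when every sample satisfies
// 0 <= sample < bins, the range that BlockHistogram requires.
template <typename T>
bool all_samples_in_range(const std::vector<T>& samples, int bins)
{
  for (const T& s : samples)
  {
    const long long v = static_cast<long long>(s);
    if (v < 0 || v >= bins)
    {
      return false;
    }
  }
  return true;
}

// Hypothetical host reference histogram for validating device output.
// It asserts the precondition instead of silently indexing out of bounds.
template <typename T>
std::vector<unsigned int> reference_histogram(const std::vector<T>& samples,
                                              int bins)
{
  assert(all_samples_in_range(samples, bins));
  std::vector<unsigned int> counts(static_cast<std::size_t>(bins), 0u);
  for (const T& s : samples)
  {
    ++counts[static_cast<std::size_t>(s)];
  }
  return counts;
}
```

Generating test inputs that already satisfy the check (for example, by reducing random values modulo `BINS`) keeps the device results comparable to the host reference.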

* test_device_histogram from 15m -> 35s.
* test_device_run_length_encode from 7m -> 3s.
* test_device_scan tests from <3m -> <4s.

# Conflicts:
#	test/test_device_radix_sort.cu
alliepiper committed Dec 13, 2021
1 parent 57a20e8 commit 8bb6e90
Showing 28 changed files with 901 additions and 2,950 deletions.
3 changes: 0 additions & 3 deletions CMakeLists.txt
@@ -50,9 +50,6 @@ endif()

option(CUB_ENABLE_HEADER_TESTING "Test that all public headers compile." ON)
option(CUB_ENABLE_TESTING "Build CUB testing suite." ON)
option(CUB_ENABLE_THOROUGH_TESTING "Build CUB thorough test variants." ON)
option(CUB_ENABLE_BENCHMARK_TESTING "Build CUB benchmark test variants." ON)
option(CUB_ENABLE_MINIMAL_TESTING "Build CUB minimal test variants." ON)
option(CUB_ENABLE_EXAMPLES "Build CUB examples." ON)
# This is needed for NVCXX QA, which requires a static set of executable names.
# Only a single dialect may be enabled when this is off.
6 changes: 0 additions & 6 deletions CONTRIBUTING.md
@@ -43,12 +43,6 @@ The configuration options for CUB are:
- Whether to test compile public headers. Default is `ON`.
- `CUB_ENABLE_TESTING={ON, OFF}`
- Whether to build unit tests. Default is `ON`.
- `CUB_ENABLE_THOROUGH_TESTS={ON, OFF}`
- Whether to build the thorough test variants.
- `CUB_ENABLE_BENCHMARK_TESTS={ON, OFF}`
- Whether to build the benchmark test variants.
- `CUB_ENABLE_MINIMAL_TESTS={ON, OFF}`
- Whether to build the minimal test variants.
- `CUB_ENABLE_EXAMPLES={ON, OFF}`
- Whether to build examples. Default is `ON`.
- `CUB_ENABLE_DIALECT_CPPXX={ON, OFF}`
5 changes: 5 additions & 0 deletions cub/block/block_histogram.cuh
@@ -99,6 +99,10 @@ enum BlockHistogramAlgorithm
* \par Overview
* - A <a href="http://en.wikipedia.org/wiki/Histogram"><em>histogram</em></a>
* counts the number of observations that fall into each of the disjoint categories (known as <em>bins</em>).
* - The `T` type must be implicitly castable to an integer type.
* - BlockHistogram expects each integral `input[i]` value to satisfy
* `0 <= input[i] < BINS`. Values outside of this range result in undefined
* behavior.
* - BlockHistogram can be optionally specialized to use different algorithms:
* -# <b>cub::BLOCK_HISTO_SORT</b>. Sorting followed by differentiation. [More...](\ref cub::BlockHistogramAlgorithm)
* -# <b>cub::BLOCK_HISTO_ATOMIC</b>. Use atomic addition to update byte counts directly. [More...](\ref cub::BlockHistogramAlgorithm)
@@ -136,6 +140,7 @@ enum BlockHistogramAlgorithm
* \endcode
*
* \par Performance and Usage Considerations
* - All input values must fall between [0, BINS), or behavior is undefined.
* - The histogram output can be constructed in shared or device-accessible memory
* - See cub::BlockHistogramAlgorithm for performance details regarding algorithmic alternatives
*
19 changes: 19 additions & 0 deletions cub/iterator/tex_ref_input_iterator.cuh
@@ -76,6 +76,11 @@ struct CUB_DEPRECATED IteratorTexRef
CUB_HOST_COMPILER == CUB_HOST_COMPILER_CLANG
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif

#ifdef __NVCC_DIAG_PRAGMA_SUPPORT__
#pragma nv_diagnostic push
#pragma nv_diag_suppress 1215
#endif

/// And by unique ID
@@ -141,6 +146,10 @@ template <int UNIQUE_ID>
typename IteratorTexRef<T>::template TexId<UNIQUE_ID>::TexRef IteratorTexRef<T>::template TexId<UNIQUE_ID>::ref = 0;

// Re-enable deprecation warnings:
#ifdef __NVCC_DIAG_PRAGMA_SUPPORT__
#pragma nv_diagnostic pop
#endif

#if CUB_HOST_COMPILER == CUB_HOST_COMPILER_MSVC
#pragma warning(default:4996)
#elif CUB_HOST_COMPILER == CUB_HOST_COMPILER_GCC || \
@@ -233,6 +242,11 @@ class CUB_DEPRECATED TexRefInputIterator
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif

#ifdef __NVCC_DIAG_PRAGMA_SUPPORT__
#pragma nv_diagnostic push
#pragma nv_diag_suppress 1215
#endif

public:

// Required iterator traits
@@ -399,6 +413,11 @@ public:
}

// Re-enable deprecation warnings:

#ifdef __NVCC_DIAG_PRAGMA_SUPPORT__
#pragma nv_diagnostic pop
#endif

#if CUB_HOST_COMPILER == CUB_HOST_COMPILER_MSVC
#pragma warning(default:4996)
#elif CUB_HOST_COMPILER == CUB_HOST_COMPILER_GCC || \
166 changes: 105 additions & 61 deletions test/CMakeLists.txt
@@ -1,3 +1,65 @@
# The function below reads the filepath `src`, extracts the %PARAM% comments,
# and fills `labels_var` with a list of `label1_value1.label2_value2...`
# strings, and puts the corresponding `DEFINITION=value1:DEFINITION=value2`
# entries into `defs_var`.
#
# See the README.md file in this directory for background info.
function(cub_get_test_params src labels_var defs_var)
file(READ "${src}" file_data)
string(REGEX MATCHALL
"//[ ]+%PARAM%[ ]+([^ ]+)[ ]+([^ ]+)[ ]+([^\n]*)"
matches
"${file_data}"
)

set(variant_labels)
set(variant_defs)

foreach(match IN LISTS matches)
string(REGEX MATCH
"//[ ]+%PARAM%[ ]+([^ ]+)[ ]+([^ ]+)[ ]+([^\n]*)"
unused
"${match}"
)
set(def ${CMAKE_MATCH_1})
set(label ${CMAKE_MATCH_2})
set(values "${CMAKE_MATCH_3}")
string(REPLACE ":" ";" values "${values}")

if (NOT variant_labels)
foreach(value IN LISTS values)
list(APPEND variant_labels ${label}_${value})
endforeach()
else()
set(tmp_labels)
foreach(old_label IN LISTS variant_labels)
foreach(value IN LISTS values)
list(APPEND tmp_labels ${old_label}.${label}_${value})
endforeach()
endforeach()
set(variant_labels "${tmp_labels}")
endif()

if (NOT variant_defs)
foreach(value IN LISTS values)
list(APPEND variant_defs ${def}=${value})
endforeach()
else()
set(tmp_defs)
foreach(old_def IN LISTS variant_defs)
foreach(value IN LISTS values)
list(APPEND tmp_defs ${old_def}:${def}=${value})
endforeach()
endforeach()
set(variant_defs "${tmp_defs}")
endif()

endforeach()

set(${labels_var} "${variant_labels}" PARENT_SCOPE)
set(${defs_var} "${variant_defs}" PARENT_SCOPE)
endfunction()

# Create meta targets that build all tests for a single configuration:
foreach(cub_target IN LISTS CUB_TARGETS)
cub_get_target_property(config_prefix ${cub_target} PREFIX)
@@ -58,80 +120,62 @@ function(cub_add_test target_name_var test_name test_src cub_target)
)
endfunction()

# Sets HAS_BENCHMARK_VARIANT / HAS_MINIMAL_VARIANT / NO_VARIANTS to True/False in
# the calling scope.
# Used to detect variants of unit tests depending on whether a source file
# contains the strings "CUB_TEST_BENCHMARK" or "CUB_TEST_MINIMAL".
function(cub_check_for_test_variants src)
file(READ "${src}" data)

string(FIND "${data}" "CUB_TEST_BENCHMARK" benchmark_loc)
set(HAS_BENCHMARK_VARIANT False PARENT_SCOPE)
if (NOT benchmark_loc EQUAL -1)
set(HAS_BENCHMARK_VARIANT True PARENT_SCOPE)
endif()

string(FIND "${data}" "CUB_TEST_MINIMAL" minimal_loc)
set(HAS_MINIMAL_VARIANT False PARENT_SCOPE)
if (NOT minimal_loc EQUAL -1)
set(HAS_MINIMAL_VARIANT True PARENT_SCOPE)
endif()

set(NO_VARIANTS False PARENT_SCOPE)
if (NOT (HAS_BENCHMARK_VARIANT OR HAS_MINIMAL_VARIANT))
set(NO_VARIANTS True PARENT_SCOPE)
endif()
endfunction()

foreach (test_src IN LISTS test_srcs)
# TODO: Per-test flags.

get_filename_component(test_name "${test_src}" NAME_WE)
string(REGEX REPLACE "^test_" "" test_name "${test_name}")

# Some tests change behavior based on whether the compiler defs BENCHMARK
# and/or MINIMAL_TEST are defined. Detect these and build variants for each
# configuration:
cub_check_for_test_variants("${test_src}")
cub_get_test_params("${test_src}" variant_labels variant_defs)
list(LENGTH variant_labels num_variants)

# Subtract 1 to support the inclusive endpoint of foreach(...RANGE...):
math(EXPR range_end "${num_variants} - 1")

# Verbose output:
if (NOT num_variants EQUAL 0)
message(VERBOSE "Detected ${num_variants} variants of test '${test_src}':")
foreach(var_idx RANGE ${range_end})
math(EXPR i "${var_idx} + 1")
list(GET variant_labels ${var_idx} label)
list(GET variant_defs ${var_idx} defs)
message(VERBOSE " ${i}: ${test_name} ${label} ${defs}")
endforeach()
endif()

foreach(cub_target IN LISTS CUB_TARGETS)
if (NO_VARIANTS)
cub_get_target_property(config_prefix ${cub_target} PREFIX)

if (num_variants EQUAL 0)
# Only one version of this test.
cub_add_test(test_target ${test_name} "${test_src}" ${cub_target})
else()
# Multiple test variants requested, so we need to give the targets and
# binaries suffixes.
# Meta target to build all parametrizations of the current test for the
# current CUB_TARGET config
set(variant_meta_target ${config_prefix}.test.${test_name}.all)
add_custom_target(${variant_meta_target})

# Meta target to build all parametrizations of the current test for all
# CUB_TARGET configs
set(cub_variant_meta_target cub.all.test.${test_name}.all)
if (NOT TARGET ${cub_variant_meta_target})
add_custom_target(${cub_variant_meta_target})
endif()

# Generate multiple tests, one per variant.
# See `cub_get_test_params` for details.
foreach(var_idx RANGE ${range_end})
list(GET variant_labels ${var_idx} label)
list(GET variant_defs ${var_idx} defs)
string(REPLACE ":" ";" defs "${defs}")

if (CUB_ENABLE_THOROUGH_TESTING)
cub_add_test(test_target_thorough
${test_name}.thorough
cub_add_test(test_target
${test_name}.${label}
"${test_src}"
${cub_target}
)
target_compile_definitions(${test_target_thorough} PRIVATE CUB_TEST_THOROUGH)
endif()

if (CUB_ENABLE_BENCHMARK_TESTING)
if (HAS_BENCHMARK_VARIANT)
cub_add_test(test_target_benchmark
${test_name}.benchmark
"${test_src}"
${cub_target}
)
target_compile_definitions(${test_target_benchmark} PRIVATE CUB_TEST_BENCHMARK)
endif()
endif()

if (CUB_ENABLE_MINIMAL_TESTING)
if (HAS_MINIMAL_VARIANT)
cub_add_test(test_target_minimal
${test_name}.minimal
"${test_src}"
${cub_target}
)
target_compile_definitions(${test_target_minimal} PRIVATE CUB_TEST_MINIMAL)
endif()
endif()
add_dependencies(${variant_meta_target} ${test_target})
add_dependencies(${cub_variant_meta_target} ${test_target})
target_compile_definitions(${test_target} PRIVATE ${defs})
endforeach()
endif()
endforeach()
endforeach()
113 changes: 113 additions & 0 deletions test/README.md
@@ -0,0 +1,113 @@
# Test Parametrization

Some of CUB's tests are very slow to build and are capable of exhausting RAM
during compilation/linking. To avoid such issues, large tests are split into
multiple executables to take advantage of parallel computation and reduce memory
usage.

CUB facilitates this by checking for special `%PARAM%` comments in each test's
source code, and then uses this information to generate multiple executables
with different configurations.

## Using `%PARAM%`

The `%PARAM%` hint provides an automated method of generating multiple test
executables from a single source file. To use it, add one or more special
comments to the test source file:

```cpp
// %PARAM% [definition] [label] [values]
```

CMake will parse the source file and extract these comments, using them to
generate multiple test executables for the full cartesian product of values.

- `definition` will be used as a preprocessor definition name. By convention,
these begin with `TEST_`.
- `label` is a short, human-readable label that will be used in the test
executable's name to identify the test variant.
- `values` is a colon-separated list of values used during test generation. Only
numeric values have been tested.

## Example

For example, if `test_baz.cu` contains the following lines:

```cpp
// %PARAM% TEST_FOO foo 0:1:2
// %PARAM% TEST_BAR bar 4:8
```

Six executables and CTest targets will be generated with unique definitions
(only c++17 targets shown):

| Executable Name | Preprocessor Definitions |
|----------------------------------|-----------------------------|
| `cub.cpp17.test.baz.foo_0.bar_4` | `-DTEST_FOO=0 -DTEST_BAR=4` |
| `cub.cpp17.test.baz.foo_0.bar_8` | `-DTEST_FOO=0 -DTEST_BAR=8` |
| `cub.cpp17.test.baz.foo_1.bar_4` | `-DTEST_FOO=1 -DTEST_BAR=4` |
| `cub.cpp17.test.baz.foo_1.bar_8` | `-DTEST_FOO=1 -DTEST_BAR=8` |
| `cub.cpp17.test.baz.foo_2.bar_4` | `-DTEST_FOO=2 -DTEST_BAR=4` |
| `cub.cpp17.test.baz.foo_2.bar_8` | `-DTEST_FOO=2 -DTEST_BAR=8` |

## Changing `%PARAM%` Hints

Since CMake does not automatically reconfigure the build when source files are
modified, CMake will need to be rerun manually whenever the `%PARAM%` comments
change.

## Building and Running Split Tests

CMake will generate individual build and test targets for each test variant, and
also provides build "metatargets" that compile all variants of a given test.

The variants follow the usual naming convention for CUB's tests, but include a
suffix that differentiates them (e.g. `.foo_X.bar_Y` in the example above).

### Individual Test Variants

Continuing with the `test_baz.cu` example, the test variant that uses
`-DTEST_FOO=1 -DTEST_BAR=4` can be built and run alone:

```bash
# Build a single variant:
make cub.cpp17.test.baz.foo_1.bar_4

# Run a single variant
bin/cub.cpp17.test.baz.foo_1.bar_4

# Run a single variant using CTest regex:
ctest -R cub\.cpp17\.test\.baz\.foo_1\.bar_4
```

### All Variants of a Test

Using a metatarget and the proper regex, all variants of a test can be built and
executed without listing all variants explicitly:

```bash
# Build all variants using the `.all` metatarget
make cub.cpp17.test.baz.all

# Run all variants:
ctest -R cub\.cpp17\.test\.baz\.
```

## Debugging

Running CMake with `--log-level=VERBOSE` will print out extra information about
all detected test variants.

## Additional Info

Ideally, only parameters that directly influence kernel template instantiations
should be split out in this way. If changing a parameter doesn't change the
kernel template type, the same kernel will be compiled into multiple
executables. This defeats the purpose of splitting up the test since the
compiler will generate redundant code across the new split executables.

The best candidate parameters for splitting are input value types, rather than
integral parameters like BLOCK_THREADS, etc. Splitting by value type allows more
infrastructure (data generation, validation) to be reused. Splitting other
parameters can cause build times to increase since type-related infrastructure
has to be rebuilt for each test variant.
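
Because only numeric values are known to work in `%PARAM%` comments, splitting by value type needs a small mapping from an integer to a type inside the test source. The snippet below is a hedged sketch of one way to do that: `TEST_VALUE_TYPE`, the label `vt`, the alias `value_t`, and `run_value_type_tests` are all hypothetical names chosen for illustration and do not come from the CUB test suite.

```cpp
// %PARAM% TEST_VALUE_TYPE vt 0:1:2   (hypothetical parameter for this sketch)

#include <cassert>
#include <cstdint>
#include <type_traits>

// Default so the file still compiles when built without the definition.
#ifndef TEST_VALUE_TYPE
#define TEST_VALUE_TYPE 0
#endif

// Map the numeric parameter onto the value type exercised by this variant.
#if TEST_VALUE_TYPE == 0
using value_t = std::uint8_t;
#elif TEST_VALUE_TYPE == 1
using value_t = std::int32_t;
#elif TEST_VALUE_TYPE == 2
using value_t = double;
#else
#error "Unsupported TEST_VALUE_TYPE"
#endif

// Placeholder for the type-dependent test body. A real test would generate
// data, launch kernels, and validate the results for T.
template <typename T>
bool run_value_type_tests()
{
  const T x = T(3);
  return x + T(1) == T(4);
}
```

Each generated executable then instantiates the type-dependent test code for a single `value_t`, so the data generation and validation code for the other types is not compiled into it.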