This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
Issue #399 reported that we were missing several test cases in CUB that were ifdef'd out. This patch enables most of those tests, though CDP tests are not added here. Some other deficiencies were addressed as they were noticed, for instance, adding value_types other than unsigned char to test_block_histogram.

The way we split up tests into "BENCHMARK", "MINIMAL", and "THOROUGH" variants wasn't well suited for regression testing, as a lot of redundant code paths were generated between the various test executables. These have been removed, leaving only the "THOROUGH" tests, which should capture all test cases. Benchmarks should go into the new `thrust_benchmark` project.

Some tests also took an excessively long time to build, especially after enabling the missing test cases from #399. This patch adds a new mechanism that allows a test to include a comment such as:

```
// %PARAM% TEST_FOO foo 0:1:2
// %PARAM% TEST_BAR bar 4:8
```

CMake will parse these out and generate one test executable for each combination of parameters, e.g.:

```
cub.test.baz.foo_0.bar_4 -DTEST_FOO=0 -DTEST_BAR=4
cub.test.baz.foo_0.bar_8 -DTEST_FOO=0 -DTEST_BAR=8
cub.test.baz.foo_1.bar_4 -DTEST_FOO=1 -DTEST_BAR=4
cub.test.baz.foo_1.bar_8 -DTEST_FOO=1 -DTEST_BAR=8
cub.test.baz.foo_2.bar_4 -DTEST_FOO=2 -DTEST_BAR=4
cub.test.baz.foo_2.bar_8 -DTEST_FOO=2 -DTEST_BAR=8
```

This can be used to quickly split up problematically large tests. See the note at the top of cub/test/CMakeLists.txt for more details. The PrintNinjaBuildTimes.cmake file from Thrust was used to identify tests that needed to be split.

Several tests were testing Thrust APIs. This isn't necessary, as Thrust has its own test suite. These tests have been removed.

This isn't needed for regression testing and has been removed. Some of the other command line options could also be removed now that benchmarking isn't handled by these regression tests, but this is a start.

Extended testing revealed that the cub::BlockHistogram algorithm's behavior is undefined when input values are outside of [0, BINS). Added this info to the algorithm docs.

* test_device_histogram from 15m -> 35s.
* test_device_run_length_encode from 7m -> 3s.
* test_device_scan tests from <3m -> <4s.

# Conflicts:
# test/test_device_radix_sort.cu
1 parent 57a20e8, commit 8bb6e90. Showing 28 changed files with 901 additions and 2,950 deletions.
@@ -0,0 +1,113 @@
# Test Parametrization

Some of CUB's tests are very slow to build and can exhaust RAM during
compilation/linking. To avoid such issues, large tests are split into multiple
executables to take advantage of parallel computation and reduce memory usage.

CUB facilitates this by checking for special `%PARAM%` comments in each test's
source code, and then using this information to generate multiple executables
with different configurations.

## Using `%PARAM%`

The `%PARAM%` hint provides an automated method of generating multiple test
executables from a single source file. To use it, add one or more special
comments to the test source file:

```cpp
// %PARAM% [definition] [label] [values]
```

CMake will parse the source file and extract these comments, using them to
generate multiple test executables for the full cartesian product of values.

- `definition` will be used as a preprocessor definition name. By convention,
  these begin with `TEST_`.
- `label` is a short, human-readable label that will be used in the test
  executable's name to identify the test variant.
- `values` is a colon-separated list of values used during test generation. Only
  numeric values have been tested.

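To make the three fields concrete, the following is a minimal C++ sketch of how one hint line could be split into its parts. It only illustrates the comment format described above; the real parsing is done by CMake, and `ParamHint` and `parse_param_hint` are hypothetical names that do not exist in CUB.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type for this sketch: one parsed `%PARAM%` hint.
struct ParamHint {
  std::string definition;           // e.g. "TEST_FOO"
  std::string label;                // e.g. "foo"
  std::vector<std::string> values;  // e.g. {"0", "1", "2"}
};

// Parses a line of the form `// %PARAM% <definition> <label> <v0:v1:...>`.
// Returns false if the line does not look like a hint. This is an
// illustration only; CUB's CMake code may accept or reject other forms.
inline bool parse_param_hint(const std::string& line, ParamHint& out) {
  std::istringstream in(line);
  std::string slashes, marker, value_list;
  ParamHint hint;
  if (!(in >> slashes >> marker >> hint.definition >> hint.label >> value_list)) {
    return false;  // fewer than five whitespace-separated tokens
  }
  if (slashes != "//" || marker != "%PARAM%") {
    return false;  // not a `// %PARAM%` comment
  }
  // Split the colon-separated value list into individual values.
  std::istringstream values(value_list);
  std::string value;
  while (std::getline(values, value, ':')) {
    hint.values.push_back(value);
  }
  out = hint;
  return true;
}
```

Under these assumptions, the line `// %PARAM% TEST_FOO foo 0:1:2` yields the definition `TEST_FOO`, the label `foo`, and three values, while an ordinary comment is rejected.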
## Example

For example, if `test_baz.cu` contains the following lines:

```cpp
// %PARAM% TEST_FOO foo 0:1:2
// %PARAM% TEST_BAR bar 4:8
```

Six executables and CTest targets will be generated with unique definitions
(only C++17 targets shown):

| Executable Name                  | Preprocessor Definitions    |
|----------------------------------|-----------------------------|
| `cub.cpp17.test.baz.foo_0.bar_4` | `-DTEST_FOO=0 -DTEST_BAR=4` |
| `cub.cpp17.test.baz.foo_0.bar_8` | `-DTEST_FOO=0 -DTEST_BAR=8` |
| `cub.cpp17.test.baz.foo_1.bar_4` | `-DTEST_FOO=1 -DTEST_BAR=4` |
| `cub.cpp17.test.baz.foo_1.bar_8` | `-DTEST_FOO=1 -DTEST_BAR=8` |
| `cub.cpp17.test.baz.foo_2.bar_4` | `-DTEST_FOO=2 -DTEST_BAR=4` |
| `cub.cpp17.test.baz.foo_2.bar_8` | `-DTEST_FOO=2 -DTEST_BAR=8` |

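The expansion in the table is a plain cartesian product over the hint values. The C++ sketch below reproduces the naming scheme and definitions from the example. It is not the CMake implementation, and `Param`, `Variant`, and `expand_variants` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; they do not exist in CUB.
struct Param {
  std::string definition;           // preprocessor definition, e.g. "TEST_FOO"
  std::string label;                // short label, e.g. "foo"
  std::vector<std::string> values;  // values from the colon-separated list
};

struct Variant {
  std::string name;         // e.g. "cub.cpp17.test.baz.foo_0.bar_4"
  std::string definitions;  // e.g. "-DTEST_FOO=0 -DTEST_BAR=4"
};

// Builds one variant per combination of parameter values. Parameters are
// applied in order, so the last parameter varies fastest.
inline std::vector<Variant> expand_variants(const std::string& base_name,
                                            const std::vector<Param>& params) {
  std::vector<Variant> variants;
  variants.push_back(Variant{base_name, ""});
  for (const Param& p : params) {
    std::vector<Variant> next;
    for (const Variant& v : variants) {
      for (const std::string& value : p.values) {
        Variant extended = v;
        extended.name += "." + p.label + "_" + value;
        if (!extended.definitions.empty()) {
          extended.definitions += " ";
        }
        extended.definitions += "-D" + p.definition + "=" + value;
        next.push_back(extended);
      }
    }
    variants = std::move(next);
  }
  return variants;
}
```

Calling `expand_variants("cub.cpp17.test.baz", ...)` with the `TEST_FOO` and `TEST_BAR` hints from the example produces six variants in the same order as the table. Because the count is the product of the value-list lengths, each extra hint multiplies the number of executables.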
## Changing `%PARAM%` Hints

Since CMake does not automatically reconfigure the build when source files are
modified, CMake will need to be rerun manually whenever the `%PARAM%` comments
change.

## Building and Running Split Tests

CMake will generate individual build and test targets for each test variant, and
also provides build "metatargets" that compile all variants of a given test.

The variants follow the usual naming convention for CUB's tests, but include a
suffix that differentiates them (e.g. `.foo_X.bar_Y` in the example above).

### Individual Test Variants

Continuing with the `test_baz.cu` example, the test variant that uses
`-DTEST_FOO=1 -DTEST_BAR=4` can be built and run alone:

```bash
# Build a single variant:
make cub.cpp17.test.baz.foo_1.bar_4

# Run a single variant:
bin/cub.cpp17.test.baz.foo_1.bar_4

# Run a single variant using CTest regex (quoted so the shell keeps the backslashes):
ctest -R 'cub\.cpp17\.test\.baz\.foo_1\.bar_4'
```

### All Variants of a Test

Using a metatarget and the proper regex, all variants of a test can be built and
executed without listing all variants explicitly:

```bash
# Build all variants using the `.all` metatarget:
make cub.cpp17.test.baz.all

# Run all variants (quoted so the shell keeps the backslashes):
ctest -R 'cub\.cpp17\.test\.baz\.'
```

## Debugging

Running CMake with `--log-level=VERBOSE` will print out extra information about
all detected test variants.

## Additional Info

Ideally, only parameters that directly influence kernel template instantiations
should be split out in this way. If changing a parameter doesn't change the
kernel template type, the same kernel will be compiled into multiple
executables. This defeats the purpose of splitting up the test since the
compiler will generate redundant code across the new split executables.

The best candidate parameters for splitting are input value types, rather than
integral parameters like BLOCK_THREADS, etc. Splitting by value type allows more
infrastructure (data generation, validation) to be reused. Splitting other
parameters can cause build times to increase since type-related infrastructure
has to be rebuilt for each test variant.
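
As a rough illustration of splitting by value type, the sketch below maps a numeric parameter to a type, since only numeric hint values have been tested. The definition `TEST_VALUE_TYPE`, the label `vt`, and the helpers `value_type_for`, `sum_first_n`, and `run_selected_variant` are all hypothetical; CUB's actual tests may select types differently.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical hint for this sketch:
//   // %PARAM% TEST_VALUE_TYPE vt 0:1:2
// The build system would normally pass -DTEST_VALUE_TYPE=<n>; the fallback
// below only keeps the sketch compilable on its own.
#ifndef TEST_VALUE_TYPE
#define TEST_VALUE_TYPE 0
#endif

// Maps each numeric parameter value to one input value type.
template <int Id> struct value_type_for;
template <> struct value_type_for<0> { using type = unsigned char; };
template <> struct value_type_for<1> { using type = int; };
template <> struct value_type_for<2> { using type = double; };

// The single value type exercised by this variant of the test.
using test_value_t = value_type_for<TEST_VALUE_TYPE>::type;

// Stand-in for type-dependent test infrastructure (data generation,
// validation). Each executable instantiates it for one type only.
template <typename T>
T sum_first_n(int n) {
  T total = T();
  for (int i = 1; i <= n; ++i) {
    total += static_cast<T>(i);
  }
  return total;
}

// Runs the check for the selected value type: 1 + 2 + 3 + 4 == 10.
inline bool run_selected_variant() {
  return sum_first_n<test_value_t>(4) == static_cast<test_value_t>(10);
}
```

With this layout, each variant instantiates the type-dependent helpers once, for its own value type, rather than repeating them for every type in every executable.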