-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve testing infrastructure. #871
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -91,3 +91,134 @@ release. Once this is done perform a final round of updates: | |
* Update BlueBrain Spack recipe to use the archive and not the Git commit. | ||
* Update the upstream Spack recipe. | ||
|
||
## Writing Tests | ||
### Generate Multi-Dimensional Test Data | ||
Input array of any dimension and type can be generated using the template class | ||
`DataGenerator`. For example: | ||
``` | ||
auto dims = std::vector<size_t>{4, 2}; | ||
auto values = testing::DataGenerator<std::vector<std::array<double, 2>>::create(dims); | ||
``` | ||
Generates an `std::vector<std::array<double, 2>>` initialized with suitable | ||
values. | ||
|
||
If "suitable" isn't specific enough, one can specify a callback: | ||
``` | ||
auto callback = [](const std::vector<size_t>& indices) { | ||
return 42.0; | ||
} | ||
|
||
auto values = testing::DataGenerator<std::vector<double>>::create(dims, callback); | ||
``` | ||
|
||
The `dims` can be generated via `testing::DataGenerator::default_dims` or by | ||
using `testing::DataGenerator::sanitize_dims`. Remember, that certain | ||
containers are fixed size and that we often compute the number of elements by | ||
multiplying the dims. | ||
|
||
### Generate Scalar Test Data | ||
To generate a single "suitable" element use template class `DefaultValues`, e.g. | ||
``` | ||
auto default_values = testing::DefaultValues<double>(); | ||
auto x = testing::DefaultValues<double>(indices); | ||
``` | ||
|
||
### Accessing Elements | ||
To access a particular element from an unknown container use the following trait: | ||
``` | ||
using trait = testing::ContainerTraits<std::vector<std::array<int, 2>>; | ||
// auto x = values[1][0]; | ||
auto x = trait::get(values, {1, 0}); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Following the command above:
there is also
to read directly from the flat storage. (comment applicable below) |
||
|
||
// values[1][0] = 42.0; | ||
trait::set(values, {1, 0}, 42.0); | ||
``` | ||
|
||
### Utilities For Multi-Dimensional Arrays | ||
Use `testing::DataGenerator::allocate` to allocate an array (without filling | ||
it) and `testing::copy` to copy an array from one type to another. There's | ||
`testing::ravel`, `testing::unravel` and `testing::flat_size` to compute the | ||
position in a flat array from a multi-dimensional index, the reverse and the | ||
number of element in the multi-dimensional array. | ||
|
||
### Deduplicating DataSet and Attribute | ||
Due to how HighFive is written testing `DataSet` and `Attribute` often requires | ||
duplicating the entire test code because somewhere a `createDataSet` must be | ||
replaced with `createAttribute`. Use `testing::AttributeCreateTraits` and | ||
`testing::DataSetCreateTraits`. For example, | ||
``` | ||
template<class CreateTraits> | ||
void check_write(...) { | ||
// Same as one of: | ||
// file.createDataSet(name, values); | ||
// file.createAttribute(name, values); | ||
CreateTraits::create(file, name, values); | ||
} | ||
``` | ||
|
||
### Test Organization | ||
#### Multi-Dimensional Arrays | ||
All tests for reading/writing whole multi-dimensional arrays to datasets or | ||
attributes belong in `tests/unit/tests_high_five_multi_dimensional.cpp`. This | ||
includes write/read cycles; checking all the generic edges cases, e.g. empty | ||
arrays and mismatching sizes; and checking non-reallocation. | ||
|
||
Read/Write cycles are implemented in two distinct checks. One for writing and | ||
another for reading. When checking writing we read with a "trusted" | ||
multi-dimensional array (a nested `std::vector`), and vice-versa when checking | ||
reading. This matters because certain bugs, like writing a column major array | ||
as if it were row-major can't be caught if one reads it back into a | ||
column-major array. | ||
|
||
Remember, `std::vector<bool>` is very different from all other `std::vector`s. | ||
|
||
Every container `template<class T> C;` should at least be checked with all of | ||
the following `T`s that are supported by the container: `bool`, `double`, | ||
`std::string`, `std::vector`, `std::array`. The reason is `bool` and | ||
`std::string` are special, `double` is just a POD, `std::vector` requires | ||
dynamic memory allocation and `std::array` is statically allocated. | ||
|
||
Similarly, each container should be put inside an `std::vector` and an | ||
`std::array`. | ||
|
||
#### Scalar Data Set | ||
Write-read cycles for scalar values should be implemented in | ||
`tests/unit/tests_high_five_scalar.cpp`. | ||
|
||
#### Data Types | ||
Unit-tests related to checking that `DataType` API, go in | ||
`tests/unit/tests_high_data_type.cpp`. | ||
|
||
#### Selections | ||
Anything selection related goes in `tests/unit/test_high_five_selection.cpp`. | ||
This includes things like `ElementSet` and `HyperSlab`. | ||
|
||
#### Strings | ||
Regular write-read cycles for strings are performed along with the other types, | ||
see above. This should cover compatibility of `std::string` with all | ||
containers. However, additional testing is required, e.g. character set, | ||
padding, fixed vs. variable length. These all go in | ||
`tests/unit/test_high_five_string.cpp`. | ||
|
||
#### Specific Tests For Optional Containers | ||
If containers, e.g. `Eigen::Matrix` require special checks those go in files | ||
called `tests/unit/test_high_five_*.cpp` where `*` is `eigen` for Eigen. | ||
|
||
#### Memory Layout Assumptions | ||
In HighFive we make assumptions about the memory layout of certain types. For | ||
example, we assume that | ||
``` | ||
auto array = std::vector<std::array<double, 2>>(n); | ||
doube * ptr = (double*) array.data(); | ||
``` | ||
is a sensible thing to do. We assume similar about `bool` and | ||
`details::Boolean`. These types of tests go into | ||
`tests/unit/tests_high_five_memory_layout.cpp`. | ||
|
||
#### H5Easy | ||
Anything `H5Easy` related goes in files with the appropriate name. | ||
|
||
#### Everything Else | ||
What's left goes in `tests/unit/test_high_five_base.cpp`. This covers opening | ||
files, groups, dataset or attributes; checking certain pathological edge cases; | ||
etc. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
#pragma once | ||
|
||
namespace HighFive { | ||
namespace testing { | ||
|
||
/// \brief Trait for `createAttribute`. | ||
/// | ||
/// The point of these is to simplify testing. The typical issue is that we | ||
/// need to write the tests twice, one with `createDataSet` and then again with | ||
/// `createAttribute`. This trait allows us to inject this difference. | ||
struct AttributeCreateTraits { | ||
using type = Attribute; | ||
|
||
template <class Hi5> | ||
static Attribute get(Hi5& hi5, const std::string& name) { | ||
return hi5.getAttribute(name); | ||
} | ||
|
||
|
||
template <class Hi5, class Container> | ||
static Attribute create(Hi5& hi5, const std::string& name, const Container& container) { | ||
return hi5.createAttribute(name, container); | ||
} | ||
|
||
template <class Hi5> | ||
static Attribute create(Hi5& hi5, | ||
const std::string& name, | ||
const DataSpace& dataspace, | ||
const DataType& datatype) { | ||
return hi5.createAttribute(name, dataspace, datatype); | ||
} | ||
|
||
template <class T, class Hi5> | ||
static Attribute create(Hi5& hi5, const std::string& name, const DataSpace& dataspace) { | ||
auto datatype = create_datatype<T>(); | ||
return hi5.template createAttribute<T>(name, dataspace); | ||
} | ||
}; | ||
|
||
/// \brief Trait for `createDataSet`. | ||
struct DataSetCreateTraits { | ||
using type = DataSet; | ||
|
||
template <class Hi5> | ||
static DataSet get(Hi5& hi5, const std::string& name) { | ||
return hi5.getDataSet(name); | ||
} | ||
|
||
template <class Hi5, class Container> | ||
static DataSet create(Hi5& hi5, const std::string& name, const Container& container) { | ||
return hi5.createDataSet(name, container); | ||
} | ||
|
||
template <class Hi5> | ||
static DataSet create(Hi5& hi5, | ||
const std::string& name, | ||
const DataSpace& dataspace, | ||
const DataType& datatype) { | ||
return hi5.createDataSet(name, dataspace, datatype); | ||
} | ||
|
||
template <class T, class Hi5> | ||
static DataSet create(Hi5& hi5, const std::string& name, const DataSpace& dataspace) { | ||
auto datatype = create_datatype<T>(); | ||
return hi5.template createDataSet<T>(name, dataspace); | ||
} | ||
}; | ||
|
||
} // namespace testing | ||
} // namespace HighFive |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use a library for this! For example, you can use the NumPy equivalent xtensor. Writing and maintaining some extra custom code is extra work and error-prone. I would say that testable data are as simple as
Then you can use functions like
ravel
,ravel_index
, ... just like in NumPy. No new API, no new code.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to test the following:
Hence, I need to able to create arrays of all supported types
and so forth, and be able to fill them with exactly the correct values. Being able to have any container filled with sensible values, requires being able to a) allocate containers of a given shape, b) assign values. Those two feature make up a good chunk of the PR. Then there very minor niceties like copying, etc.
The second issue is that I need a somewhat uniform API for creating the product of
template<class T> Container
with a bunch of different scalar typesT
. That's the other large chunk of the PR.How does an
xt::array
help in this situation? Yes, I can test XTensor, but I can't test writing of any of the other containers. If I use XTensor arrays to write, I can read the other containers, but I still can't check that the values are correct, because I don't have access to some API unifying trait to hide the difference ofa[i][j]
,a(i,j)
etc. (gets worse in 3D).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a pressing reason to use a custom implementation, of course, it's fine. Anyway, this is not a public implementation, nor will change a lot.
I just thought it would be easier to write some casters but avoid implementing (un)ravel(index) functions and equality operators.