[REVIEW] New suballocator memory resources #162

Closed
wants to merge 100 commits

Conversation

@harrism (Member) commented Oct 17, 2019

This PR introduces a few new suballocator memory_resources.

  • pool_memory_resource is a coalescing pool suballocator memory resource. Its behavior is effectively the same as cnmem_memory_resource, except that it is not yet thread safe and it uses STL containers rather than custom linked lists.
  • fixed_size_memory_resource allocates fixed-size blocks from a pre-allocated list (which can grow as blocks are used). It provides constant-time allocation and deallocation for requests that fit within its fixed block size.
  • fixed_multisize_memory_resource allocates blocks from a specified range of fixed power-of-two block sizes (e.g., the default range is 256 KiB to 4 MiB).
  • hybrid_memory_resource serves arbitrary-size allocations given two suballocators and a byte-size threshold that determines which one to use. Typically this means using a fixed_multisize_memory_resource below the threshold (say 4 MiB) and a sub_memory_resource above it. (A minimal sketch follows this list.)
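As a rough illustration of the threshold dispatch described in the last bullet, consider the sketch below. This is not the PR's actual interface; the template parameters and member names (small_mr_, large_mr_, threshold_) are illustrative assumptions.

#include <cstddef>
#include <cuda_runtime_api.h>

// Illustrative sketch only: route each allocation to one of two upstream
// suballocators based on a byte-size threshold.
template <typename SmallMR, typename LargeMR>
class hybrid_sketch {
 public:
  hybrid_sketch(SmallMR* small_mr, LargeMR* large_mr, std::size_t threshold)
    : small_mr_{small_mr}, large_mr_{large_mr}, threshold_{threshold} {}

  void* allocate(std::size_t bytes, cudaStream_t stream) {
    // Small requests hit the fixed-size lists; large ones hit the pool.
    return (bytes <= threshold_) ? small_mr_->allocate(bytes, stream)
                                 : large_mr_->allocate(bytes, stream);
  }

  void deallocate(void* p, std::size_t bytes, cudaStream_t stream) {
    // The request size determines which suballocator owns the pointer.
    if (bytes <= threshold_)
      small_mr_->deallocate(p, bytes, stream);
    else
      large_mr_->deallocate(p, bytes, stream);
  }

 private:
  SmallMR* small_mr_;
  LargeMR* large_mr_;
  std::size_t threshold_;
};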


Tests for all of the above are added, and the tests have been consolidated a bit so they run faster.

This PR also makes the RANDOM_ALLOCATIONS google benchmark more flexible so it can be run for a requested allocator and a given number of allocations and maximum allocation size.

Future PRs:

  • Smarter growth heuristics
  • Use events to support per-thread default stream
  • Thread safety (do it as a wrapper memory resource type).
  • Multi-device? May not be needed since the device can be set before creating a resource. However, we should probably check the current device when allocating from CUDA to ensure it's the same as the device the MR was created on...

@harrism harrism marked this pull request as ready for review February 27, 2020 05:48
@harrism harrism added 3 - Ready for review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Feb 27, 2020
@harrism harrism requested a review from jrhemstad February 27, 2020 05:48
@harrism harrism changed the title [DRAFT] New suballocator memory resource [REVIEW] New suballocator memory resource Feb 27, 2020
@jrhemstad (Contributor)

> Multi-device? May not be needed since the device can be set before creating a resource. However, we should probably check the current device when allocating from CUDA to ensure it's the same as the device the MR was created on...

Regarding multi-device, I think we can do something like what Thrust does with maintaining a different resource per device. See:

https://github.com/thrust/thrust/blob/63d847beab9931978d9c894afd7432a6f94abd46/thrust/system/cuda/detail/per_device_resource.h#L53
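For a rough idea of that pattern, a minimal sketch of keeping one resource per device follows. This is an assumption-laden illustration, not Thrust's or RMM's actual machinery; it assumes Resource is default-constructible.

#include <map>
#include <mutex>
#include <cuda_runtime_api.h>

// Sketch: look up (and lazily create) one resource instance per CUDA device.
template <typename Resource>
Resource* get_per_device_resource() {
  static std::mutex mtx;
  static std::map<int, Resource> resources;  // one instance per device id
  int device{0};
  cudaGetDevice(&device);
  std::lock_guard<std::mutex> lock{mtx};
  return &resources[device];  // default-constructed on first use per device
}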

@harrism harrism changed the title [REVIEW] New suballocator memory resource [REVIEW] New suballocator memory resources Mar 4, 2020
}

template <>
MRTest<hybrid_mr>::~MRTest()
Contributor

There may be some missing dtor specializations.

Consider instead using a vector<unique_ptr<device_memory_resource>> in the test fixture and just push upstream resources onto the vector. That way the upstreams will be automatically destroyed when the fixture is destroyed.
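Something like this sketch of the suggested fixture (the struct name is illustrative):

#include <memory>
#include <vector>

struct MRTest_sketch {
  // Owning the upstream resources here means they are destroyed with the
  // fixture, so no per-specialization destructors are needed.
  // In C++11, push with upstreams.emplace_back(new SomeResource{...})
  // since std::make_unique is C++14-only.
  std::vector<std::unique_ptr<rmm::mr::device_memory_resource>> upstreams;
};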

Member Author

This is looking difficult without bumping RMM to C++14...

* @param threshold_size Allocations > this size (in bytes) use large_mr. Allocations <= this size
* use small_mr. Must be a power of two.
*/
explicit hybrid_memory_resource(
Contributor

Suggested change
explicit hybrid_memory_resource(
hybrid_memory_resource(

Member Author

Done.

*/
inline block merge_blocks(block const& a, block const& b)
{
RMM_EXPECTS(a.ptr + a.size == b.ptr, "Merging noncontiguous blocks");
Contributor

This function probably shouldn't throw. Use asserts instead.
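For example, with the precondition asserted rather than thrown (the block type is the PR's free-list block; aggregate initialization from {ptr, size} is assumed here):

#include <cassert>

inline block merge_blocks(block const& a, block const& b)
{
  // Precondition, checked only in debug builds: b must start where a ends.
  assert(a.ptr + a.size == b.ptr && "Merging noncontiguous blocks");
  return block{a.ptr, a.size + b.size};
}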

Member Author

Done.

*/
template< class InputIt >
void insert( InputIt first, InputIt last ) {
for (auto iter = first; iter != last; ++iter) {
Contributor

std::for_each.
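e.g. (the loop body is elided in the snippet above, so the per-element call to a single-block insert overload below is an assumption):

#include <algorithm>

template <class InputIt>
void insert(InputIt first, InputIt last)
{
  // Insert each block in [first, last) via the single-element overload.
  std::for_each(first, last, [this](block const& blk) { this->insert(blk); });
}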

Member Author

Done.

* @tparam UpstreamResource memory_resource to use for allocating the pool. Implements
* rmm::mr::device_memory_resource interface.
*/
template <typename UpstreamResource>
Contributor

Just for consistency.

Suggested change
template <typename UpstreamResource>
template <typename Upstream>

Member Author

Done in all 3.

cudaGetDevice(&device);
int memsize{0};
cudaDeviceProp props;
cudaGetDeviceProperties(&props, device);
Contributor

Suggested change
cudaGetDeviceProperties(&props, device);
RMM_CUDA_TRY(cudaGetDeviceProperties(&props, device));

std::set<block> allocated_blocks_;

// blocks allocated from upstream heap: so they can be easily freed
std::map<char*, block> upstream_blocks_;
Contributor

This could just be a vector?

Contributor

Also, the upstream blocks could just be device_buffers. That would simplify their lifetime management.

Member Author

Vector: done
device_buffer: no -- decided against that as discussed, due to all the extra data it stores.
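i.e. the tracking member becomes something like:

// Upstream allocations only need to be walked when releasing everything,
// so ordered lookup by pointer (the old std::map key) isn't required.
std::vector<block> upstream_blocks_;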

// Try to find a block in the same stream
auto iter = stream_blocks_.find(stream);
if (iter != stream_blocks_.end())
b = block_from_sync_list(iter->second, size, stream, stream);
Contributor

Just returning instead of assigning to b would simplify logic quite a bit in this function.

Suggested change
b = block_from_sync_list(iter->second, size, stream, stream);
return block_from_sync_list(iter->second, size, stream, stream);
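Sketched against the context above, the early-return shape looks like this (the enclosing function name and the fall-through paths are assumed/elided):

block available_block(std::size_t size, cudaStream_t stream)
{
  // Try to find a block in the same stream first.
  auto iter = stream_blocks_.find(stream);
  if (iter != stream_blocks_.end())
    return block_from_sync_list(iter->second, size, stream, stream);

  // ... otherwise fall through to other streams / upstream allocation ...
  return block{};
}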

Member Author

Done. Much clearer, and faster!

* @brief Free all memory allocated from the upstream memory_resource.
*
*/
void free_all()
Contributor

It'd be more consistent with the std::pmr resources to call this release, see https://en.cppreference.com/w/cpp/memory/synchronized_pool_resource/release

Member Author

Done.

* @return block A pointer to memory of size `get_block_size()`.
*/
void* get_block(cudaStream_t stream) {
void* p = nullptr;
Contributor

This is another place where just returning instead of assigning to p would simplify logic a bit.

Member Author

Done.

@harrism (Member Author)

harrism commented Mar 6, 2020

I pushed changes addressing the code review in this PR to PR #314, which includes all code from this PR plus some new changes, so I'm closing this one. Please review #314 instead.

@harrism harrism closed this Mar 6, 2020