[FEA] Deprecate and remove rmmGetInfo #305

Closed · jrhemstad opened this issue Feb 24, 2020 · 10 comments
Labels: feature request, proposal

Comments

@jrhemstad (Contributor)

Is your feature request related to a problem? Please describe.

RMM currently provides the function rmmGetInfo to return the amounts of free and total device memory.

Since the refactor to use device_memory_resources (see #301), rmmGetInfo simply calls get_mem_info on the default resource:

rmm::mr::get_default_resource()->get_mem_info(stream);

The interface defined by std::pmr::memory_resource does not provide a way to query this information, so we were required to add a non-standard get_mem_info pure virtual function to device_memory_resource. This forces every resource implementation to provide an implementation of get_mem_info.

This is problematic because for some resource implementations it is difficult or impossible to provide this information, leaving their get_mem_info implementation no choice but to throw a "not supported" exception. For example, the Thrust pool is based on the std::pmr::memory_resource interface and provides no mechanism for implementing get_mem_info, so it can only throw.
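
A rough sketch of the shape of the problem (class and member names here are illustrative, not RMM's exact declarations): the pure virtual in the base class forces every derived resource to answer the free/total question, even when it has nothing meaningful to report.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>

// Illustrative base class: the non-standard pure virtual must be overridden
// by every derived resource.
class device_memory_resource {
 public:
  virtual ~device_memory_resource() = default;
  // Returns {free, total} bytes; not part of std::pmr::memory_resource.
  virtual std::pair<std::size_t, std::size_t> get_mem_info() const = 0;
};

// A resource built on an interface with no notion of free/total memory
// (e.g. one wrapping std::pmr::memory_resource) can only throw.
class opaque_upstream_resource final : public device_memory_resource {
 public:
  std::pair<std::size_t, std::size_t> get_mem_info() const override {
    throw std::runtime_error{"get_mem_info is not supported by this resource"};
  }
};
```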

Describe the solution you'd like

Eliminate the rmmGetInfo API along with the get_mem_info pure virtual function in device_memory_resource.

Note that this does not mean get_mem_info needs to be eliminated altogether. Some device_memory_resources can implement get_mem_info, e.g., cuda_memory_resource or cnmem_memory_resource. We can easily keep get_mem_info as a member of those derived classes.

What I am suggesting is removing it from the device_memory_resource base class and thus eliminating the need for all derived types to implement it. There is no reason we need to remove it from the derived device_memory_resource types that can support get_mem_info.

However, this does mean you can't call get_mem_info through the device_memory_resource base class pointer. You will first need to cast to the appropriate derived type (which requires knowing what kind of resource is pointed to through the device_memory_resource*) to access its specific member functions.
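
A hypothetical caller-side sketch of what that looks like; the header paths, the choice of cuda_memory_resource as the concrete type, and the exact get_mem_info signature (it took a stream argument) are all assumptions:

```cpp
#include <cstddef>
#include <utility>
// Header paths assumed; adjust to the RMM version in use.
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/device_memory_resource.hpp>

// With get_mem_info gone from the base class, a caller that needs the info
// must downcast to a concrete resource type it expects.
std::pair<std::size_t, std::size_t> query_mem_info(rmm::mr::device_memory_resource* mr)
{
  if (auto* cuda_mr = dynamic_cast<rmm::mr::cuda_memory_resource*>(mr)) {
    return cuda_mr->get_mem_info(nullptr);  // signature assumed; nullptr = default stream
  }
  return {0, 0};  // this resource cannot (or does not) report free/total memory
}
```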

Describe alternatives you've considered

Alternatively, we can keep get_mem_info, and as we continue to add resources, most of them will just throw an exception. This doesn't seem very helpful.

@jrhemstad added the feature request and proposal labels on Feb 24, 2020
@jlowe (Contributor) commented Feb 27, 2020

From the Spark perspective the Java bindings do not expose this, so nothing will immediately break there. We were thinking of using this during asynchronous memory spill to get an idea of how much needs to spill and when to stop spilling. However, we can approximate that by wrapping the memory resource and tracking the allocation total ourselves. This approach won't account for any padding added by the wrapped allocator, but it should be close enough for a ballpark figure to use for async spill thresholds.

@jrhemstad (Contributor, Author) commented Feb 27, 2020

> However, we can approximate that by wrapping the memory resource and tracking the allocation total ourselves.

We can probably provide an opt-in resource adaptor that simply keeps a running sum of all allocate calls and subtracts for any deallocate calls.

It would be documented that this doesn't guarantee accurate information about the actual availability of memory in the upstream resource due to fragmentation, padding, etc.

This could also be useful for memory leak detection.
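
A freestanding sketch of such an adaptor (names are illustrative; a real RMM adaptor would derive from device_memory_resource and override its do_allocate/do_deallocate hooks instead of exposing allocate/deallocate directly):

```cpp
#include <atomic>
#include <cstddef>

// Wraps an upstream resource and keeps a running sum of outstanding bytes.
template <typename Upstream>
class tracking_adaptor {
 public:
  explicit tracking_adaptor(Upstream* upstream) : upstream_{upstream} {}

  void* allocate(std::size_t bytes) {
    void* p = upstream_->allocate(bytes);
    outstanding_ += bytes;
    return p;
  }

  void deallocate(void* p, std::size_t bytes) {
    upstream_->deallocate(p, bytes);
    outstanding_ -= bytes;
  }

  // Allocated-but-not-freed bytes. This is NOT the upstream's free memory:
  // padding, fragmentation, and other consumers of device memory are
  // invisible here. A non-zero value at shutdown also hints at a leak.
  std::size_t outstanding_bytes() const { return outstanding_.load(); }

 private:
  Upstream* upstream_;
  std::atomic<std::size_t> outstanding_{0};
};
```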

@gmarkall (Contributor)

This is OK from the perspective of adding support for external memory management plugins to Numba - it is not essential for Numba to know how much total and free memory is available, and the spec of NBEP 7 can be modified to make the get_memory_info method implementation optional.

@jakirkham (Member)

cc @pentschev (as RMM memory info is used in Dask-CUDA tests here and here)

@teju85 (Member) commented Feb 28, 2020

I totally get your reasoning behind removing rmmGetInfo().

However, I want to subjectively point out a downside: removing it makes it impossible for implementations that want to rely on such info to dynamically configure their behavior (more workspace = fewer iterations, and vice versa). Those will have to fall back to sub-optimal static configurations, which can certainly cost performance. That said, it is currently hard to estimate the performance hit, as it is very use-case dependent.

In such cases, we could even get away with an isAllocationPossible()-like API and then do some sort of binary search to converge on the right size.

@jrhemstad (Contributor, Author) commented Feb 28, 2020

> I totally get your reasoning behind removing rmmGetInfo().
>
> However, I want to subjectively point out a downside: removing it makes it impossible for implementations that want to rely on such info to dynamically configure their behavior (more workspace = fewer iterations, and vice versa). Those will have to fall back to sub-optimal static configurations, which can certainly cost performance. That said, it is currently hard to estimate the performance hit, as it is very use-case dependent.

To be clear, certain device_memory_resources can still easily provide the same functionality as rmmGetInfo. However, it is not reasonable to expect all possible implementations to support it. So as a user, you are free to choose a device_memory_resource that supports get_mem_info and use it accordingly.

What's being discussed here is dropping the requirement/expectation that all resources support this functionality.

Furthermore, to achieve what you described, you can easily use a rudimentary allocation tracker adaptor like I described here: #305 (comment)

> In such cases, we could even get away with an isAllocationPossible()-like API and then do some sort of binary search to converge on the right size.

If an allocation size isn't feasible, then an rmm::bad_alloc exception will be thrown. You could still easily do the binary search by catching the exception and trying a smaller allocation size until you find one that succeeds.
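
A minimal sketch of that search; the resource type is templated so nothing RMM-specific is pinned down, and the include path for rmm::bad_alloc is an assumption:

```cpp
#include <cstddef>
#include <rmm/detail/error.hpp>  // path assumed; provides rmm::bad_alloc

// Finds the largest size in [0, upper_bound] that the resource can allocate,
// by probing with real allocations and catching rmm::bad_alloc on failure.
template <typename Resource>
std::size_t largest_allocatable(Resource& mr, std::size_t upper_bound)
{
  std::size_t lo = 0;            // largest size known to succeed so far
  std::size_t hi = upper_bound;  // optimistic upper limit
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo + 1) / 2;  // round up so the loop terminates
    try {
      void* p = mr.allocate(mid);  // throws rmm::bad_alloc if mid bytes are infeasible
      mr.deallocate(p, mid);
      lo = mid;                    // mid worked; search larger sizes
    } catch (rmm::bad_alloc const&) {
      hi = mid - 1;                // mid failed; search smaller sizes
    }
  }
  return lo;
}
```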

@pentschev (Member)

We only use it in dask-cuda for a test, confirming that we set up an RMM pool and checking that its size matches. Would that still be possible once this is removed? It doesn't matter if the code to do that becomes more complex in dask-cuda, as it's only a test anyway.

@jrhemstad (Contributor, Author)

> We only use it in dask-cuda for a test, confirming that we set up an RMM pool and checking that its size matches. Would that still be possible once this is removed?

For the CNMEM pool, yes. But there are other resource implementations where it wouldn't be possible.

@pentschev (Member)

> For the CNMEM pool, yes. But there are other resource implementations where it wouldn't be possible.

Alright, that seems good enough, as it's only a check anyway. Thanks for clarifying; from the dask-cuda side we are good with the deprecation/removal.

@jrhemstad (Contributor, Author)

Closed by #396
