Make buffer manager aware of current device memory usage #179
psalz
commented
May 30, 2023
- Limit the amount of device memory that can be used (currently set to 95%)
- Detect out-of-memory errors and print a helpful error message listing all currently allocated buffers and their sizes
- As it turns out, DPC++ (on Level Zero at least) swaps GPU memory out to system memory and simply continues working (albeit slowly)
- If resizing a large buffer would exceed device memory, attempt to go through the host first
- If accessing part of a large buffer would result in a resize that exceeds device memory, attempt to spill currently unused parts to the host
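
For illustration, the fallback order described in the list above could be sketched roughly as follows. This is not the code from this PR: all names (`device_memory_state`, `resize_strategy`, `choose_resize_strategy`, `accessed_size_bytes`) are hypothetical, and the sketch assumes that a direct resize keeps the old and new allocation resident at the same time, while going through the host frees the old allocation first.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch; none of these names are taken from the PR itself.
enum class resize_strategy { device_direct, via_host, spill_unused_to_host, out_of_memory };

struct device_memory_state {
    std::size_t total_bytes; // total device memory
    std::size_t used_bytes;  // currently allocated, including the buffer being resized
};

// Usage limit as stated in the PR description (95% of device memory).
constexpr std::size_t max_usage_percent = 95;

// Returns true if allocating extra_bytes stays within the limit, optionally
// assuming that assume_bytes_freed bytes are released beforehand.
inline bool fits(const device_memory_state& s, std::size_t extra_bytes, std::size_t assume_bytes_freed = 0) {
    assert(assume_bytes_freed <= s.used_bytes);
    const std::size_t limit_bytes = s.total_bytes / 100 * max_usage_percent;
    return s.used_bytes - assume_bytes_freed + extra_bytes <= limit_bytes;
}

inline resize_strategy choose_resize_strategy(const device_memory_state& s, std::size_t old_size_bytes,
    std::size_t new_size_bytes, std::size_t accessed_size_bytes) {
    // 1. Resize on the device: old and new allocation coexist during the copy.
    if(fits(s, new_size_bytes)) return resize_strategy::device_direct;
    // 2. Go through the host: the old allocation is copied out and freed first.
    if(fits(s, new_size_bytes, old_size_bytes)) return resize_strategy::via_host;
    // 3. Spill: keep only the accessed part on the device, move the rest to the host.
    if(fits(s, accessed_size_bytes, old_size_bytes)) return resize_strategy::spill_unused_to_host;
    // 4. Nothing fits: the caller reports an out-of-memory error.
    return resize_strategy::out_of_memory;
}
```

With a 1000-byte device (limit 950 bytes), 600 bytes in use, and a 300-byte buffer growing to 500 bytes, the direct path would need 1100 bytes, so the sketch falls back to `via_host`, which peaks at 800 bytes.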
clang-tidy made some suggestions
Some minor remarks from my side, but looks good overall.
Another naming pet peeve of mine: I would prefer if the code explicitly talked about raw allocation sizes and alignments in terms of bytes, e.g. `size_bytes`, `assume_bytes_freed`, to highlight that it doesn't refer to element counts (or some sort of allocation granularity like pages).
LGTM
clang-tidy made some suggestions
LGTM!
eb2756b to b4548d5
- Limit amount of device memory that can be used (currently set to 95%)
- Detect out-of-memory errors and print helpful error message listing all currently allocated buffers and their sizes
- If resizing large buffer would exceed device memory, attempt to go through host first
- If accessing part of large buffer would result in resize that exceeds device memory, attempt to spill currently unused parts to the host
b4548d5 to 3838aff