Make buffer manager aware of current device memory usage #179

Merged
merged 1 commit into from
Jun 22, 2023

@psalz psalz commented May 30, 2023

  • Limit the amount of device memory that can be used (currently set to 95%)
  • Detect out-of-memory errors and print a helpful error message listing all currently allocated buffers and their sizes
    • As it turns out, DPC++ (on Level Zero at least) swaps GPU memory out to system memory and just continues working (albeit slowly)
  • If resizing a large buffer would exceed device memory, attempt to go through the host first
  • If accessing part of a large buffer would result in a resize that exceeds device memory, attempt to spill currently unused parts to the host
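
For readers who want the fallback order spelled out, here is a minimal sketch of how the points above could fit together. It is not the code from this PR: every name (`device_memory_model`, `choose_strategy`, `describe_allocations`, `merged_bytes`, `requested_bytes`) is hypothetical, and it assumes that a direct resize briefly needs the old and new allocation on the device at once, that going through the host frees the old allocation first, and that spilling keeps only the requested region on the device.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative model only: all names are hypothetical, not the real buffer manager API.
enum class resize_strategy { direct, via_host, spill_unused, out_of_memory };

struct allocation_info {
    std::string name;
    std::size_t size_bytes;
};

struct device_memory_model {
    std::size_t total_bytes = 0;
    std::size_t max_usage_percent = 95; // the 95% limit from the description
    std::vector<allocation_info> allocations;

    std::size_t budget_bytes() const { return total_bytes / 100 * max_usage_percent; }

    std::size_t used_bytes() const {
        std::size_t sum = 0;
        for(const auto& a : allocations) sum += a.size_bytes;
        return sum;
    }

    // Would an extra allocation fit if `assume_bytes_freed` were released first?
    bool fits(std::size_t additional_bytes, std::size_t assume_bytes_freed = 0) const {
        return used_bytes() + additional_bytes <= budget_bytes() + assume_bytes_freed;
    }
};

// old_bytes:       size of the existing device allocation of the buffer
// merged_bytes:    size of one allocation covering old contents plus the new region
// requested_bytes: size of only the region the current access needs
resize_strategy choose_strategy(const device_memory_model& mem, std::size_t old_bytes,
    std::size_t merged_bytes, std::size_t requested_bytes) {
    // Old and new allocation coexist on the device during the copy.
    if(mem.fits(merged_bytes)) return resize_strategy::direct;
    // Old contents are staged on the host and freed before allocating.
    if(mem.fits(merged_bytes, old_bytes)) return resize_strategy::via_host;
    // Only the requested region stays on the device, the rest moves to the host.
    if(mem.fits(requested_bytes, old_bytes)) return resize_strategy::spill_unused;
    return resize_strategy::out_of_memory;
}

// Builds an error message that lists every tracked allocation and its size.
std::string describe_allocations(const device_memory_model& mem) {
    std::string msg = "Out of device memory. Current allocations (" + std::to_string(mem.used_bytes())
        + " of " + std::to_string(mem.budget_bytes()) + " usable bytes):\n";
    for(const auto& a : mem.allocations) {
        msg += "  " + a.name + ": " + std::to_string(a.size_bytes) + " bytes\n";
    }
    return msg;
}
```

A caller would try `choose_strategy` before allocating and only raise an error carrying `describe_allocations(mem)` when the result is `out_of_memory`. Tracking usage explicitly matters here because, per the note above, the runtime may swap to system memory rather than report a failure.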

@psalz psalz requested review from PeterTh and fknorr May 30, 2023 14:19

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

include/buffer_manager.h (outdated, resolved)
include/device_queue.h (resolved)
include/device_queue.h (outdated, resolved)
src/buffer_manager.cc (outdated, resolved)
test/buffer_manager_tests.cc (outdated, resolved)

@fknorr fknorr left a comment


Some minor remarks from my side, but looks good overall.

Another naming pet peeve of mine: I would prefer if the code explicitly talked about raw allocation sizes and alignments in terms of bytes (e.g. size_bytes, assume_bytes_freed), to highlight that they don't refer to element counts (or some sort of allocation granularity like pages).
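
To illustrate the naming suggestion, here is a small sketch in which every raw size or alignment carries a `_bytes` suffix while element counts do not. Only `size_bytes` and `assume_bytes_freed` come from the comment above; the struct and the helper functions are hypothetical and not part of the reviewed code.

```cpp
#include <cstddef>

// Hypothetical types and helpers that only illustrate the naming convention.
struct allocation_request {
    std::size_t element_count;      // number of elements, not bytes
    std::size_t element_size_bytes; // size of one element in bytes

    constexpr std::size_t size_bytes() const { return element_count * element_size_bytes; }
};

// Rounds a byte size up to the next multiple of an alignment given in bytes.
constexpr std::size_t align_up_bytes(std::size_t size_bytes, std::size_t alignment_bytes) {
    return (size_bytes + alignment_bytes - 1) / alignment_bytes * alignment_bytes;
}

// Checks a byte budget, optionally assuming some bytes are freed beforehand.
constexpr bool fits_in_limit(std::size_t used_bytes, std::size_t limit_bytes,
    std::size_t size_bytes, std::size_t assume_bytes_freed = 0) {
    return used_bytes + size_bytes <= limit_bytes + assume_bytes_freed;
}
```

With the suffix in place, passing `element_count` where `size_bytes` is expected stands out at the call site.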

include/buffer_manager.h (outdated, resolved)
include/buffer_storage.h (outdated, resolved)
include/device_queue.h (resolved)
src/buffer_manager.cc (outdated, resolved)
test/buffer_manager_tests.cc (outdated, resolved)

@PeterTh PeterTh left a comment


LGTM

src/buffer_manager.cc (outdated, resolved)
include/device_queue.h (outdated, resolved)
@psalz psalz added this to the 0.4.0 milestone Jun 13, 2023

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

include/buffer_storage.h (outdated, resolved)
test/buffer_manager_tests.cc (outdated, resolved)
test/buffer_manager_tests.cc (outdated, resolved)
test/buffer_manager_tests.cc (outdated, resolved)
test/buffer_manager_tests.cc (outdated, resolved)
include/buffer_storage.h (outdated, resolved)

@fknorr fknorr left a comment


LGTM!

include/buffer_manager.h (resolved)
include/buffer_storage.h (outdated, resolved)
@psalz psalz force-pushed the bm-memory-awareness branch from eb2756b to b4548d5 Compare June 21, 2023 21:30
@psalz psalz force-pushed the bm-memory-awareness branch from b4548d5 to 3838aff Compare June 22, 2023 10:37
@psalz psalz merged commit 79f97c2 into master Jun 22, 2023
@psalz psalz deleted the bm-memory-awareness branch June 22, 2023 12:23
4 participants