Nice approach on DL dev scenario #12
Comments
@pokerfaceSad Hi, thanks for the feedback! For the overhead of UVM in and of itself (i.e., when an app runs alone on the system), you can take a look at chapter 11.3 of my diploma thesis [1].

The overhead of UVM swapping when the GPU lock changes hands, which happens every TQ seconds assuming more than one app wants to run GPU work, depends on the PCIe bandwidth and the working set size of the application.

Simple example: Let's assume a GPU has 32 GB/s of PCIe bandwidth and the application that just got the GPU lock uses 32 GB of data. The UVM swapping overhead is then around 32 GB / (32 GB/s) = 1 second. You can measure the actual PCIe bandwidth of a GPU with a tool such as the `bandwidthTest` utility from the CUDA samples.

[1] https://dspace.lib.ntua.gr/xmlui/handle/123456789/54290
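If you want to gauge this cost on a concrete machine, here is a minimal sketch (my own illustration, not nvshare code) that times UVM migration directly: it populates a managed buffer on the host, then times the first GPU-side touch, which pulls the pages across PCIe. The 1 GiB buffer and the trivial kernel are arbitrary choices; the measured rate usually lands below peak PCIe bandwidth because it includes page-fault handling.

```cuda
// Rough UVM migration-cost probe (illustrative sketch, not part of nvshare).
// On Pascal-or-newer GPUs, managed pages migrate on demand at first access,
// so timing the first GPU-side touch approximates one swap-in over PCIe.
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

__global__ void touch(char *buf, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        buf[i] = 1; /* first GPU access faults the page onto the device */
}

int main(void)
{
    const size_t n = 1ULL << 30; /* 1 GiB working set (arbitrary choice) */
    char *buf = NULL;

    cudaMallocManaged(&buf, n);
    memset(buf, 1, n); /* make the pages resident on the host first */

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    touch<<<(unsigned)((n + 255) / 256), 256>>>(buf, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    /* Effective host-to-device migration rate; expect it to sit below peak
       PCIe bandwidth, since it includes page-fault handling overhead. */
    printf("migrated %zu MiB in %.1f ms (%.2f GB/s)\n",
           n >> 20, ms, (n / 1e9) / (ms / 1e3));

    cudaFree(buf);
    return 0;
}
```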
Thanks for your detailed reply! Any ideas about GPU migration? I see it in your Future Improvements. It seems possible to achieve it with UVM, according to https://dl.acm.org/doi/10.1145/3357223.3362714.
I haven't looked at migration thoroughly yet. (Though a prerequisite for that is ...) Are you perhaps interested in taking a look? If you want to talk about something in private, you can send me an e-mail :)
Sorry for the late reply. I have sent you an email :)
I think nvshare is a nice approach for the DL development scenario!
Has there been any testing of the overhead introduced by UVM swapping in training scenarios?
BTW, I have posted a solution that addresses long GPU idle times in dev scenarios by dynamically mounting the GPU:
https://github.com/pokerfaceSad/GPUMounter