-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix multi-threading issue with using real Vulkan handles #1235
Fix multi-threading issue with using real Vulkan handles #1235
Conversation
CI gfxreconstruct build queued with queue ID 28867. |
CI gfxreconstruct build # 3114 running. |
CI gfxreconstruct build # 3114 passed. |
I see - you changed the copyright in the generator in order to update the one updated file and so it also changed in 40 other files... I understand why that would be necessary. |
Good catch noticing the race condition.
Since you will need to use the new mutex in |
I'll try it out, will keep you updated. Thanks. |
Yes, I did meet the compilation errors due to cyclic dependencies with the change of moving class ScopedDestroyLock to capture_manager.h ( 1357ec0 ). Unfortunately, I didn't find a workaround. |
I think I prefer 1357ec0 better. |
Thank you @mizhen for fixing it. However, I think it's not enough if the Create protection is only in I have a small sample for the protection. Further, maybe we could give an individual mutex to every kind of handle, so |
Thank you for your review and providing the sample. #1189 is the issue show up in the target title, I think the key of the issue is because driver return previously deleted handle (driver reuse destroyed resource handle). The issue only happens when driver return previously deletes handle and before GFXR delete the wrapper of that handle. So the pull request makes deleting Vulkan object and its related wrapper become an atomic operation for wrapper creation. For the case in the comment which is related to VulkanCaptureManager::OverrideCreateBuffer, the behaviour of the target game and GFXR is as the following:
[GFXR capture the call]:
[Target game]:
''' Therefore before the control return to target game, it cannot use the returned buffer handle or delete the buffer. In another word, if other thread of the game gets the control and delete a buffer, that will be some other buffer. For deleting other buffer during the creation handling of this buffer, it doesn't cause the multi-threading issue in the pull request. During debugging/testing the pull request, also compared the performance with different lock scope, it shows different lock scope heavily affect the loading process in the running of target title, so the pull request tries to select a minimum scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. It's impossible to happen that the buffer gets destroyed during creating. The PR looks good to me. Please rebase it, and it's ready to merge.
close: #1189
Ok, will do, thanks. |
With default GFXR setting, if target title destroy some Vulkan handles in one thread (A) and create same type of handle in another thread (B). At the time when is after real handles were destroyed and before their wrappers were deleted from map in thread A, if thread B got CPU to create same type of handles and if anyone of the new created handles was same value of previously destroyed handles, capturing will fail because no correct handle rapper for the new created handle. The commit fix this multi-threading issue.
Update copyright info for all changed files. Please note that the change in gencode.py is only for adding additional copyright info to auto generated files. Change-Id: I211e4433da32682804d6138aa8c0820feb1ce7e6
Changed the description of the mutex and related functions which are used in the commit to address review comments.
Change locking mechanism of the mutex in the commit to RAII.
Add class ScopedDestroyLock to optimize the source code for lock/unlock the corresponding mutex;
Move ScopedDestroyLock class in its own dedicated source file.
00853fa
to
e8a3582
Compare
CI gfxreconstruct build queued with queue ID 113868. |
CI gfxreconstruct build # 3656 running. |
Done. |
CI gfxreconstruct build # 3656 failed. |
CI gfxreconstruct build queued with queue ID 113946. |
FYI, your LunarG CI build #3656 seems to have failed due to an unrelated GPU crash on one machine (passed on all others). I've restarted the run. |
CI gfxreconstruct build # 3658 running. |
CI gfxreconstruct build # 3658 passed. |
The problem
With default GFXR setting, sometimes an app will destroy some Vulkan handles in one thread (A) and create same type of Vulkan handle in another thread (B). There is a gap of time in between when the real handle is destroyed, and when its wrappers were deleted from map in thread A. If during this time period, thread B was able to run, and creates same type of handles, and if any of the newly-created handles had the same value of those destroyed by thread A, capturing will crash because handle rapper handling error for the new created handle.
The solution
The fix fixed the issue by acquiring shared lock for all wrapped handle creation and acquiring exclusive lock for range from destroying real Vulkan handle to end of deleting related wrapper.
Result
Tested target title full trace and trim trace capture/playback with the build of the pull request, all tests passed, the commit fixed the issue.