-
Notifications
You must be signed in to change notification settings - Fork 441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/RMA/FLUSH: Dynamic Selection of Strong vs. Weak Fence #10474
Open
michal-shalev
wants to merge
21
commits into
openucx:master
Choose a base branch
from
michal-shalev:dynamic-selection-of-strong-vs-weak-fence
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 1 commit
Commits
Show all changes
21 commits
Select commit
Hold shift + click to select a range
7d123db
UCP/RMA/FLUSH: Add unflushed_lanes to ucp_ep
michal-shalev 74f5e73
UCP/RMA/FLUSH: Add sequence numbers to EP and worker structs
michal-shalev 8d7bf47
UCP/RMA/FLUSH: Delete auto fence mode and change default to strong
michal-shalev 4b5fbc7
UCP/RMA/FLUSH: Add EP-based fence mode
michal-shalev c8d8e9f
UCP/RMA/FLUSH: Implement per-EP weak and strong fences
michal-shalev 57e8752
UCP/RMA/FLUSH: Handle fence during one-sided operations
michal-shalev a33f74d
UCP/RMA/FLUSH: Move unflushed_lanes from the union
michal-shalev 2d459bd
UCP/RMA/FLUSH: Move fence_seq from the union
michal-shalev 6160121
UCP/RMA/FLUSH: Add correctness test
michal-shalev c306d1f
UCP/RMA/FLUSH: PR fixes
michal-shalev d139b75
UCP/RMA/FLUSH: PR fixes 2.0
michal-shalev 22aef72
UCP/RMA/FLUSH: PR fixes 3.0
michal-shalev 471cfbe
UCP/RMA/FLUSH: PR fixes 4.0
michal-shalev 0101b84
UCP/RMA/FLUSH: PR fixes 5.0
michal-shalev df2b83f
UCP/RMA/FLUSH: PR fixes 6.0
michal-shalev 74d5cdb
UCP/RMA/FLUSH: PR fixes 7.0
michal-shalev 09eee92
UCP/RMA/FLUSH: PR fixes 8.0
michal-shalev 9126795
UCP/RMA/FLUSH: PR fixes 9.0
michal-shalev c28dd5a
UCP/RMA/FLUSH: PR fixes 10.0
michal-shalev 1cfc7d8
UCP/RMA/FLUSH: PR fixes 11.0
michal-shalev 1b09153
UCP/RMA/FLUSH: PR fixes 12.0
michal-shalev File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does the assembly compare? seems more complicated IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used Godbolt to check:
This shows that
update_with_branch
includes a conditional branch (je .L1
),while
update_branchless
avoids branching entirely by using bitwise operations.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but actually we have 2 flags, can you check in godbolt something like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated my original comment @yosefe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, maybe we could update unflushed_lanes unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's the best solution so far, pushed another commit