Use NCCL instead of ray for control-plane communication to remove serialization overhead #2221
Merged
35 commits
7265829 small test (zhuohan123)
20274cc test ray_pg (zhuohan123)
1b73dd7 update ray test (zhuohan123)
0d89354 implement driver worker (zhuohan123)
e0c4c4e broadcast swap info (zhuohan123)
1baf87b Broadcast inputmetadata as well (zhuohan123)
c947fa0 fix bugs (zhuohan123)
761584b fix comments (zhuohan123)
19110fb remove unused files (zhuohan123)
7b05ec6 fix async llm engine (zhuohan123)
5f90351 fix format (zhuohan123)
6f7ea32 Merge branch 'main' into remove-serialization-overhead (zhuohan123)
966e366 [BUGFIX] Fix API server test (zhuohan123)
fe2c29a fix and remove print (zhuohan123)
5557cdb fix test_cache (zhuohan123)
d92b38d Merge branch 'fix-test-api-server' into remove-serialization-overhead (zhuohan123)
c7f6c21 fix api test (zhuohan123)
332d370 [BUGFIX] Fix the path of test prompts (zhuohan123)
9a8c16f Merge branch 'fix-test-prompt-path' into remove-serialization-overhead (zhuohan123)
6ea2a42 fix test_model_runner (zhuohan123)
0434a76 Merge branch 'main' into remove-serialization-overhead (zhuohan123)
95bb1d3 Fix async llm engine (zhuohan123)
de4c8d2 [BUGFIX] Fix communication test (zhuohan123)
89d7cfd Merge branch 'fix-comm-test-2' into remove-serialization-overhead (zhuohan123)
2b4863a style (zhuohan123)
3096c56 Fix smaller review comments (zhuohan123)
dc4a4c2 fix (zhuohan123)
f2b8e88 remove unused files (zhuohan123)
83c2735 fix review comments (zhuohan123)
3d3a547 allgather -> gather (zhuohan123)
680c8d9 fix (zhuohan123)
5280a61 fix and revert unnecessary changes (zhuohan123)
03b2734 fix (zhuohan123)
0ca5e07 fix (zhuohan123)
ddb0795 fix review comments (zhuohan123)
As a minor optimization, you could move this into a separate loop so that the Workers are initialized in a non-blocking fashion. But since nothing substantial happens in `__init__`, I think it's OK to leave it as is (though it is an anti-pattern).
Yeah, and this only happens once, so I don't think it affects performance.
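For reference, here is a minimal sketch of the two-loop pattern suggested in this thread. It is not the PR's code: it uses a standard-library thread pool as a stand-in for Ray, and `init_worker`, `init_blocking`, and `init_non_blocking` are hypothetical names chosen for illustration. The rough Ray analogue would be to collect the object refs first and call `ray.get` on the whole list afterwards, rather than calling `ray.get` inside the creation loop.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def init_worker(rank: int) -> int:
    """Hypothetical stand-in for a remote worker's initialization."""
    time.sleep(0.05)  # pretend initialization takes a little while
    return rank


def init_blocking(pool: ThreadPoolExecutor, num_workers: int) -> list:
    # Anti-pattern: wait for each worker before launching the next one,
    # so the initializations run one after another.
    return [pool.submit(init_worker, rank).result() for rank in range(num_workers)]


def init_non_blocking(pool: ThreadPoolExecutor, num_workers: int) -> list:
    # First loop: launch every initialization without waiting.
    futures = [pool.submit(init_worker, rank) for rank in range(num_workers)]
    # Second loop: collect the results once everything is in flight.
    return [future.result() for future in futures]


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        for fn in (init_blocking, init_non_blocking):
            start = time.perf_counter()
            ranks = fn(pool, 4)
            elapsed = time.perf_counter() - start
            print(f"{fn.__name__}: ranks={ranks} elapsed={elapsed:.2f}s")
```

Both functions return the same result. The difference is only in how long start-up takes, which, as noted above, matters little when initialization is cheap and happens once.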