add moe_router_device_choice_method argument to choose method … #1381

Open
wants to merge 5 commits into
base: main

Conversation

bzantium

@bzantium bzantium commented Feb 6, 2025

…as DeepSeekV2 or DeepSeekV3
resolved: #1378

@bzantium bzantium changed the title add moe_router_topk_limited_devices_method argument to choose method … add moe_router_device_choice_method argument to choose method … Feb 6, 2025
@bzantium
Author

bzantium commented Feb 6, 2025

Could you review this?
to: @ko3n1g

@yanring
Contributor

yanring commented Feb 6, 2025

Hi @bzantium, thanks for the PR! This is indeed a difference between DeepSeek-V2 and DeepSeek-V3.
We actually have an internal PR for this, along with the support for node-limited routing for V3. It's currently under review and should be merged into the main branch soon.

    group.add_argument('--moe-router-num-groups', type=int, default=None,
                       help='Number of groups to divide experts into for group-limited routing. When using group-limited routing: 1) Experts are divided into equal-sized groups, 2) For each token, a subset of groups is selected based on routing scores (sum of top-2 expert scores within each group), 3) From these selected groups, moe_router_topk experts are chosen. '
                       'Two common use cases: 1) Device-limited routing: Set equal to expert parallel size (EP) to limit each token to experts on a subset of devices (See DeepSeek-V2: https://arxiv.org/pdf/2405.04434) 2) Node-limited routing: Set equal to number of nodes in EP group to limit each token to experts on a subset of nodes (See DeepSeek-V3: https://arxiv.org/pdf/2412.19437)')
    group.add_argument('--moe-router-group-topk', type=int, default=None,
                       help='Number of selected groups for group-limited routing.')
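
For readers following along, the three steps in the help text above can be sketched for a single token in plain Python. This is only an illustration under stated assumptions, not the code from the internal PR: the function name `group_limited_topk` and its parameters are hypothetical, and scores are a plain list rather than a tensor.

```python
def group_limited_topk(scores, num_groups, group_topk, topk):
    """Pick `topk` expert indices for one token, restricted to `group_topk` groups.

    Hypothetical helper for illustration only; names are not from Megatron-LM.
    """
    num_experts = len(scores)
    if num_experts % num_groups != 0:
        raise ValueError("experts must divide evenly into groups")
    group_size = num_experts // num_groups
    if topk > group_topk * group_size:
        raise ValueError("topk exceeds the number of experts in the selected groups")

    # Step 1: experts are divided into equal-sized, contiguous groups.
    # Step 2: each group is scored by the sum of its top-2 expert scores.
    group_scores = []
    for g in range(num_groups):
        members = scores[g * group_size:(g + 1) * group_size]
        group_scores.append(sum(sorted(members, reverse=True)[:2]))

    # Keep the `group_topk` highest-scoring groups.
    selected_groups = sorted(
        range(num_groups), key=lambda g: group_scores[g], reverse=True
    )[:group_topk]

    # Step 3: choose the `topk` best experts among the selected groups only.
    candidates = [
        e
        for g in selected_groups
        for e in range(g * group_size, (g + 1) * group_size)
    ]
    chosen = sorted(candidates, key=lambda e: scores[e], reverse=True)[:topk]
    return sorted(chosen)


# 8 experts in 4 groups of 2; the token may only use experts from 2 groups.
token_scores = [0.9, 0.1, 0.4, 0.5, 0.6, 0.05, 0.2, 0.3]
print(group_limited_topk(token_scores, num_groups=4, group_topk=2, topk=2))
```

In this example the unrestricted top-2 would be experts 0 and 4, but expert 4 sits in a group that was not selected, so the restricted choice falls back to expert 3.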

Feel free to reach out if you have any questions.

@bzantium
Author

bzantium commented Feb 6, 2025

Hi @yanring, thanks for the fast review! Looking forward to having it merged soon :)


Successfully merging this pull request may close these issues.

[ENHANCEMENT] add options how to choose topk devices for device_limited_topk