[ENHANCEMENT] add options how to choose topk devices for `device_limited_topk` #1378

bzantium · 2025-02-06T02:10:53Z

Is your feature request related to a problem? Please describe.
The original DeepSeek-V3 implementation scores each group by the sum of its top-2 expert scores, rather than by the max, when choosing the top-k groups:

    if self.topk_method == "noaux_tc":
        scores_for_choice = scores.view(bsz * seq_len, -1) + self.e_score_correction_bias.unsqueeze(0)
        group_scores = (
            scores_for_choice.view(bsz * seq_len, self.n_group, -1).topk(2, dim=-1)[0].sum(dim=-1)
        )  # [bsz * seq_len, n_group]

In Megatron-LM, however, the group score is only the max expert score within each group:

    num_group = (
        parallel_state.get_expert_model_parallel_world_size()
    )  # num_group equals to expert parallel size
    group_scores = scores.view(num_tokens, num_group, -1).max(dim=-1).values
    group_idx = torch.topk(group_scores, k=moe_router_topk_limited_devices, dim=-1, sorted=False)[1]

However, the DeepSeek-V2 technical report and implementation use max, so the solution would be an option that lets users choose between the two methods.
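To make the difference concrete, here is a minimal sketch of the two scoring methods side by side. The function names `group_scores` and `select_groups` and the method strings `"max"` and `"top2_sum"` are hypothetical and chosen only for illustration; they are not the Megatron-LM API or the names used in the linked PR.

```python
import torch


def group_scores(scores: torch.Tensor, num_groups: int, method: str = "max") -> torch.Tensor:
    """Score each expert group per token.

    scores: [num_tokens, num_experts], with num_experts divisible by num_groups.
    method: "max" or "top2_sum" (hypothetical option names).
    """
    num_tokens = scores.size(0)
    grouped = scores.view(num_tokens, num_groups, -1)
    if method == "max":
        # Best single expert in each group.
        return grouped.max(dim=-1).values
    if method == "top2_sum":
        # Sum of the two best experts in each group.
        return grouped.topk(2, dim=-1).values.sum(dim=-1)
    raise ValueError(f"unknown method: {method}")


def select_groups(scores: torch.Tensor, num_groups: int, k: int, method: str = "max") -> torch.Tensor:
    """Return the indices of the k highest-scoring groups for each token."""
    return torch.topk(group_scores(scores, num_groups, method), k=k, dim=-1, sorted=False).indices


# One token, four experts, two groups: [0.9, 0.0] and [0.5, 0.5].
scores = torch.tensor([[0.9, 0.0, 0.5, 0.5]])
max_choice = select_groups(scores, num_groups=2, k=1, method="max")
sum_choice = select_groups(scores, num_groups=2, k=1, method="top2_sum")
```

In this toy input the two methods disagree: max prefers the group containing the single strongest expert, while the top-2 sum prefers the group whose two best experts are jointly stronger.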


bzantium changed the title ~~[ENHANCEMENT] use sum instead of max for device_limited_topk~~ [ENHANCEMENT] add options how to choose topk devices for device_limited_topk Feb 6, 2025

bzantium linked a pull request Feb 6, 2025 that will close this issue

add moe_router_device_choice_method argument to choose method … #1381

Open
