
[misc] fix: gradient accumulation in seq balance and modify default vllm log level #141

Merged
merged 10 commits into main from gm/fix_grad on Jan 27, 2025

Conversation

PeterSH6 (Collaborator)

  • The previous gradient accumulation value was computed from micro_batch_size, which is wrong when dynamic_bsz is used (see the sketch after this list).
  • Fix the CI script so it no longer overlooks this issue.
  • Change the vLLM stats log default value to True so that the log is disabled.
  • Check self.config.actor.ppo_mini_batch_size % self.config.actor.ppo_micro_batch_size_per_gpu == 0 after normalization in fsdp_workers instead of in dp_actor and dp_critic.
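The intuition, as a minimal self-contained sketch (illustrative numbers and names, not verl's actual code): with dynamic_bsz the micro-batches produced by sequence balancing contain varying numbers of samples, so dividing each micro-batch loss by mini_batch_size // micro_batch_size no longer reconstructs the mini-batch mean, while weighting each micro-batch by its share of the samples does.

```python
# Minimal sketch (not verl's implementation): why the gradient-accumulation factor
# must not be derived from micro_batch_size under dynamic (token-balanced) batching.
# All values and names below are illustrative.

mini_batch_losses = [0.5, 1.0, 2.0, 4.0, 0.25, 0.75, 3.0, 1.5]   # per-sample losses
mini_batch_size = len(mini_batch_losses)                          # 8
micro_batch_size_per_gpu = 4                                      # static config value

# Sequence balancing splits the mini-batch by a token budget, so chunk sizes vary:
dynamic_chunks = [mini_batch_losses[0:3], mini_batch_losses[3:5], mini_batch_losses[5:8]]

target = sum(mini_batch_losses) / mini_batch_size                  # true mini-batch mean

# Wrong: scale each micro-batch mean by 1 / (mini // micro); assumes equal-sized chunks.
grad_accum = mini_batch_size // micro_batch_size_per_gpu           # 2
wrong = sum((sum(c) / len(c)) / grad_accum for c in dynamic_chunks)

# Right: weight each micro-batch by its share of the mini-batch samples.
right = sum((sum(c) / len(c)) * (len(c) / mini_batch_size) for c in dynamic_chunks)

print(f"target={target:.4f}  wrong={wrong:.4f}  right={right:.4f}")
# target=1.6250  wrong=2.5208  right=1.6250
```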

@PeterSH6 PeterSH6 requested a review from vermouth1992 January 27, 2025 06:12
@@ -125,6 +125,7 @@ def __init__(self, config: DictConfig, role: str):
self.config.actor.ppo_micro_batch_size //= (self.device_mesh.shape[0] // self.ulysses_sequence_parallel_size)
self.config.actor.ppo_micro_batch_size_per_gpu = self.config.actor.ppo_micro_batch_size
assert self.config.actor.ppo_mini_batch_size % self.config.actor.ppo_micro_batch_size_per_gpu == 0
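For context, a worked numerical example of the normalization in this hunk (illustrative values; that ppo_mini_batch_size was normalized the same way earlier in __init__ is an assumption here):

```python
# Illustrative numbers for the normalization shown above; not taken from the PR.
device_mesh_size = 8                                          # self.device_mesh.shape[0]
ulysses_sequence_parallel_size = 2
dp_size = device_mesh_size // ulysses_sequence_parallel_size  # 4 data-parallel ranks

ppo_micro_batch_size = 16                                     # global value from the config
ppo_micro_batch_size //= dp_size                              # 4 per GPU
ppo_micro_batch_size_per_gpu = ppo_micro_batch_size

ppo_mini_batch_size = 32 // dp_size                           # assumed normalized: global 32 -> 8
assert ppo_mini_batch_size % ppo_micro_batch_size_per_gpu == 0   # 8 % 4 == 0, passes
```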
Collaborator

We also have to enforce self.config.actor.ppo_mini_batch_size >= n_gpus * self.config.actor.ppo_micro_batch_size_per_gpu

PeterSH6 (Collaborator, Author)

Is it necessary? The mini_batch_size here is already normalized, so if self.config.actor.ppo_mini_batch_size < n_gpus * self.config.actor.ppo_micro_batch_size_per_gpu, the assertion above will already fail because the remainder will not be 0.
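A quick numeric check of this reasoning (illustrative sketch, assuming ulysses_sequence_parallel_size == 1 so the DP size equals n_gpus):

```python
# Illustrative: if the global mini-batch is smaller than n_gpus * micro_per_gpu,
# the existing divisibility assertion already fails after normalization.
n_gpus = 8
ppo_micro_batch_size_per_gpu = 4

global_ppo_mini_batch_size = 16                              # < n_gpus * 4 == 32
ppo_mini_batch_size = global_ppo_mini_batch_size // n_gpus   # 2 after normalization

assert ppo_mini_batch_size % ppo_micro_batch_size_per_gpu == 0   # 2 % 4 == 2, raises
```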

@vermouth1992 vermouth1992 merged commit 695bdbb into main Jan 27, 2025
10 checks passed
@vermouth1992 vermouth1992 deleted the gm/fix_grad branch January 27, 2025 13:44
Chendong98 pushed a commit to Chendong98/verl that referenced this pull request Feb 4, 2025
[misc] fix: gradient accumulation in seq balance and modify default vllm log level (volcengine#141)

- Previous gradient accumulation value is computed by micro_batch_size, which is wrong when using dynamic_bsz
- Fix ci script to avoid overlooking this issue
- Change vLLM stats log default value to True to disable the log
- We will check the `self.config.actor.ppo_mini_batch_size % self.config.actor.ppo_micro_batch_size_per_gpu == 0` after normalization in fsdp_workers instead of in dp_actor and dp_critic
as12138 pushed a commit to as12138/verl that referenced this pull request Feb 20, 2025
[misc] fix: gradient accumulation in seq balance and modify default vllm log level (volcengine#141)