Hi, I am currently studying the Megatron framework. I noticed that with bfloat16, Megatron requires gradient accumulation and all-reduce to be done in fp32, so gradients are communicated in fp32 format.
With fp16, on the other hand, gradient accumulation and all-reduce can be done in fp16, so gradients are communicated in fp16 format.
I would like to understand the specific reasons behind these two different approaches.
See lines 159-160 of the megatron/arguments.py file.
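For context, here is a minimal sketch of what I understand the precision concern to be (this is my own illustration using plain PyTorch, not Megatron code): fp16 stores 10 mantissa bits while bf16 stores only 7, so when many small gradient-sized values are accumulated directly in bf16, additions that are tiny relative to the running sum get rounded away much earlier than in fp16 or fp32.

```python
import torch

# Illustration only (not Megatron code): accumulate 1000 increments of 1e-3
# in different dtypes. The exact sum is 1.0. Once the running sum's ULP
# exceeds twice the increment, further additions are rounded away.
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    acc = torch.zeros((), dtype=dtype)
    step = torch.tensor(1e-3, dtype=dtype)
    for _ in range(1000):
        acc += step  # accumulation happens in `dtype`
    print(f"{str(dtype):>15}: {acc.item():.4f}")

# On a typical run (illustrative):
#   float32 ends very close to 1.0,
#   float16 ends close to 1.0 with small rounding error,
#   bfloat16 stalls around 0.5, since at that magnitude its spacing
#   (2^-8) is already larger than twice the 1e-3 increment.
```

If this is indeed the motivation, it would explain keeping a separate fp32 copy for accumulation and all-reduce under bf16, but I would appreciate confirmation of the actual reasoning.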