Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adapt CI tests to use compiled_rmsnorm #451

Closed
wants to merge 2 commits into from

Conversation

tianyu-l
Copy link
Contributor

@tianyu-l tianyu-l commented Jul 10, 2024

tianyu-l added a commit that referenced this pull request Jul 10, 2024
ghstack-source-id: 357f6104859f6d2d9372622a8284330502a4db77
Pull Request resolved: #451
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jul 10, 2024
@tianyu-l tianyu-l requested review from wconstab and awgu July 11, 2024 05:10
@@ -54,7 +54,7 @@ def build_test_list():
"--experimental.pipeline_parallel_split_points layers.4",
"--experimental.pipeline_parallel_schedule 1f1b",
"--training.data_parallel_degree 1",
"--model.norm_type rmsnorm", # fused_rmsnorm crashes with PP
"--model.norm_type rmsnorm", # compiled_rmsnorm / fused_rmsnorm crashes with PP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious what is the reason that compiled RMSNorm is incompatible with PP

test_runner.py Outdated
@@ -257,7 +253,7 @@ def build_test_list():
"--experimental.pipeline_parallel_degree 4",
"--experimental.pipeline_parallel_split_points layers.1,layers.2,layers.3,layers.4,layers.5,layers.6,layers.7",
"--experimental.pipeline_parallel_schedule interleaved_1f1b",
"--model.norm_type rmsnorm", # fused_rmsnorm throws cuda context error with pp
"--model.norm_type rmsnorm", # compiled_rmsnorm / fused_rmsnorm throws cuda context error with pp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. is this cuda context error the same reason as above for "crashes with PP"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It seems nobody understands why the error happens at this moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is the same error, maybe we should make the commit uniform?

tianyu-l added a commit that referenced this pull request Aug 2, 2024
ghstack-source-id: fa38b2261667b182e0b05699d2f04772ec5852d3
Pull Request resolved: #451
@tianyu-l tianyu-l force-pushed the gh/tianyu-l/15/head branch from a437389 to a757a84 Compare August 16, 2024 21:00
tianyu-l added a commit that referenced this pull request Aug 16, 2024
ghstack-source-id: fa38b2261667b182e0b05699d2f04772ec5852d3
Pull Request resolved: #451
@tianyu-l
Copy link
Contributor Author

see updates in #535

@tianyu-l tianyu-l closed this Aug 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants