[PROPOSAL]: Speed improvement of Intra-Op plan generation in ColossalAuto #5436

Closed
1 task done
stephankoe opened this issue Mar 9, 2024 · 0 comments · Fixed by #5446
Labels
enhancement New feature or request

Proposal

Generating an Intra-Op plan with ColossalAuto usually takes 1-2 minutes when running examples/tutorial/auto_parallel/auto_parallel_with_resnet.py. Profiling with cProfile reveals that a large portion of this time is spent in copy.deepcopy, especially in the method DimSpec.build_difference_2d_dict(). Since many DimSpec objects are created [1], that method is called hundreds of thousands of times. On closer examination of its logic, it becomes apparent that the result is in fact independent of the DimSpec instance, and the resulting dict is not mutated during its lifetime. Hence, it suffices to create this dict only once and share it among all instances of DimSpec. Due to the large number of DimSpec instances created during plan generation, this change can yield a speed-up of up to 50% [2].
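
To illustrate the idea, here is a minimal sketch of the before/after pattern. It is not the actual ColossalAuto code: the class names PerInstanceSpec and SharedSpec, the helper _build_difference_table, and the table contents are all hypothetical stand-ins. The only point is that a table which does not depend on the instance can be built once at class level instead of being rebuilt and deep-copied in every constructor.

```python
import copy
from typing import ClassVar, Dict, List, Optional, Tuple

Table = Dict[Tuple[str, str], int]


def _build_difference_table() -> Table:
    """Hypothetical stand-in for an instance-independent lookup table."""
    specs = ["R", "S0", "S1", "S01"]
    # Placeholder values only: 0 for identical specs, 1 otherwise.
    return {(a, b): int(a != b) for a in specs for b in specs}


class PerInstanceSpec:
    """Before: every instance builds and deep-copies its own table."""

    def __init__(self, shard_list: List[int]) -> None:
        self.shard_list = shard_list
        self.difference_dict: Table = copy.deepcopy(_build_difference_table())


class SharedSpec:
    """After: the table is built lazily once and shared by all instances."""

    _difference_dict: ClassVar[Optional[Table]] = None

    def __init__(self, shard_list: List[int]) -> None:
        self.shard_list = shard_list

    @property
    def difference_dict(self) -> Table:
        # Build on first access, then reuse the same object everywhere.
        if SharedSpec._difference_dict is None:
            SharedSpec._difference_dict = _build_difference_table()
        return SharedSpec._difference_dict
```

Sharing is only safe because the table is never mutated after it is built. If that invariant needs to be enforced rather than assumed, one option is to expose the shared dict through types.MappingProxyType, which gives a read-only view.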

Self-service

  • I'd be willing to do some initial work on this proposal myself.

Footnotes

  1. many of which are just empty placeholders btw

  2. when running examples/tutorial/auto_parallel/auto_parallel_with_resnet.py
