Self-service
I'd be willing to do some initial work on this proposal myself.
Proposal
Generating an Inter-Op plan with ColossalAuto usually takes 1-2 minutes when running `examples/tutorial/auto_parallel/auto_parallel_with_resnet.py`. Profiling with cProfile reveals that a large portion of this time is spent in `copy.deepcopy`, especially when it is called from the method `DimSpec.build_difference_2d_dict()`. Since many `DimSpec` objects are created [1], that method is also called hundreds of thousands of times. On closer examination of its logic, it becomes apparent that the result of this method is in fact independent of the `DimSpec` object, and that its content is not mutated throughout its lifetime. Hence, it suffices to create this dict only once and share it among all instances of `DimSpec`. Due to the large number of `DimSpec` instances created throughout plan generation, this change can yield a speed-up of up to 50% [2].
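The idea of building the dict once and sharing it can be sketched roughly as follows. This is a minimal illustration and not the actual ColossalAI code: `DimSpecSketch`, `_build_difference_table`, the spec names, and the cost values are all hypothetical stand-ins, chosen only to show a lazily built class-level table that every instance reuses.

```python
import copy


def _build_difference_table():
    """Hypothetical stand-in for an expensive, instance-independent build."""
    specs = ["R", "S0", "S1", "S01"]  # assumed spec names, for illustration only
    table = {}
    for src in specs:
        for dst in specs:
            # deepcopy mimics the kind of copying cost seen in the profile
            table[(src, dst)] = copy.deepcopy({"cost": 0 if src == dst else 1})
    return table


class DimSpecSketch:
    # Class-level cache: built lazily on first use, then shared by all instances.
    _difference_dict = None
    build_calls = 0  # counts how often the table is actually built

    def __init__(self, name):
        self.name = name

    @classmethod
    def difference_dict(cls):
        if cls._difference_dict is None:
            cls.build_calls += 1
            cls._difference_dict = _build_difference_table()
        return cls._difference_dict

    def difference(self, other):
        # Read-only lookup; sharing is safe only because nothing mutates the table.
        return self.difference_dict()[(self.name, other.name)]["cost"]


# Many instances, one table build.
instances = [DimSpecSketch("R") for _ in range(1000)]
target = DimSpecSketch("S0")
total = sum(spec.difference(target) for spec in instances)
print(DimSpecSketch.build_calls, total)
```

Sharing a single table is only safe under the condition stated above, namely that no caller mutates the dict after it has been built. If that cannot be guaranteed, a read-only wrapper such as `types.MappingProxyType` would be one way to enforce it.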
Footnotes
1. many of which are just empty placeholders btw
2. when running `examples/tutorial/auto_parallel/auto_parallel_with_resnet.py`