Difference between Task and Job (and task clustering)
Although we sometimes use "task" and "job" interchangeably, they are strictly different. A task is a program that the user specifies to run. It corresponds to a 'job' in the DAX, since the DAX is created by the user. A job in WorkflowSim is a single execution unit that contains one or more tasks. In the code, however, a job extends task to simplify the implementation.
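The relationship described above can be sketched in Java as follows. This is a minimal illustration, not the actual WorkflowSim source: the class names `SketchTask`, `SketchJob`, and `TaskJobDemo` are hypothetical, and the sketch assumes that the tasks inside one job run one after another, so a job's runtime is the sum of its tasks' runtimes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for a task: a user-specified program.
class SketchTask {
    private final int id;
    private final double runtime; // in seconds

    SketchTask(int id, double runtime) {
        this.id = id;
        this.runtime = runtime;
    }

    int getId() {
        return id;
    }

    double getRuntime() {
        return runtime;
    }
}

// A job is an execution unit that holds one or more tasks.
// It extends the task type so that code handling tasks can also handle jobs.
class SketchJob extends SketchTask {
    private final List<SketchTask> tasks = new ArrayList<>();

    SketchJob(int id) {
        super(id, 0.0);
    }

    void addTask(SketchTask task) {
        tasks.add(task);
    }

    List<SketchTask> getTasks() {
        return Collections.unmodifiableList(tasks);
    }

    // Assumption: tasks in a job execute sequentially.
    @Override
    double getRuntime() {
        double total = 0.0;
        for (SketchTask task : tasks) {
            total += task.getRuntime();
        }
        return total;
    }
}

class TaskJobDemo {
    public static void main(String[] args) {
        SketchJob job = new SketchJob(0);
        job.addTask(new SketchTask(1, 10.0));
        job.addTask(new SketchTask(2, 5.0));
        System.out.println("tasks in job: " + job.getTasks().size());
        System.out.println("job runtime:  " + job.getRuntime());
    }
}
```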
Task clustering is an optimization method that merges multiple tasks into a job. Depending on the optimization goal, we may end up with fault-tolerant clustering (minimizing the impact of failures), balanced task clustering (balancing the data transfer cost and communication cost), and so on.
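To make the idea of merging concrete, here is a small sketch that distributes the tasks of one workflow level over a fixed number of jobs in round-robin order. This is a deliberately naive grouping chosen for illustration, not one of the clustering algorithms named above, and the class `RoundRobinClustering` and its method `cluster` are hypothetical names. Tasks are identified by their index within the level.

```java
import java.util.ArrayList;
import java.util.List;

class RoundRobinClustering {

    // Assigns task indices 0..taskCount-1 to at most jobCount jobs.
    // Each inner list holds the task indices merged into one job.
    static List<List<Integer>> cluster(int taskCount, int jobCount) {
        if (taskCount < 0 || jobCount <= 0) {
            throw new IllegalArgumentException("taskCount >= 0 and jobCount > 0 required");
        }
        // Never create more jobs than there are tasks.
        int actualJobs = Math.min(taskCount, jobCount);
        List<List<Integer>> jobs = new ArrayList<>();
        for (int j = 0; j < actualJobs; j++) {
            jobs.add(new ArrayList<>());
        }
        for (int t = 0; t < taskCount; t++) {
            jobs.get(t % actualJobs).add(t);
        }
        return jobs;
    }

    public static void main(String[] args) {
        // Merge 10 tasks into 3 jobs.
        List<List<Integer>> jobs = cluster(10, 3);
        for (int j = 0; j < jobs.size(); j++) {
            System.out.println("job " + j + " -> tasks " + jobs.get(j));
        }
    }
}
```

A real clustering method would choose the grouping according to its goal (for example task runtimes, data dependencies, or failure rates) rather than by task index.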
Why can task clustering reduce the makespan even though it loses some parallelism? Task clustering helps only under resource contention, which means we do not have enough resources to run all tasks at once and have to merge tasks into jobs. For example, the Montage workflow can have up to 10,000 tasks in each level, while a small data cluster usually has 20 nodes. By merging these tasks into 20 or 40 jobs, we still fully utilize the resources and improve the overall runtime as well. Another important issue is overhead: job submission, job preparation, and job execution all incur overheads that are significant in modern distributed systems. 10,000 tasks executed individually pay these overheads 10,000 times, while 20 jobs pay them only 20 times.
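The overhead argument can be checked with a back-of-the-envelope model. The sketch below assumes identical tasks, a fixed overhead paid once per job, jobs executed in "waves" of at most one job per node, and tasks inside a job running sequentially. The task runtime (10 s) and per-job overhead (5 s) are made-up numbers for illustration; only the 10,000 tasks and 20 nodes come from the text. The class name `ClusteringOverheadModel` is hypothetical.

```java
class ClusteringOverheadModel {

    // Integer ceiling division for positive operands.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    // Estimated makespan (seconds) of one workflow level.
    // tasks:       number of tasks in the level
    // jobs:        number of jobs the tasks are merged into
    // nodes:       number of nodes, each running one job at a time
    // taskRuntime: runtime of a single task in seconds
    // jobOverhead: overhead paid once per job in seconds
    static long makespan(long tasks, long jobs, long nodes,
                         long taskRuntime, long jobOverhead) {
        long tasksPerJob = ceilDiv(tasks, jobs);          // largest job
        long jobRuntime = jobOverhead + tasksPerJob * taskRuntime;
        long waves = ceilDiv(jobs, nodes);                // rounds of execution
        return waves * jobRuntime;
    }

    public static void main(String[] args) {
        long tasks = 10_000, nodes = 20, runtime = 10, overhead = 5;
        // No clustering: every task is its own job.
        System.out.println("10000 jobs: " + makespan(tasks, 10_000, nodes, runtime, overhead) + " s");
        System.out.println("   40 jobs: " + makespan(tasks, 40, nodes, runtime, overhead) + " s");
        System.out.println("   20 jobs: " + makespan(tasks, 20, nodes, runtime, overhead) + " s");
    }
}
```

Under these assumptions the unclustered level takes 7500 s, while 40 jobs take 5010 s and 20 jobs take 5005 s. The pure computation time (5000 s) is the same in all three cases, because 20 nodes are kept busy either way; the difference comes entirely from how often the per-job overhead is paid.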
For more details about the benefit of task clustering, please refer to any task clustering paper.