feat(execution): Implement parallelism in Job #81

Merged: irvinlim merged 6 commits into main from irvinlim/feat/parallelism on May 29, 2022

irvinlim (Member) commented May 29, 2022

Closes #71.

This PR implements the features needed to support parallel tasks in a single Job. It introduces the following changes:

  1. Added the ParallelismSpec API, following the proposal in #71 (Proposal: Support task-level parallelism).
  2. Revamped the JobStatus API fields and reduced the possible sets of phases, results, states, etc. to remove duplication.
  3. Changed the job and task naming convention to delimit name components with hyphens instead of periods (e.g. jobconfig-parallel-1653824280 and jobconfig-parallel-sleep-1653822660-ge3tgm-0).
  4. Added mutation and validation handlers for ParallelismSpec.
  5. Changed the reconciler to compute which tasks have not yet been created and to create them.
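
As a rough illustration of items 3 and 5, the sketch below joins name components with hyphens and works out which parallel indexes still have no task. It is not the code from this PR: `taskName`, `uncreatedIndexes`, the use of a plain decimal index as the middle name component, and the trailing retry number are all assumptions made for the example (the real task names above use an opaque token such as `ge3tgm`).

```go
package main

import (
	"fmt"
	"strings"
)

// taskName joins name components with hyphens instead of periods.
// The layout (job name, index token, retry number) is an assumption
// for illustration, not the project's actual naming function.
func taskName(jobName, indexToken string, retry int) string {
	return strings.Join([]string{jobName, indexToken, fmt.Sprint(retry)}, "-")
}

// uncreatedIndexes returns every parallel index in [0, count) for which
// no first-attempt task exists yet. Existing tasks are looked up by name
// in a set, so each check is a map lookup rather than a scan.
func uncreatedIndexes(jobName string, count int, existing map[string]bool) []int {
	var missing []int
	for i := 0; i < count; i++ {
		if !existing[taskName(jobName, fmt.Sprint(i), 0)] {
			missing = append(missing, i)
		}
	}
	return missing
}

func main() {
	job := "jobconfig-parallel-1653824280"
	existing := map[string]bool{
		taskName(job, "0", 0): true,
		taskName(job, "2", 0): true,
	}
	// Indexes 1 and 3 have no task yet, so the reconciler would create them.
	fmt.Println(uncreatedIndexes(job, 4, existing)) // prints [1 3]
}
```

Deriving task names deterministically from the job name and index is what makes this kind of lookup possible: the set of expected names can be computed up front and compared against what already exists.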

Remaining TODO items:

  • When using AllSuccessful, immediately terminate all remaining tasks as soon as a single task fails (that is, once it has exceeded all of its retries)
  • Create reconciler integration tests
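
The first TODO item could be approached along the following lines. This is only a sketch of the idea, not the planned implementation: the `taskState` type, its fields, `maxAttempts`, and both helper functions are hypothetical names introduced for the example.

```go
package main

import "fmt"

// taskState is a hypothetical, simplified view of one parallel task.
type taskState struct {
	Index    int
	Attempts int  // number of attempts made so far
	Failed   bool // whether the latest attempt failed
	Finished bool // whether the task will not run any further
}

// exhausted reports whether a task has failed and has no retries left.
func exhausted(t taskState, maxAttempts int) bool {
	return t.Failed && t.Attempts >= maxAttempts
}

// tasksToTerminate returns the indexes of all unfinished tasks once any
// single task is exhausted. Under an all-must-succeed strategy the job
// can no longer succeed at that point, so the rest can be stopped early.
// It returns nil while no task is exhausted.
func tasksToTerminate(tasks []taskState, maxAttempts int) []int {
	anyExhausted := false
	for _, t := range tasks {
		if exhausted(t, maxAttempts) {
			anyExhausted = true
			break
		}
	}
	if !anyExhausted {
		return nil
	}
	var out []int
	for _, t := range tasks {
		if !t.Finished {
			out = append(out, t.Index)
		}
	}
	return out
}

func main() {
	tasks := []taskState{
		{Index: 0, Attempts: 3, Failed: true, Finished: true}, // out of retries
		{Index: 1, Attempts: 1},                               // still running
		{Index: 2, Attempts: 1, Finished: true},               // succeeded
	}
	fmt.Println(tasksToTerminate(tasks, 3)) // prints [1]
}
```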

@irvinlim added labels May 29, 2022:
  • component/execution: Issues or PRs related exclusively to the Execution component (Job, JobConfig)
  • kind/feature: Categorizes issue or PR as related to a new, well-defined and agreed-upon feature.
  • area/workloads: Related to workload execution (e.g. jobs, tasks)
codecov bot commented May 29, 2022

Codecov Report

Merging #81 (1343c81) into main (fe5f311) will increase coverage by 0.18%.
The diff coverage is 75.96%.

@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
+ Coverage   62.60%   62.79%   +0.18%     
==========================================
  Files         197      199       +2     
  Lines        9858    10340     +482     
==========================================
+ Hits         6172     6493     +321     
- Misses       3373     3499     +126     
- Partials      313      348      +35     
Impacted Files                                         Coverage Δ
...g/execution/taskexecutor/podtaskexecutor/labels.go  100.00% <ø> (ø)
pkg/execution/util/job/job_utils.go                    60.00% <ø> (-24.38%) ⬇️
pkg/execution/mutation/mutation.go                     83.74% <27.27%> (-4.34%) ⬇️
pkg/utils/time/time.go                                 48.00% <46.66%> (-2.00%) ⬇️
...ecution/taskexecutor/podtaskexecutor/pod_lister.go  80.64% <50.00%> (-8.25%) ⬇️
pkg/execution/util/job/task.go                         50.00% <50.00%> (ø)
pkg/execution/variablecontext/provider.go              58.69% <53.84%> (+5.75%) ⬆️
...ecution/taskexecutor/podtaskexecutor/pod_client.go  78.57% <57.14%> (+1.64%) ⬆️
.../execution/controllers/jobcontroller/reconciler.go  70.92% <57.66%> (-4.08%) ⬇️
apis/execution/v1alpha1/zz_generated.deepcopy.go       60.40% <67.03%> (-0.62%) ⬇️
... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fe5f311...1343c81.

irvinlim added 4 commits May 29, 2022 20:34
- Perform adoption of tasks by name if controllerRef matches in the reconciler
- Avoid using expensive List() which performs linear search
- Add reconciler tests for parallel jobs
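
The first two commit messages above describe adopting tasks by name when the controller reference matches, rather than listing and scanning everything. A minimal sketch of that idea follows, using plain structs instead of Kubernetes objects so it runs on its own. The `task` type, the `ControllerUID` field, the map-backed store, and `adoptByName` are all assumptions for illustration and are not taken from the PR's code.

```go
package main

import "fmt"

// task is a hypothetical stand-in for a task object such as a Pod.
type task struct {
	Name          string
	ControllerUID string // UID of the controlling owner; empty if none
}

// adoptByName fetches each expected task directly by name and keeps only
// those whose controller matches the given job UID. Looking tasks up by
// name avoids listing and linearly scanning every task in the store.
func adoptByName(store map[string]task, expected []string, jobUID string) []task {
	var adopted []task
	for _, name := range expected {
		t, ok := store[name]
		if !ok || t.ControllerUID != jobUID {
			continue // missing, or controlled by something else
		}
		adopted = append(adopted, t)
	}
	return adopted
}

func main() {
	store := map[string]task{
		"job-a-0-0": {Name: "job-a-0-0", ControllerUID: "uid-a"},
		"job-a-1-0": {Name: "job-a-1-0", ControllerUID: "uid-other"},
	}
	expected := []string{"job-a-0-0", "job-a-1-0", "job-a-2-0"}
	for _, t := range adoptByName(store, expected, "uid-a") {
		fmt.Println(t.Name) // prints job-a-0-0
	}
}
```

Checking the controller before adopting matters because names alone are not proof of ownership: another object could already hold a task with the same name.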
@irvinlim force-pushed the irvinlim/feat/parallelism branch from 5d4fe1e to 1343c81 May 29, 2022 15:26
@irvinlim merged commit 8320441 into main May 29, 2022
@irvinlim deleted the irvinlim/feat/parallelism branch May 29, 2022 15:39