Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Considering best-effort pods when calculating ready task number #647

Merged
merged 1 commit into from
Dec 31, 2019

Conversation

sivanzcw
Copy link
Contributor

@volcano-sh-bot volcano-sh-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Dec 26, 2019
@TravisBuddy
Copy link

Hey @sivanzcw,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 361624e0-2796-11ea-8215-356b6c757fb9

@@ -297,6 +297,15 @@ func (ssn *Session) Allocate(task *api.TaskInfo, hostname string) error {
return err
}
}
} else {
// For best-effort pod, it it not necessary to consider the gang constraint.
if task.InitResreq.IsEmpty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's happen if there not enough resources for the other Pods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are six pods in a job, and the minA is 6, two of the the six pods are best-effort pods, and the others are non-best-effort pods. The best-effort pods will be dispatched succeed and refreshed to running while the non-best-effort pods will be remain pending if there are not enough resources to satisfy the resoure requests of the job in the cluster.

@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 26, 2019
@sivanzcw sivanzcw changed the title dispatch best-effort task immediately without considering the gang constraint add jobCandidateReady branch in allocate action Dec 26, 2019
@TravisBuddy
Copy link

Hey @sivanzcw,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 5e544e60-27c7-11ea-8215-356b6c757fb9

@TravisBuddy
Copy link

Hey @sivanzcw,
Something went wrong with the build.

TravisCI finished with status errored, which means the build failed because of something unrelated to the tests, such as a problem with a dependency or the build process itself.

View build log

TravisBuddy Request Identifier: 986c21e0-27c7-11ea-8215-356b6c757fb9

@TravisBuddy
Copy link

Hey @sivanzcw,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: de043ca0-27cd-11ea-8215-356b6c757fb9

@@ -231,6 +231,10 @@ func (alloc *allocateAction) Execute(ssn *framework.Session) {

if ssn.JobReady(job) {
stmt.Commit()
} else if ssn.JobCandidateReady(job) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's not necessary to add a new callback; that should ok to check best-effort pods in JobReadyFn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the judgement of the best-effort pod is merged into the JobReady fuction, when JobReady is satisfied, the dispatch of non-best-effort pod will be started, which will cause the non-best-effort pod of the job to be bound to the node first. But at this time, the gang constraint of the job is not satisfied. Maybe the remaining best-effort pods under the job can not be successfully scheduled during the backfill action.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should dispatch all pods of job in allocate action, including best-effort pod :)

@volcano-sh-bot volcano-sh-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 29, 2019
@sivanzcw sivanzcw changed the title add jobCandidateReady branch in allocate action Considering best-effort pods when calculating ready task number Dec 29, 2019
@TravisBuddy
Copy link

Hey @sivanzcw,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 7cdb45c0-2a40-11ea-8495-6f9e5f4a4dcf

for _, task := range ji.Tasks {
if task.InitResreq.IsEmpty() {
occupied++
} else if AllocatedStatus(task.Status) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if so, we're going to go through all pending tasks for this. Use TaskStatusIndex to check allocated tasks, and find best-effort pods in pending status.

@TravisBuddy
Copy link

Hey @sivanzcw,
Something went wrong with the build.

TravisCI finished with status errored, which means the build failed because of something unrelated to the tests, such as a problem with a dependency or the build process itself.

View build log

TravisBuddy Request Identifier: 8daca410-2aaa-11ea-8495-6f9e5f4a4dcf

@volcano-sh-bot volcano-sh-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 30, 2019
@TravisBuddy
Copy link

Hey @sivanzcw,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 2ef65d10-2abb-11ea-b3af-3392aa6c8569

@k82cn
Copy link
Member

k82cn commented Dec 31, 2019

/lgtm
/approve

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Dec 31, 2019
@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: k82cn, sivanzcw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants