scheduler: fix a bug where we did not account for poststart tasks resources #24297

Open
wants to merge 1 commit into base: main

mvegter (Contributor) commented Oct 25, 2024

Fixes a bug in the AllocatedResources.Comparable method that caused it to
report fewer required resources than the allocation actually needs. This could
result in overscheduling of allocations on a single node and in overlapping
cgroup cpusets.
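
To illustrate why under-reporting leads to overscheduling, here is a minimal sketch of a capacity check that trusts the reported per-allocation CPU. The node size, the function names, and the placement loop are all hypothetical and are not taken from Nomad's scheduler; only the 1000 versus 2000 MHz figures mirror the metrics shown in this PR.

```python
# Hypothetical capacity check: not Nomad code, only an illustration.

def fits(node_cpu, placed, reported_cpu):
    """Return True if one more allocation fits, judged by reported CPU."""
    return sum(placed) + reported_cpu <= node_cpu


def place_all(node_cpu, reported_cpu, candidates):
    """Greedily place up to `candidates` identical allocations on one node."""
    placed = []
    for _ in range(candidates):
        if fits(node_cpu, placed, reported_cpu):
            placed.append(reported_cpu)
    return len(placed)


NODE_CPU = 4000      # assumed node capacity in MHz
ACTUAL_CPU = 2000    # what each allocation really needs at its peak
REPORTED_CPU = 1000  # what the buggy accounting reports

placed_with_bug = place_all(NODE_CPU, REPORTED_CPU, candidates=10)
placed_with_fix = place_all(NODE_CPU, ACTUAL_CPU, candidates=10)

# Real demand once every allocation placed under the bug is running.
real_demand_with_bug = placed_with_bug * ACTUAL_CPU
```

Under these assumptions the buggy figure admits twice as many allocations as the node can actually serve, so the real demand exceeds the node's capacity.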

Split to #24304


job "redis2nd" {
  type = "service"
  group "cache" {
    count = 1

    task "redis-prestart" {
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
      driver = "docker"
      config {
        image = "hello-world:latest"
      }
      resources {
        cpu = 1000
      }
    }

    task "redis" {
      driver = "docker"
      config {
        image = "redis:3.2"
      }
      resources {
        cpu = 1000
      }
    }

    task "redis-start-side" {
      lifecycle {
        hook    = "poststart"
        sidecar = true
      }
      driver = "docker"
      config {
        image = "redis:3.2"
      }
      resources {
        cpu = 1000
      }
    }

    task "redis-poststop" {
      lifecycle {
        hook    = "poststop"
        sidecar = false
      }
      driver = "docker"
      config {
        image = "hello-world:latest"
      }
      resources {
        cpu = 1000
      }
    }
  }
}


Before

[sandbox@nomad-dev nomad]$ curl -s http://localhost:4646/v1/metrics | jq '.Gauges[] | select(.Name | contains("allocated.cpu")) | .Name, .Value'
"nomad.client.allocated.cpu"
1000.0
"nomad.client.unallocated.cpu"
277380.0

After

[sandbox@nomad-dev nomad]$ curl -s http://localhost:4646/v1/metrics | jq '.Gauges[] | select(.Name | contains("allocated.cpu")) | .Name, .Value'
"nomad.client.allocated.cpu"
2000.0
"nomad.client.unallocated.cpu"
276380.0
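
The Before and After values are consistent with a lifecycle-aware peak calculation over the job above. The sketch below is an assumption about the shape of that calculation, not the code changed in this PR: it adds prestart sidecars to the largest of the phases that do not overlap (prestart ephemeral tasks, main tasks, poststop tasks), and the `count_poststart` flag is a hypothetical switch that models whether poststart tasks are counted alongside the main tasks. The `Task` class and `comparable_cpu` function are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    cpu: int
    hook: Optional[str] = None  # None means a main task
    sidecar: bool = False


def comparable_cpu(tasks, count_poststart):
    """Peak CPU of one allocation across its lifecycle phases (sketch)."""
    prestart_sidecar = prestart_ephemeral = main = poststop = 0
    for t in tasks:
        if t.hook is None:
            main += t.cpu
        elif t.hook == "prestart":
            if t.sidecar:
                prestart_sidecar += t.cpu
            else:
                prestart_ephemeral += t.cpu
        elif t.hook == "poststart":
            # Assumption: poststart tasks run while the main tasks run,
            # so when counted they add to the main phase.
            if count_poststart:
                main += t.cpu
        elif t.hook == "poststop":
            poststop += t.cpu
    # Prestart sidecars stay up for the whole allocation; the other
    # phases do not overlap, so only the largest one matters.
    return prestart_sidecar + max(prestart_ephemeral, main, poststop)


# The four tasks from the "redis2nd" job in this PR.
JOB_TASKS = [
    Task("redis-prestart", 1000, hook="prestart", sidecar=False),
    Task("redis", 1000),
    Task("redis-start-side", 1000, hook="poststart", sidecar=True),
    Task("redis-poststop", 1000, hook="poststop", sidecar=False),
]

before = comparable_cpu(JOB_TASKS, count_poststart=False)
after = comparable_cpu(JOB_TASKS, count_poststart=True)
```

With the poststart task ignored, every phase peaks at 1000, which matches the Before gauge. Counting it alongside the main task raises the peak to 2000, which matches the After gauge and the 1000 drop in `nomad.client.unallocated.cpu`.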

mvegter force-pushed the mvegter-fix-overlapping-cpuset-due-to-posstart-tasks branch from 77b60ba to 7d34250 on October 25, 2024 18:09
mvegter force-pushed the mvegter-fix-overlapping-cpuset-due-to-posstart-tasks branch from 7d34250 to fa4a1ee on October 27, 2024 14:18
mvegter changed the title from "scheduler: take into account posstart task to prevent overlapping cpusets" to "scheduler: fix a bug where we did not account for poststart tasks resources" on Oct 27, 2024