Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The default value of cfs_quota_us should be -1, which is negative #499

Closed
wants to merge 1 commit into from

Conversation

LaynePeng
Copy link

The default value of cfs_quota_us should be -1, means unlimited. So, I think we need to make this field int. Because RunC has added update feature, which should let user set it back to unlimited.

@wking
Copy link
Contributor

wking commented Jun 16, 2016

On Thu, Jun 16, 2016 at 11:07:25AM -0700, Layne Peng wrote:

The default value of cfs_quota_us should be -1, means unlimited. So,
I think we need to make this field int. Because RunC has added
update feature, which should let user set it back to unlimited.

This issue is broader than that one field. For example 2ce2c86
(runtime: config: linux: add cgroups information, 2015-09-18, #199)
converted a number of signed fields to unsigned fields, including
swappiness and others where -1 means unlimited. 488f174 (Fix cgroups
value types in the spec, 2015-10-27, #233) made a lot of those fields
pointers.

I agree that if you allow join-and-tweak it's nice to make a
distinction between “leave it alone” and “explicitly set this to
unlimited”, and the #233 discussion did not address that. If we do
want to make adjustments to support “join-and-tweak” in the spec, I'd
rather go all the way and support it (#17, #305). Making this sort of
adjustment on a per-setting basis seems like it will leave us with an
inconsistent approach for a longtime ;).

Another approach to this distinction (which is clear in JSON, but
maybe not in the Go bindings) would be to treat explicit nulls as
“remove this limit” while treating missing entries as “don't touch
this limit” (see 1 for an example where memory.limit_in_bytes is
explicitly set to null and another where it is unset).

@tianon
Copy link
Member

tianon commented Sep 8, 2016

For reference, https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt:

A value of -1 for cpu.cfs_quota_us indicates that the group does not have any
bandwidth restriction in place, such a group is described as an unconstrained
bandwidth group. This represents the traditional work-conserving behavior for
CFS.

I'm +1 on this change (given that the kernel clearly considers negative values not only permissible but meaningful), but to be considered the commit will need a Signed-off-by line amended in.

@wking
Copy link
Contributor

wking commented Sep 8, 2016

On Thu, Sep 08, 2016 at 02:01:04PM -0700, Tianon Gravi wrote:

For reference, https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt:

A value of -1 for cpu.cfs_quota_us indicates that the group does not have any
bandwidth restriction in place, such a group is described as an unconstrained
bandwidth group. This represents the traditional work-conserving behavior for
CFS.

I'm +1 on this change (given that the kernel clearly considers
negative values not only permissible but meaningful)…

Can you float a situation where you'd use -1 outside of join-and-tweak
1? When you set up a new cgroup, the bandwidth to be unconstrained
by default. Quoting from sched-bwc.txt:

The default values are:
cpu.cfs_period_us=100ms
cpu.cfs_quota=-1

I'm ok with allowing join-and-tweak, but if we do I think we should
make that shift more holistically instead of one property at a time
;).

@crosbymichael
Copy link
Member

@LaynePeng the change makes sense to me. Thanks. Can you rebase this PR and make sure the travis errors are fixed?

Thanks!

@hqhq
Copy link
Contributor

hqhq commented Oct 29, 2016

I agree with @wking this is a more than one field issue, if we consider join-and-tweak , all fields which accept negative values by kernel api should use int64, and this is also useful when updating a container, though we don't support update in runtime-spec yet.

@crosbymichael Do you think we should fix all these fields to suit kernel api? If so, I can carry this PR and fix them all in once.

@wking
Copy link
Contributor

wking commented Oct 29, 2016

On Fri, Oct 28, 2016 at 11:48:01PM -0700, Qiang Huang wrote:

… all fields which accept negative values by kernel api should use
int64

The filesystem cgroup API isn't based on JSON, so I don't think we
need to use -1 to mean “explicitly clear any current limit” just
because the kernel does. For example, we could use explicit nulls or
an “unlimited” string. More on this and some discussion of
alternatives, see #233 and 1. The discussion around a ‘namespaces’
object also touches on potential distinctions between “set to empty”
and “unset” 2.

So using -1 where the kernel uses -1, and using null or unset where
the kernel uses ‘a’ or ‘*’ 3, etc., etc. is possible. But we may
want to consider an alternative way to represent “unlimited” that is
not property-specific.

 Subject: Re: rlimits
 Date: Thu, 20 Oct 2016 14:06:27 -0700
 Message-ID: <20161020210627.GT4005@odin.tremily.us>
 And subsequent comments in that thread

@hqhq
Copy link
Contributor

hqhq commented Oct 31, 2016

@wking Yes we don't have to use -1 in json for cgroup API, and upper level tools can still use -1 and it can also work, as discussion in moby/moby#22578 , it's because we depend on the fact that:

uint64(-1) = 18446744073709551615
int64(18446744073709551615) = -1

I don't think this is good enough.

And yes we can use explicit nulls or an "unlimited" string, I just don't see the necessity that we stick value types to unsigned and make such diversity from kernel API. Is that because it's more close to instinct? I also doubt that, because from the abundant discussion about naming issues in Docker, people always finally agreed to use the name following current kernel API way instead of creating a new-more-instinctive-name.

@wking
Copy link
Contributor

wking commented Oct 31, 2016

On Sun, Oct 30, 2016 at 11:16:54PM -0700, Qiang Huang wrote:

… people always finally agreed to use the name following current
kernel API way instead of creating a new-more-instinctive-name.

Following the kernel is certainly the path of least resistance. There
are some typing difficulties with that approach though, for example
where the kernel has a character interface for the device controller's
major/minor but we use an int64 1. If we wanted to follow the
kernel's ‘
’ for “unlimited”, we'd need to convert that to a string
field. If we stick with our current “explicit null or unset” we can
keep our current integer pointer. If we switch to using 0 or -1, we
can drop the pointer. We can make that sort of call on a case-by-case
basis, but I'd rather have a policy of some sort to guide those
decisions.

@crosbymichael
Copy link
Member

I would like for us to keep a consistent model of nil/null meaning do nothing. This can stay the same with the initial create and update actions.

Is -1 valid to write to the file to unset it?

@wking
Copy link
Contributor

wking commented Nov 4, 2016

On Fri, Nov 4, 2016 at 9:46AM -0700, Michael Crosby wrote:

I would like for us to keep a consistent model of nil/null meaning
do nothing.

And also presumably “unset means do nothing”. That uses two values
for “do nothing”, and rules out using null as an explicit “unlimited”.
But it makes Go handling easier, because Go's unmarshal-into-a-type
JSON handler doesn't distinguish between unset and “set to null”.

Is -1 valid to write to the file to unset it?

Yes, for cfs_quota_us 1. But the devices cgroup uses ‘a’ for
“unlimited type” and ‘*’ for “unlimited major/minor” 2. However,
handling a mixed type field (e.g. “set to an integer major/minor or
the string ‘unlimited’ for unset”) is probably a no-go with Go's
unmarshal-into-a-type JSON handler.

So as far as I can see, that leaves us with “just use the kernel's
default unlimited value”.

The only tricky cases will be deciding whether existing values like
linux.resources.devices[].type 3 actually make a distinction between
“do nothing” and “set to the unlimited value”, because you've already
made the decision to do something by adding the device entry. If we
decide that we can let null/unset be “set to the unlimited value”, we
can leave the current spec alone. If we decide that it is doing
something and we can't accept null/unset mapping to the unlimited
value, we need to update the spec to require the ‘type’ field and
error out if it is not set.

@tianon
Copy link
Member

tianon commented Nov 4, 2016

LGTM

@LaynePeng can you please amend your commit with a valid DCO (Signed-off-by) and force push to update this PR?

Approved with PullApprove

hqhq added a commit to hqhq/runtime-spec that referenced this pull request Jan 5, 2017
Carry opencontainers#499

For these values, cgroup kernal APIs accept -1 to set
them as unlimited, as docker and runc all support
update resources, we should not set drawbacks in spec.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
@hqhq
Copy link
Contributor

hqhq commented Jan 5, 2017

I'm carrying it in #648 , thanks for taking this to us @LaynePeng .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants