Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove range limit which depend on kernel #780

Merged
merged 1 commit into from
Apr 26, 2017

Conversation

hqhq
Copy link
Contributor

@hqhq hqhq commented Apr 26, 2017

Signed-off-by: Qiang Huang h.huangqiang@huawei.com

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
@dqminh
Copy link
Contributor

dqminh commented Apr 26, 2017

LGTM

Approved with PullApprove

1 similar comment
@mrunalp
Copy link
Contributor

mrunalp commented Apr 26, 2017

LGTM

Approved with PullApprove

@@ -352,14 +352,14 @@ For more information, see [the kernel cgroups documentation about blkio][cgroup-

The following parameters can be specified to setup the controller:

* **`blkioWeight`** *(uint16, OPTIONAL)* - specifies per-cgroup weight. This is default weight of the group on all devices until and unless overridden by per-device rules. The range is from 10 to 1000.
* **`blkioWeight`** *(uint16, OPTIONAL)* - specifies per-cgroup weight. This is default weight of the group on all devices until and unless overridden by per-device rules.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Punting to the kernel is fine when it's clear which kernel parameter you're punting to. In this case, the spec-reader has to figure out that blkioWeight is (MUST be?) implemented with blkio.weight in the blkio controller, which has:

Currently allowed range of weights is from 10 to 1000

The current docs make it clear that blockIO represents the blkio controller, but mapping between the individual properties and the controller API is left up to the reader's imagination (with no RFC 2119 language for compliance testing, #746). That leaves the semantics of these values pretty wiggly, and outside of validation and compliance testing.

Also the kernel's “Currently allowed” wording is troublesome. If we punt to the kernel for semantics, it's not clear to me how we handle evolving kernel semantics unless we pin the property to a particular kernel version or trust the kernel to maintain backwards compatibility for as long as this version of the spec is useful.

@hqhq hqhq deleted the remove_blkio_range branch April 27, 2017 01:12
@zhouhao3 zhouhao3 mentioned this pull request Apr 27, 2017
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 15, 2017
This is a bit awkward, since:

* It's not a direct wrapper around mknod(2) (which, for example, does
  not use the c/b/u/p characters).
* The runtime doesn't have to use mknod, so binding it to mknod(1)-ish
  invocations doesn't make much sense.

Instead, I've bound it to POSIX's stat(3) to show what compliance
testing (and anything else inside the container) can expect the
results (however the runtime accomplishes them) to look like.

The previous wording wasn't clear on whether symlinks were an allowed
approach.  The new wording explicitly allows them by using
stat(1)-like symlink resolution.

I've also clarified relative 'path' handling and explicitly declared
the appropriate mount namespace (impacts 'path') and PID namespace
(impacts 'uid' and 'gid').

Because we're focused on post-create stat calls, I've also added new
wording about handling duplicate 'path' values.

I've used POSIX reference where possible (vs. Linux man pages),
because they contain sufficient detail for this section, have
well-versioned URLs, and are more likely to be portable if this
section ever applies to non-Linux configs (BSD?  Solaris?).

Related to recent discussion around punting to the kernel [1,2],
although in this case we're not changing the JSON Schema because the
existing local validation (valid 'type' characters and the 'fileMode'
range) both feed into a single mode_t integer in the stat(3) and
mknod(2) APIs.  For a cleaner kernel punt, we could drop 'type', lift
the range limit on 'fileMode', and map it directly to st.st_mode. But
that seemed like a big backwards-compat shift for this commit.

[1]: opencontainers#780
[2]: opencontainers#690 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 19, 2017
Since ce55de2 (Remove range limit which depend on kernel, 2017-04-26,
opencontainers#780), the spec has been more aggressively punting to the kernel APIs
(vs. carrying local versions of kernel limitations).  For the
properties touched by this commit, a pull request to reflect our old
valid values (e.g. requiring 'type' to match ^[acb]$) was rejected as
part of this punting approach.  However, before this commit, it wasn't
clear exactly what kernel interface was being punted to.

With this commit, we replace the old inline docs with an explicit punt
to the device whitelist controller, listing the exact actions that the
runtime MUST take for given config values.  This allows for
compliance-testing runtimes [2] (ensuring config portability between
compliant runtimes) and makes it possible to validate a given config
against a given kernel (e.g. Linux 4.11.1 only accepts 'a', 'b', and
'c' as type characters [3]).

[1]: opencontainers#690 (comment)
[2]: opencontainers#746
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/security/device_cgroup.c?h=v4.11.1#n618

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Jun 1, 2017
This is a bit awkward, since:

* It's not a direct wrapper around mknod(2) (which, for example, does
  not use the c/b/u/p characters).
* The runtime doesn't have to use mknod, so binding it to mknod(1)-ish
  invocations doesn't make much sense.

Instead, I've bound it to POSIX's stat(3) to show what compliance
testing (and anything else inside the container) can expect the
results (however the runtime accomplishes them) to look like.

The previous wording wasn't clear on whether symlinks were an allowed
approach.  The new wording explicitly allows them by using
stat(1)-like symlink resolution.

I've also clarified relative 'path' handling and explicitly declared
the appropriate mount namespace (impacts 'path') and PID namespace
(impacts 'uid' and 'gid').

Because we're focused on post-create stat calls, I've also added new
wording about handling duplicate 'path' values.

I've used POSIX reference where possible (vs. Linux man pages),
because they contain sufficient detail for this section, have
well-versioned URLs, and are more likely to be portable if this
section ever applies to non-Linux configs (BSD?  Solaris?).

Related to recent discussion around punting to the kernel [1,2],
although in this case we're not changing the JSON Schema because the
existing local validation (valid 'type' characters and the 'fileMode'
range) both feed into a single mode_t integer in the stat(3) and
mknod(2) APIs.  For a cleaner kernel punt, we could drop 'type', lift
the range limit on 'fileMode', and map it directly to st.st_mode. But
that seemed like a big backwards-compat shift for this commit.

[1]: opencontainers#780
[2]: opencontainers#690 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Jun 1, 2017
Since ce55de2 (Remove range limit which depend on kernel, 2017-04-26,
opencontainers#780), the spec has been more aggressively punting to the kernel APIs
(vs. carrying local versions of kernel limitations).  For the
properties touched by this commit, a pull request to reflect our old
valid values (e.g. requiring 'type' to match ^[acb]$) was rejected as
part of this punting approach.  However, before this commit, it wasn't
clear exactly what kernel interface was being punted to.

With this commit, we replace the old inline docs with an explicit punt
to the device whitelist controller, listing the exact actions that the
runtime MUST take for given config values.  This allows for
compliance-testing runtimes [2] (ensuring config portability between
compliant runtimes) and makes it possible to validate a given config
against a given kernel (e.g. Linux 4.11.1 only accepts 'a', 'b', and
'c' as type characters [3]).

[1]: opencontainers#690 (comment)
[2]: opencontainers#746
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/security/device_cgroup.c?h=v4.11.1#n618

Signed-off-by: W. Trevor King <wking@tremily.us>
@vbatts vbatts mentioned this pull request Jul 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants