Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use random ID as runc container ID #384

Closed
wants to merge 2 commits into from

Conversation

hqhq
Copy link
Contributor

@hqhq hqhq commented Nov 4, 2015

Currently we're using the working directory of the runc process as the container ID, that's not good to identify, and you can't start two container in the same directory, or different path with the same directory name.

Use a random ID as the container ID like Docker does, I think that makes more sense.

hqhq added 2 commits November 4, 2015 15:45
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
@mrunalp
Copy link
Contributor

mrunalp commented Nov 4, 2015

LGTM @crosbymichael @LK4D4

@crosbymichael
Copy link
Member

ya, but why won't you name the directories as random ids? you shouldn't run multiple containers in the same dir because they are going to overwrite state.

@LK4D4
Copy link
Contributor

LK4D4 commented Nov 4, 2015

Hmm, how am I supposed to find what ID my container have? :)

@wking
Copy link
Contributor

wking commented Nov 4, 2015

On Wed, Nov 04, 2015 at 11:02:06AM -0800, Alexander Morozov wrote:

Hmm, how am I supposed to find what ID my container have? :)

With a pre-start hook?

@wking
Copy link
Contributor

wking commented Nov 4, 2015

On Wed, Nov 04, 2015 at 10:58:46AM -0800, Michael Crosby wrote:

ya, but why won't you name the directories as random ids? you
shouldn't run multiple containers in the same dir because they are
going to overwrite state.

I can imagine read-only bundles that have no writable state inside the
bundle directory. Is there any reason to require a writable bundle
directory?

@mrunalp
Copy link
Contributor

mrunalp commented Nov 4, 2015

@LK4D4 That's a fair point :)

@marcosnils
Copy link
Contributor

Currently we're using the working directory of the runc process as the container ID, that's not good to identify, and you can't start two container in the same directory, or different path with the same directory name.

Isn't the --id option there for this use case?. I like the fact that by convention it takes the name of the dir where you're standing and that if you need a new container you can just use the id flag.

@crosbymichael
Copy link
Member

yes, you can use the id flag here for this

@hqhq
Copy link
Contributor Author

hqhq commented Nov 5, 2015

@LK4D4 @mrunalp Since we are going to support '--bundle', it'll be a common problem how people know what ID the container has because it'll be common that people left the directory and forgot where there were when there create the container. It's fixable in many ways, if they got shell in container, can get the ID from cat /proc/self/cgroup, and we can consider print the ID to stdout after create the container, or consider support command like runc ls.

@hqhq
Copy link
Contributor Author

hqhq commented Nov 5, 2015

@marcosnils @crosbymichael Yes we can use --id, I'm trying to change the default naming rule, random ID is a more unified naming style than directory names. Several people asked me why their containers named as test or myrunc, it's not easy for people to figure out that's the directory name where they create the runc container. If we use random ID, I think that'll cause no confusing.

And if people don't like random IDs, they can use --id as well.

@wking
Copy link
Contributor

wking commented Nov 5, 2015

On Wed, Nov 04, 2015 at 05:34:04PM -0800, Qiang Huang wrote:

… consider support command like runc ls.

This is just ‘ls /run/opencontainer/containers’, no?

I'd forgotten runC supports ‘--id’, so I'm back to not caring what the
runtime's default rule is ;).

@hqhq
Copy link
Contributor Author

hqhq commented Nov 5, 2015

This is just ‘ls /run/opencontainer/containers’, no?

@wking Could be more complicated than that, because we also support --root.

@crosbymichael
Copy link
Member

I don't think random ids at this level will work. Higher level tools should be providing these ids and it only uses the dir name as a default when --id is not specified.

@mrunalp
Copy link
Contributor

mrunalp commented Nov 9, 2015

Agree with @crosbymichael. We should handle this at a higher level.

@marcosnils
Copy link
Contributor

@crosbymichael @mrunalp 👍 agree.

@hqhq
Copy link
Contributor Author

hqhq commented Nov 10, 2015

All right, let's keep runc simple as it was on this, thanks for your comments, I'm closing it.

@hqhq hqhq closed this Nov 10, 2015
@hqhq hqhq deleted the hq_add_container_id branch April 27, 2016 01:14
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
# digest/hashing target

Most of this has spun off with [1], and I haven't heard of anyone
talking about verifying the on-disk filesystem in a while.  My
personal take is on-disk verification doesn't add much over serialized
verification unless you have a local attacker (or unreliable disk),
and you'll need some careful threat modeling if you want to do
anything productive about the local attacker case.  For some more
on-disk verification discussion, see the thread starting with [2].

# distributable-format target

This spun off with [1].

# lifecycle target

I think this is resolved since 7713efc (Add lifecycle for containers,
2015-10-22, opencontainers#231), which was committed on the same day as the ROADMAP
entry (4859f6d, Add initial roadmap, 2015-10-22, opencontainers#230).

# container-action target

Addressed by 7117ede (Expand on the definition of our ops,
2015-10-13, opencontainers#225), although there has been additional discussion in
a7a366b (Remove exec from required runtime functionalities,
2016-04-19, opencontainers#388) and 0430aaf1 (Split create and start, 2016-04-01,
opencontainers#384).

# validation and testing targets

Validation is partly covered by cdcabde (schema: JSON Schema and
validator for `config.json`, 2016-01-19, opencontainers#313) and subequent JSON
Schema work.  The remainder of these targets are handled by ocitools
[3].

# printable/compiled-spec target

The bulk of this was addressed by 4ee036f (*: printable documents,
2015-12-09, opencontainers#263).  Any remaining polishing of that workflow seems
like a GitHub-issue thing and not a ROADMAP thing.  And publishing
these to opencontainers.org certainly seems like it's outside the
scope of this repository (although I think that such publishing is a
good idea).

[1]: https://github.com/opencontainers/image-spec
[2]: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/NHpSQ19KCAAJ
     Subject: OCI Bundle Digests Summary
     Date: Wed, 14 Oct 2015 17:09:15 +0000
     Message-ID: <CAD2oYtN-9yLLhG_STO3F1h58Bn5QovK+u3wOBa=t+7TQi-hP1Q@mail.gmail.com>
[3]: https://github.com/opencontainers/ocitools

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Restore the line removed by be59415 (Split create and start,
2016-04-01, opencontainers#384).  Without this, GitHub renders the list as a single
paragraph.

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
The shorter-than-normal (for the rest of this list) indent landed with
the line in be59415 (Split create and start, 2016-04-01, opencontainers#384).

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
proc(5) describes the following state entries in proc/[pid]/stat [1]
(for modern kernels):

* R Running
* S Sleeping in an interruptible wait
* D Waiting in uninterruptible disk sleep
* Z Zombie
* T Stopped (on a signal)
* t Tracing stop
* X Dead

and ps(1) has a bit more context [2] (for modern kernels):

* D uninterruptible sleep (usually IO)
* R running or runnable (on run queue)
* S interruptible sleep (waiting for an event to complete)
* T stopped by job control signal
* t stopped by debugger during the tracing
* X dead (should never be seen)
* Z defunct ("zombie") process, terminated but not reaped by its
  parent

So I expect "stopped" to mean "process still exists but is paused,
e.g. by SIGSTOP".  And I expect "exited" to mean "process has finished
and is either a zombie or dead".

After this commit, 'git grep -i stop' only turns up the "stopped"
state (which I've left alone for backwards compat), some poststop-hook
stuff, a reference in principles.md, a "stoppage" in LICENSE, and some
ChangeLog entries.

Also replace "container's process" with "container process" to match
usage in the rest of the repository.  After this commit:

  $ git grep -i "container process" | wc -l
  20
  $ git grep -i "container's process" | wc -l
  1

Also reword status entries to avoid "running", which is less precise
in our spec (e.g. it also includes "sleeping", "waiting", ...).

Also removes a "them" leftover from a partial plural -> singular
reroll of be59415 (Split create and start, 2016-04-01, opencontainers#384).

[1]: http://man7.org/linux/man-pages/man5/proc.5.html
[2]: http://man7.org/linux/man-pages/man1/ps.1.html

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Since be59415 (Split create and start, 2016-04-01, opencontainers#384), it's
possible for a container process to never execute user-specified code
(e.g. you can call 'create', 'kill', 'delete' without calling
'start').  For folks who expect to do that, there's no reason to
define process.args.

The only other process property required for all platforms is 'cwd',
but the runtime's idler code isn't specified in sufficient detail for
the configuration author to have an opinion about what its working
directory should be.

On Linux and Solaris, 'user' is also required for 'uid' and 'gid'.  My
preferred approach here is to make those optional and define defaults
[1,2]:

  If unset, the runtime will not attempt to manipulate the user ID
  (e.g. not calling setuid(2) or similar).

But the maintainer consensus is that they want those to be explicitly
required properties [3,4,5].  With the current spec, one option could
be to make process optional (with the idler's working directory
unspecified) for OSes besides Linux and Solaris.  On Windows, username
is optional, but that was likely accidental [6].

So an unspecified 'process' would leave process.cwd and process.user
unset.  What that means for the implementation-defined container
process between 'create' and 'start' is unclear, but clarifying how
that is handled is a separate issue [7] independent of whether
'process' is optional or not.

[1]: opencontainers/runtime-spec#417 (comment)
[2]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/DWdystx5X3A
     Subject: Exposing platform defaults
     Date: Thu, 14 Jan 2016 15:36:26 -0800
     Message-ID: <20160114233625.GN6362@odin.tremily.us>
[3]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-05-04-17.00.log.html#l-44
[4]: opencontainers/runtime-spec#417 (comment)
[5]: opencontainers/runtime-spec#417 (comment)
[6]: opencontainers/runtime-spec#618 (comment)
[7]: opencontainers/runtime-spec#700

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
I expect the lifecycle information was removed accidentally in
be59415 (Split create and start, 2016-04-01, opencontainers#384), because for a
time it seemed like that PR would also be removing hooks.  Putting the
lifecycle information back in, I made some tweaks to adjust to the new
environment, for example:

* Put the pre-start hooks after the 'start' call, but before the meat
  of the start call (the container-process exec trigger).  Folks who
  want a post-create hook can add one with that name.  I'd like to
  have renamed poststop to post-delete to avoid confusion like [1].
  But the motivation for keeping hooks was backwards compatibility [2]
  so I've left the name alone.

* Put each "...command is invoked..." lifecycle entry in its own list
  entry, to match the 'create' list entry.

* Move the rules about what happens on hook failure into the
  lifecycle.  This matches pre-split entries like:

    If any prestart hook fails, then the container MUST be stopped and
    the lifecycle continues at step 7.

  and avoids respecifying that information in a second location
  (config.md).

* I added the warning section to try and follow post-split's generic
  "generates an error" approach while respecting the pre-split desire
  to see what failed (we had "then an error including the exit code
  and the stderr is returned to the caller" and "then an error is
  logged").

* I left the state 'id' context out, since Michael didn't want it [3].

* Make runtime.md references to "generate an error" and "log a
  warning" links, so readers have an easier time finding more detail
  on that wording.

Where I reference a section, I'm still using the auto-generated anchor
for that header and not the anchors which were added in 41839d7 (Merge
pull request opencontainers#707 from mrunalp/anchor_tags, 2017-03-03) and similar.
Mrunal suggested that the manually-added anchors were mainly intended
for the validation tooling [4].

[1]: opencontainers/runtime-spec#395
     Subject: Run post-stop hooks before the container sandbox is deleted.
[2]: opencontainers/runtime-spec#483 (comment)
     Subject: *: Remove hooks
[3]: opencontainers/runtime-spec#532 (comment)
     Subject: Restore hook language removed by create/start split
[4]: http://ircbot.wl.linuxfoundation.org/eavesdrop/%23opencontainers/%23opencontainers.2017-03-03.log.html#t2017-03-03T18:02:12

Signed-off-by: W. Trevor King <wking@tremily.us>
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Step 3 of the lifecycle from before this commit had two sentences
which both landed in be59415 (Split create and start, 2016-04-01,
opencontainers#384).  I pushed back a bit on the entry then [1,2], but we seem to be
pretty comfortable with the current "keep all lifecyle entries in a
one-layer enumerated list" approach, so I'm leaving that alone in this
commit.  Step 3 isn't really a lifecycle step though, it's more about
clarifying that you can jump around in the lifecycle instead of
hitting all the steps in consecutive order.  I'd floated a new
paragraph addressing that jumping, but was unable to form a consensus
around wording, and the jumping is already somewhat covered by the
current list entries (e.g. "The container process exits.").  This
commit just drops the old step 3, and Michael will follow up with
wording about jumping [3].

The other sentence from the old step 3 doesn't need replacing, because
the limits are already covered in more detail in the operation
sections themselves.  For example, the 'delete' operation has:

  Attempting to delete a container that does not exist MUST generate
  an error.  Attempting to delete a container whose process is still
  running MUST generate an error.

I don't see the need to call generic attention to that idea, and
especially do not think that an entry in the lifecycle list is the
right place for such a generic call-out.

[1]: opencontainers/runtime-spec#384 (comment)
[2]: opencontainers/runtime-spec#384 (comment)
[3]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2017/opencontainers.2017-05-10-21.03.log.html#l-79

Signed-off-by: W. Trevor King <wking@tremily.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants