Container identifiers should be globally unique #199
Internally we use RFC4122 UUIDs for identifying pods. Any objections to making this part of the pod setup? I guess it would really be a string (like "id") but with the strong suggestion that it be an encoded UUID. Or we could use docker-style 256 bit randoms, but that might get confusing. If we further lock down container names to RFC1035 labels, we can use . as the docker container name, which seems much nicer than the current dashes and underscores :) What think? |
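As an illustration of what locking container names down to RFC 1035 labels could mean in practice, here is a minimal validation sketch. The function name `is_rfc1035_label` is hypothetical and not taken from the Kubernetes code base; the rule it encodes (starts with a letter, ends with a letter or digit, only letters, digits and hyphens in between, at most 63 characters) is the label grammar from RFC 1035.

```python
import re

# RFC 1035 label grammar: first character is a letter, last character is a
# letter or digit, interior characters are letters, digits, or hyphens.
_LABEL_RE = re.compile(r"[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?")
MAX_LABEL_LENGTH = 63  # RFC 1035 limits a label to 63 octets


def is_rfc1035_label(name: str) -> bool:
    """Return True if `name` is a syntactically valid RFC 1035 label.

    Hypothetical helper for illustration only.
    """
    return len(name) <= MAX_LABEL_LENGTH and _LABEL_RE.fullmatch(name) is not None
```

Names such as `web-1` pass, while `1web`, `web-`, and `my_app` are rejected, which is what makes a validated name safe to use as one component of a dotted Docker container name.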
Both points sound great to me. These encoded names are just plain ugly:) Id is also a required field (network containers break otherwise), so it seems weird to have them marked as 'omitempty'. This is probably more an issue for config files than anything else. |
Was going to get familiar and try to fix this - sounds like the suggestion is to add "ID string" to Container, and fail PodRegistryStorage.Create() if Container.ID is empty? Or should PodRegistryStorage.Create() populate unset DesiredState.Manifest.Containers[].ID that are empty? Latter seems more flexible (server controls default UUID generation for clients) |
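The second option above (the server fills in any container ID the client left empty) might look roughly like the sketch below. It is written in Python for brevity rather than the project's Go, the pod is a plain dict, and the field names and the function `assign_missing_container_ids` are assumptions made for illustration, not the real API schema.

```python
import uuid


def assign_missing_container_ids(pod):
    """Give every container without an ID a random RFC 4122 (version 4) UUID.

    `pod` is a plain dict standing in for the pod resource; client-supplied
    IDs are left untouched. Field names here are illustrative assumptions.
    """
    for container in pod["desiredState"]["manifest"]["containers"]:
        if not container.get("id"):
            container["id"] = str(uuid.uuid4())
    return pod


pod = {
    "desiredState": {
        "manifest": {
            "containers": [
                {"name": "web", "id": "client-chosen-id"},
                {"name": "sidecar"},  # no ID: the server generates one
            ]
        }
    }
}
assign_missing_container_ids(pod)
```

Defaulting on the server keeps a plain `curl` workflow possible while still letting clients that care pick their own identifier.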
I started in on some validation logic this morning. It's not clear whether unique ID is something users should have to spec or Is that in the same vein as you were thinking?
|
As an API consumer I like not having to specify things that the server can do for me - having to generate a UUID on the command line to curl a new pod into existence feels wrong. I started here but didn't pass down to kubelet yet. |
One thing to note with global unique identifier is static containers. Today On Wed, Jun 25, 2014 at 7:31 PM, Clayton Coleman notifications@github.com
|
Static as in "defined on each host via a config file"? Would it make sense for the Kubelet to auto assign a UUID for containers pulled from files based on the host MAC and the position in the file (or a SHA1 of the contents of the manifest plus the host MAC)? |
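A rough sketch of the "SHA1 of the manifest contents plus the host MAC" variant follows. The function name and the exact way the two inputs are combined are assumptions; the point is only that the same file on the same host always yields the same ID without any central state.

```python
import hashlib
import uuid


def static_container_id(host_mac: str, manifest_text: str) -> str:
    """Derive a deterministic UUID from the host MAC and manifest contents.

    Hypothetical helper: the MAC is lower-cased so that formatting
    differences do not change the result, then hashed with the manifest.
    """
    material = f"{host_mac.lower()}\n{manifest_text}".encode("utf-8")
    digest = hashlib.sha1(material).digest()  # 20 bytes
    # Use the first 16 bytes and stamp the RFC 4122 variant and version 5
    # (the SHA-1 based UUID version) onto them.
    return str(uuid.UUID(bytes=digest[:16], version=5))
```

One consequence of hashing the contents is that editing the manifest changes the ID, so the edited file is treated as a new container rather than an update of the old one.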
Are you setting it up by config file? Can we generate a new uuid when we write the config file, or do we want it We could do something like: if Kubelet finds a config without a uuid, it will Internally we go one step further and define "master space" where each Do we need to go that far? UUIDs are notoriously human-hostile.
|
Yes, it is a static file we leave on the machine. I don't think we have the On Wed, Jun 25, 2014 at 7:43 PM, Clayton Coleman notifications@github.com
|
To be clearer on masterspace. Each new pod gets fields such as: Masterspace: kubernetes.google.com
|
I want the container names from the different sources to have different namespaces, so there won't be any collisions. E.g., kubelet prepends/appends ".etcd" (or something) to etcd-sourced containers, ".cfg" to containers from the config file, ".http" to containers from the manifest url, etc. Then it is up to each source to stay unique. Api server can stay unique via uuid or counting up. Config files & manifest url stay unique by humans not screwing up; kubelet rejects them otherwise. This has the nice effect that the container name produced from a config file is predictable without having to do a lookup. This would be good for our own container vm image and anything like it. |
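The per-source namespacing described above can be sketched as follows. The suffixes are the ones suggested in the comment; the source keys, the function name `qualify_names`, and the choice to raise on duplicates are illustrative assumptions.

```python
# Suffix per container source, as floated in the comment above.
SOURCE_SUFFIXES = {"etcd": ".etcd", "file": ".cfg", "url": ".http"}


def qualify_names(source, names):
    """Append the source suffix to each name, rejecting duplicates.

    Each source only has to stay unique within itself; the suffix keeps
    names from different sources from colliding. Hypothetical helper.
    """
    suffix = SOURCE_SUFFIXES[source]
    seen = set()
    qualified = []
    for name in names:
        if name in seen:
            raise ValueError(
                f"duplicate container name {name!r} from source {source!r}"
            )
        seen.add(name)
        qualified.append(name + suffix)
    return qualified
```

With this scheme `cadvisor` from a config file always becomes `cadvisor.cfg`, so the resulting name is predictable without a lookup.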
This started structured and turned into a stream-of-consciousness, sorry. I agree that unique names are desirable, but I'm not sure "etcd" and The reason we have masterspace internally is because we also need to attach Here's what I think I have convinced myself of, so far.
This does not have the property that Daniel wants - predictable container That said, I could maybe be convinced. If we put the rule that the pod E.g. Pod { ...would be created with container name If the apiserver did not care about phantoms, it could leave off the I still have a vague foreboding about making the uniqueness be the master's What if the rule is that masterspace is optional. If not specified, Pod { ...would be created with container name cadvisor.cadvisor.file.<host_fqdn> This still sort of sucks in that you can't aggregate by masterspace, e.g. Thoughts? I could go either way (UUIDs or masterspace + unique name). Or Tim On Wed, Jun 25, 2014 at 8:09 PM, Daniel Smith notifications@github.com
|
Just to be clear, I only think that the property of predictable names is desirable for containers that come from the manifest url and maybe the container, because without that property, I don't know if we want to wait to solve naming in general before we accept a better solution than what we do currently. |
I could put my weight behind either approach. The problem with UUIDs (be On Wed, Jun 25, 2014 at 10:12 PM, Daniel Smith notifications@github.com
|
"I feel like this got complicated pretty fast." Yup:) [I think I've read & parsed all of the comments, but I could be mistaken] I like the idea of DNS-styled hierarchical namespaces. I'm not super worried about multiple masters scheduling over the same cluster of minions, but it does make multi-tenant masters easier (i.e. a provider is running a master, and customers run their minions). It's also something that people are used to. I think it's also reasonable to require that users produce unique names for managing in the system. Machine-generated unique IDs are more useful for running containers (i.e. a global PID). Having this be globally unique is really handy for things like log aggregation, etc. Why not always have the kubelet generate that? The master can learn it when it lists the running containers. It's worth noting that the only restriction of docker names is that they are unique to the host. We can still encode useful human data (manifest+container names) in addition to the unique ID into the name...even if we only parse out the unique ID! FWIW, I'm just polishing my ID cleanups to remove dead (or dying) code. My goal was to normalize on docker IDs for many/most things inside the kubelet. I should have it out for review tomorrow. |
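One way to read the suggestion of encoding human-readable data plus a unique ID into the Docker name, while parsing back only the ID, is sketched below. The layout (`<manifest>.<container>_<hex uuid>`) and both function names are assumptions chosen for illustration; the thread does not settle on a format.

```python
import uuid


def build_docker_name(manifest_name, container_name, unique_id):
    """Compose a Docker name from human-readable parts and a unique ID.

    The ID is written as 32 hex digits so it contains no separator
    characters. The layout is an assumption for illustration.
    """
    return f"{manifest_name}.{container_name}_{unique_id.hex}"


def parse_unique_id(docker_name):
    """Recover only the unique ID; the human-readable prefix is ignored."""
    # Split on the last underscore so underscores in the names are harmless.
    _, _, hex_id = docker_name.rpartition("_")
    return uuid.UUID(hex=hex_id)
```

Because only the trailing ID is parsed, the human-readable prefix can change shape later without breaking consumers that key on the ID.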
Also, clarifying some terminology might be handy. Here's how I think of things (could easily be wrong!):
Things that confuse me:
|
On Wed, Jun 25, 2014 at 10:32 PM, Justin Huff notifications@github.com wrote:
I'm more worried about "masters" that are config files crafted by
Here "users" == apiserver?
Yeah, this is doable (and is in fact closer to what we do internally)
I think it is useful, but not critical that docker-reported names be
I'm keenly interested in this, and I'd like to stabilize it all ASAP.
|
On Wed, Jun 25, 2014 at 11:07 PM, Justin Huff notifications@github.com wrote:
The names as defined in pkg/api are somewhat confusing. It all From Kubelet's POV ContainerManifest == Pod.
Yes, let's settle the discussion about UUIDs vs Names before we rename
I think this was to leave room for growth. Kubelet should do the same BTW: I have a change pending (not sent yet) to do validation of a
|
Whew! This is long. This issue started with no context about what we're trying to do. Starting with the end: Yes, we should clean up the identifiers. Currently, every object/resource includes JSONBase, which has an ID (which probably should be Id). ContainerManifest also has an Id. Container has Name. Port has Name. I'll point out that Container and Port are not standalone resources right now. ContainerManifest is the way it is due to compatibility with the container-vm release. What are we trying to do? I saw several things mentioned in this issue:
Both unique identifiers and human-friendly names have value. Human-friendly names can only be unique in space, not in time. We should use "Id" for the former and "Name" for the latter. Names could be used to ensure idempotent (at most once) creation, though a non-indexed resource value could be used for that, also. The replicationController would treat "Name" in the template as a prefix and would append relatively short random numbers for uniqueness. Static pods could be treated similarly. Label selectors should be used for set formation / aggregation. If users are providing Names, we could also use them for DNS (#146). I'd use "domain" rather than "masterspace". Argh, my laptop needs to be rebooted... |
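The idea that a replicationController treats the template "Name" as a prefix and appends a short random suffix could be sketched like this. The function name, the hyphen separator, the hex encoding, and the suffix length are all assumptions; the comment above only says "relatively short random numbers".

```python
import secrets


def replica_name(template_name, existing, suffix_bytes=3):
    """Return `template_name` plus a short random suffix not yet in use.

    `existing` is the set of names already taken; it is updated in place.
    Retries on the (unlikely) collision. Hypothetical helper.
    """
    while True:
        candidate = f"{template_name}-{secrets.token_hex(suffix_bytes)}"
        if candidate not in existing:
            existing.add(candidate)
            return candidate
```

The names stay unique in space at any moment, but nothing stops a suffix from being reused after a pod is gone, which matches the point above that human-friendly names are not unique in time.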
Continuing... I agree we want a mechanism that permits unique id allocation by Kubelets and that doesn't require centralized and/or persistent state. I like UUIDs. There are the issues of ensuring unique MAC addresses in VMs and/or namespaces, and determinism for testing, but I think we've found ways around these issues. I've considered not having human-friendly names for pods before and just using labels instead. Labels are predictable and human-friendly, but don't require uniqueness, and don't require concatenating lots of identifying info together in order to ensure uniqueness, which users WILL do for names (and they'll want to parameterize them). They aren't short, though. Also, I guess part of the problem is that Docker doesn't support labels. We should push on that. Idempotence could be ensured by a client-generated cookie, such as a fingerprint/hash or PR number. It wouldn't be required for static configs. DNS names for instances aren't super-useful if they aren't predictable and DNS-like human-friendly names aren't that friendly if they are long. Short nicknames don't have to be predictable (so they could have a short uniquifying suffix) and don't even need to be semantically relevant -- just memorable (hence Docker's silly auto-generated names, I guess). Services need predictable DNS names, OTOH, and ports need predictable names (for DNS SRV lookup or ENV vars or whatever), and pod-relative hostnames of containers should be predictable, so they can communicate with each other, though I don't know that they actually need to be FQDNs. Other types of services (e.g., master-elected services) and groups will need predictable DNS names, also. Is the main motivation for names for pods to be consistent across all resource types? If so, I could buy into DNS-like hierarchical names for them. I'd use something like "domain" or "namespace" instead of "masterspace". |
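Idempotent creation via a client-generated cookie, as mentioned above, might work along these lines. `PodStore`, `fingerprint`, and the in-memory dict are hypothetical stand-ins for the apiserver and its storage; the sketch only shows that a retried request with the same cookie maps to the same server-assigned ID.

```python
import hashlib
import json
import uuid


def fingerprint(spec):
    """Client-side cookie: a hash of the canonical JSON form of the spec."""
    canonical = json.dumps(spec, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class PodStore:
    """In-memory stand-in for the server side (illustrative only)."""

    def __init__(self):
        self._id_by_cookie = {}

    def create(self, spec, cookie):
        # A repeated cookie means a retried request: return the existing ID
        # instead of creating a second pod.
        if cookie in self._id_by_cookie:
            return self._id_by_cookie[cookie]
        pod_id = str(uuid.uuid4())
        self._id_by_cookie[cookie] = pod_id
        return pod_id
```

A content fingerprint makes two identical specs indistinguishable, so a client that really wants two copies would need a different cookie, for example one that includes a PR number as suggested above.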
Pod-relative hostnames and stable internal names are definitely valuable - and limiting "name" to rfc1035 subdomain has been extremely valuable in practice to us on OpenShift.
Quasi-uniqueness I assume?
How predictable? As a concrete example, with something like Zookeeper you need the container to have an identifier/name that is stable across restarts / reschedules in a shared config (so not a pod ID). I'd assumed you'd model this with a set of replication controllers (vs a shared controller) so you had 3 replication controllers with 1 item each, and you'd be able to either set an ENV per pod, or use the name in order to apply that. If name changes over pod instances that rules out the reuse there. |
One last thought from me; it occurs to me that docker already generates a long container id. Perhaps we can consider using that directly as our spatial/temporal unique identifier, and use this hypothetical dns-style solution as the friendly human name. We'd need to investigate just how unique docker's IDs are, and we may not want to depend on docker for that, but it would reduce the number of IDs needed. Also, as a footnote, if we end up with both pod.ID and Manifest.ID, IMO they should be the same identifier. |
|
For clarity, let me suggest the convention that "name" means a friendly mostly human readable string, possibly in the style of dns names, and that "id" means an opaque, machine generated identifier, guaranteed unique at some level of resolution. Maybe everyone except for me is already using this convention. :) But I want to talk about ids and names in general without referring to a particular implementation.
I think, in that case, the rep. controller itself has a name, which it can use to make names for the pods it creates (prepend/append indices or something). IDs for the pods could still be generated by the apiserver or wherever we decide they need to be generated. I was trying to say that since there's a 1:1 relationship between pods and manifests, we shouldn't make different IDs for each. |
Trying to collect thoughts and ideas into a proposal. It's tricky I have to run out right now, but I'm hoping we can distill the discussion NB: We spec names as RFC 1035 compatible, though we might extend that to From kubelet's point of view:
Open: Do we need UUIDs at all? For what purpose? The only argument I can From the apiserver's point of view:
Open: Should the unique ID persist if the pod is moved to a new minion? On Thu, Jun 26, 2014 at 9:46 AM, Daniel Smith notifications@github.com
|
Numeric host names: http://tools.ietf.org/html/rfc1123#page-13 |
Regarding the open question: is a pod on a new minion the same as the old pod? Is a move an action that is logically part of kubernetes, or is only "remove" and "create new" available? If it's the former, it seems like the ID should be the same, if it's the latter it seems like it should be different. We have a use case for being able to move a pod from minion to minion (and any volumes that come with it) - however, since this requires the volume data on disk to be in a resting state, it's not an operation that seems to lend itself well to the replication controller (since the move is inherently stateful). So I would expect this to be managed as an operation above the replication controller, vs part of it. One namespace per apiserver can be limiting if the namespace is automatically bound to DNS and you're dealing with very large numbers of containers, but it doesn't sound unreasonable if you are using wildcard DNS. |
One minor note regarding the Docker ecosystem - "name" is currently being used in Docker for a lot of lightweight integrations (linking, skydock DNS, hostname in container). A side effect of any generated name for a Docker container is that those lightweight integrations may become more difficult for end admins. Is there a practical way to make the name appropriately unique on the minion without breaking those potential integrations (making the name a subdomain fragment by omitting '.' for instance)? |
I would also like to see us drive labels all the way down into Docker. That Brendan On Fri, Jun 27, 2014, 8:39 AM, Clayton Coleman notifications@github.com
|
+1 to what brendanburns@ wrote. Labels provide much cleaner solutions for |
On Thu, Jun 26, 2014 at 10:51 PM, Justin Huff notifications@github.com wrote:
This is an implementation detail - human-friendliness of docker's
Yeah, where is the line between "ID" and "name"? The apiserver COULD |
+1 to labels On Fri, Jun 27, 2014 at 8:45 AM, brendanburns notifications@github.com
|
What would break on those? We're not proposing to do anything with name On Fri, Jun 27, 2014 at 8:39 AM, Clayton Coleman notifications@github.com
|
It's not that the names aren't valid for Docker, it's that something consuming those names (as a dns prefix a la skydock, or as a hostname) would break due to length or format on the Kube generated name. I don't think that is a blocker to the proposed naming patterns above, but it's a consideration when thinking about how other software that plays well with Docker might react. Concrete ideas might be to reduce the generated names' length and avoid using '.' as a separator. |
I would argue (for the sake of arguing) that anyone who was making On Sat, Jun 28, 2014 at 2:55 PM, Clayton Coleman notifications@github.com
|
Agree, naming is hard |
Writing up a summary md - does this belong in api/doc, DESIGN.md, or another location? |
I would go for docs/identifiers.md or something On Mon, Jun 30, 2014 at 2:48 PM, Clayton Coleman notifications@github.com
|
Referenced pull includes a summary of this discussion, open questions:
|
A temporary workaround is to seed our random number generators. But really, apiserver should assign a guaranteed-unique identifier upon resource creation.