Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

Implement the runc events command #186

Closed
jodh-intel opened this issue Apr 5, 2018 · 12 comments
Closed

Implement the runc events command #186

jodh-intel opened this issue Apr 5, 2018 · 12 comments
Assignees
Labels
limitation Issue cannot be resolved

Comments

@jodh-intel
Copy link
Contributor

Originally raised as clearcontainers/runtime#379.

@jodh-intel jodh-intel added the limitation Issue cannot be resolved label Apr 5, 2018
@jshachm
Copy link
Member

jshachm commented Apr 19, 2018

@WeiZhang555 @jodh-intel
may be I can help fix this issue

@WeiZhang555
Copy link
Member

@jshachm Please go ahead!

If anyone is already working on this, please let us know to avoid waste of someone's work 😄

@grahamwhaley
Copy link
Contributor

As per clearcontainers/runtime#379, I'd be really happy to see a rough idea of what we can/will support and how first. Sure, that can be via an RFC PR, or just scribble some stuff down here for us to discuss :-)
I'll see if I can assign this to you @jshachm - if I cannot, I will assign it to me (whilst you work on it), so that nobody else picks it up :-)

@grahamwhaley grahamwhaley self-assigned this Apr 19, 2018
@WeiZhang555
Copy link
Member

@grahamwhaley Thanks!

@jshachm is my colleague so I will push him on this, and make sure this can be done before release 1.0.0!

@sboeuf
Copy link

sboeuf commented Apr 19, 2018

I agree with @grahamwhaley, a proposal to explain what you plan doing (as a comment of this Github issue) should be the starting point. It's not worth spending time on the implementation if we don't agree on how to proceed first ;)

@jshachm
Copy link
Member

jshachm commented Apr 20, 2018

@sboeuf @grahamwhaley
Thanks a lot. Already begin to code first RFC PR.
First of my work is to get stats from container in sandbox and display it.CPU and Memory will be fully supported firstly as a POC and then other objects.
And I have a little question as below:
To implementation the events cli ,I think StatsContainer API which get stats from container is needed for the work as runc doing. What's your suggestions?

@grahamwhaley
Copy link
Contributor

Hi @jshachm I'll let @sboeuf comment on the StatsContainer API etc.
Looking forward to the RFC. I have always contemplated if we should extract the data from inside the container (some we have to, like PIDS I think), or outside the container (such as memory stats - should we ask the container kernel, or report what the VM is consuming on the host side).
And then having a look at the runc code, I suspect the network measurements are done by docker on the host side already - so, that is probably out of the scope of the 'events command' already - we'll have to see if docker manages to account for our different network setup for the VMs - in my quick local test it seemed to account for something, which surprised me :-)

@egernst
Copy link
Member

egernst commented Apr 20, 2018

Nice -- I wonder if we should rephrase this issue to something more like, "support CRI core metrics". Or, "make kata work with prometheus" -- I think this is the more important gap that we have in this area. Let me know if I'm missing the idea, though -- I haven't looked into runc events CLI before.

This has been on my radar for a while as an important enabling for Kata in the ecosystem, and saw this as a post release activity. If we have dedicated head to look into this now, then I'm very happy!

A couple of relevant pointers:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto#L83-L87

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/core-metrics-pipeline.md

@tallclair , FYI, as this is something we've discussed briefly in the past.

@jodh-intel
Copy link
Contributor Author

Can we avoid renaming this issue unless we also update the limitations doc (kata-containers/documentation#48) which references it.

@sboeuf
Copy link

sboeuf commented Apr 20, 2018

@jshachm @grahamwhaley @jodh-intel I was looking into events.go from runc and first thing, I agree we will need a StatsContainer() at the Kata API level. Now that being said, I think that for the sake of keeping virtcontainers generic, we will need to do a few things.
Basically here is what I was thinking:

  • We define a structure Stats from virtcontainers, as a pure duplicate of what is defined by libcontainer. But the duplicate will not tie virtcontainers to libcontainer and the agent will do the necessary conversion if needed. Moreover, I am talking about a duplicate and not starting from scratch on that since I think this structure is pretty complete.
  • Now the idea is that our CLI implementing events.go will get a type virtcontainers.Stats that will have to be converted the same way runc convert the info from libcontainer. events.go being into the CLI, this is the place where we want to make sure the output generated completely fits runc output. And I think that part of the code from runc events.go could be simply copy/pasted.
  • About the internal implementation of StatsContainer() inside virtcontainers, we need to extend the agent interface with a statsContainer().
  • Last thing, we need the kata agent protocol to be updated in order to add a new StatsContainer to the protocol. The same PR on the agent will have to implement this new function, by calling the simple libcontainer.container.Stats() function. This should be pretty straightforward.

@jshachm
Copy link
Member

jshachm commented Apr 21, 2018

@sboeuf tough day today and now finally have time to take a closer look at your suggestions.

We define a structure Stats from virtcontainers, as a pure duplicate of what is defined by libcontainer. But the duplicate will not tie virtcontainers to libcontainer and the agent will do the necessary conversion if needed.

I can't agree more on this point and this is what my local codes is being organized.

events.go being into the CLI, this is the place where we want to make sure the output generated completely fits runc output.

yeap, just a function like convertVirtcontainerStats to convert virtcontainers.Stats into runc output

Good detail analysis. Thx a lot.

@grahamwhaley as you said if docker manages to account for our different network setup for the VMs , network measurements will not be done in events cli.

@jshachm
Copy link
Member

jshachm commented Apr 24, 2018

First RFC pr will be adding api in agent.
agent api add

zklei pushed a commit to zklei/runtime that referenced this issue Jun 13, 2019
Add a template to guide the user on raising github issues.

Fixes kata-containers#186.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
limitation Issue cannot be resolved
Projects
None yet
Development

No branches or pull requests

6 participants