
event monitor causes infinite memory growth #165

Closed
minrk opened this issue Apr 23, 2018 · 9 comments

@minrk
Member

minrk commented Apr 23, 2018

The event reflector in #150 causes a memory leak, which brought down Binder last week.

I suspect we aren't properly cleaning up the reflector when we are done with it. We need to investigate and fix this before releasing kubespawner or pushing it to zero-to-jupyterhub.

@yuvipanda
Collaborator

@minrk it isn't being used for anything right now, so I reverted it in #166

@minrk
Member Author

minrk commented Apr 23, 2018

I tracked it down. Instantiating a Watch unconditionally instantiates an APIClient, which unconditionally spawns n_cpus threads. So when we create one reflector per pod, we are creating a huge number of threads.

It's the same as the kube-4.0 upgrade bug, but isolated to a more specific circumstance.
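
A minimal sketch of the effect, assuming the kubernetes 4.x client behavior described above (each `Watch()` eagerly builds its own `ApiClient`, whose thread pool starts one worker per CPU):

```python
# Sketch: watch the thread count grow as Watch objects are created.
# Assumes the kubernetes 4.x client, where Watch() constructs an ApiClient
# backed by a multiprocessing ThreadPool with one worker per CPU.
import threading

from kubernetes import watch

print("threads before:", threading.active_count())

# One reflector per pod means one Watch per pod, and therefore
# n_cpus extra threads per pod that are never reclaimed.
watchers = [watch.Watch() for _ in range(10)]

print("threads after:", threading.active_count())
```

Every Watch that is never cleaned up keeps its worker threads alive, which is where the unbounded memory growth comes from.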

@minrk
Member Author

minrk commented Apr 23, 2018

It does provide debug logging, and is used in #153. But I agree that we should probably revert it for now, or pin kubernetes-3 again.

@minrk
Member Author

minrk commented Apr 23, 2018

I opened a PR against swagger-codegen, which is responsible for the flood of threads.

@yuvipanda
Collaborator

@minrk let's just revert it and reintroduce it when it's needed. Does that sound right, @clkao?

@clkao
Contributor

clkao commented Apr 23, 2018

Yeah, please revert it for now, and I'll make it part of #153. Is there an alternative way to make the reflector sane before Watch() gets fixed in swagger-codegen?

@minrk
Member Author

minrk commented Apr 23, 2018

@clkao thanks. One option is to pin the kubernetes client back to 3.x, which doesn't have this issue. We might also be able to find a workaround: I don't think we're reliably cleaning these reflectors up, and fixing that could help.
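
For reference, a stopgap sketch of that cleanup. It leans on internals of the 4.x client (the private `_api_client` attribute and its `pool`), so it's illustrative, not a supported API:

```python
# Stopgap sketch: stop a Watch and release its ApiClient's worker threads.
# Relies on kubernetes 4.x internals (_api_client, pool); illustrative only.
from kubernetes import watch

w = watch.Watch()
try:
    pass  # ... consume w.stream(...) events here ...
finally:
    w.stop()  # ask the streaming loop to exit
    pool = getattr(getattr(w, "_api_client", None), "pool", None)
    if pool is not None:
        pool.terminate()  # reclaim the per-CPU worker threads
```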

@clkao
Contributor

clkao commented Apr 24, 2018

Hmm, I think the reflectors are stopped once the pod is running. Maybe something is still missing.

@consideRatio
Member

> I tracked it down. Instantiating a Watch unconditionally instantiates an APIClient, which unconditionally spawns n_cpus threads. So when we create one reflector per pod, we are creating a huge number of threads.

The reflectors we use are now singletons, so I'm assuming this is resolved.
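
A minimal sketch of that singleton idea (names are illustrative, not kubespawner's actual classes): every spawner shares one reflector, so only one Watch, and therefore one ApiClient thread pool, exists per process.

```python
# Illustrative sketch of a process-wide shared reflector; not kubespawner's
# actual implementation. One Watch total means one ApiClient thread pool total.
import threading

from kubernetes import watch


class SharedReflector:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def instance(cls, namespace="default"):
        # Hand every spawner the same reflector instead of one per pod.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(namespace)
            return cls._instance

    def __init__(self, namespace):
        self.namespace = namespace
        self.watch = watch.Watch()  # the only Watch this process creates
```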
