-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reading plugin stderr read |0: file already closed #116
Comments
Anyone had a chance to look at this? cc @jbardin |
Hi @skriss, I haven't seen this myself yet, and it does appear that Do you have a way to replicate the error, or any other log messages from the go-plugin package? |
@jbardin thanks for taking a look. It's not causing any functional issues other than writing a bunch of errors to the log. It's pretty easily replicated in our project which makes use of go-plugin, though as I said it's sporadic, which implies to me it's probably a race condition. There aren't any other useful logs emitted around the same time. Let me see if I can come up with a simpler repro for you that doesn't involve deploying a Kubernetes cluster :) |
My thought was that the switch statement at https://github.com/hashicorp/go-plugin/blob/master/client.go#L947 should actually be checking for an os.ErrClosed instead of an io.EOF -- looks like in a previous version of the code, only the write end of the pipe being used was closed, which resulted in an io.EOF being returned, but now, with the use of os/exec's StderrPipe, the read end is closed, so you get an os.ErrClosed (well, I think it's actually a wrapped version of that). |
Yes, that switch is where we would need to add the check for I would like to at least understand why we're seeing this before ignoring the error completely however. A repro would be great, but even a description of how you're configuring and using the plugin might help. |
…een invocations With errors like issue 116 [1], terraform restarts the provider server. And with terraform turning on `AutoMTLS` for all plugins [2], the client cannot reattach to already running provider server [3]. Provider's memory only caches creates errors like `invalid file \"49e2cc7ba023d87070d9b49a0c8f02ff0f63b7f61b390fbce8f64b011bd42ed7\", unknown file id` because the ignition_config resource tried to retrived a file id stored in cache before provider server was restarted. This stores the objects in `diskv` store in users' cache directory, retireving it when the memory cache is not populated. The in-memory cache keeps providing the no overhead access to objects. [1]: hashicorp/go-plugin#116 [2]: hashicorp/terraform#19560 [3]: https://github.com/hashicorp/go-plugin/blob/5692942914bbdbc03558fde936b1f0bc2af365be/client.go#L203-L204
…een invocations With errors like issue 116 [1], terraform restarts the provider server. And with terraform turning on `AutoMTLS` for all plugins [2], the client cannot reattach to already running provider server [3]. Provider's memory only caches creates errors like `invalid file \"49e2cc7ba023d87070d9b49a0c8f02ff0f63b7f61b390fbce8f64b011bd42ed7\", unknown file id` because the ignition_config resource tried to retrived a file id stored in cache before provider server was restarted. This stores the objects in `diskv` store in users' cache directory, retireving it when the memory cache is not populated. The in-memory cache keeps providing the no overhead access to objects. [1]: hashicorp/go-plugin#116 [2]: hashicorp/terraform#19560 [3]: https://github.com/hashicorp/go-plugin/blob/5692942914bbdbc03558fde936b1f0bc2af365be/client.go#L203-L204
…een invocations With errors like issue 116 [1], terraform restarts the provider server. And with terraform turning on `AutoMTLS` for all plugins [2], the client cannot reattach to already running provider server [3]. Provider's memory only caches creates errors like `invalid file \"49e2cc7ba023d87070d9b49a0c8f02ff0f63b7f61b390fbce8f64b011bd42ed7\", unknown file id` because the ignition_config resource tried to retrived a file id stored in cache before provider server was restarted. This stores the objects in `diskv` store in users' cache directory, retireving it when the memory cache is not populated. The in-memory cache keeps providing the no overhead access to objects. [1]: hashicorp/go-plugin#116 [2]: hashicorp/terraform#19560 [3]: https://github.com/hashicorp/go-plugin/blob/5692942914bbdbc03558fde936b1f0bc2af365be/client.go#L203-L204
@abhinavdahiya are you seeing this error as well? |
@jbardin it looks like if you add a I implemented a super-simple fix in the following branch, that seems to prevent the issue from happening (and hopefully clearly demonstrates the race): https://github.com/hashicorp/go-plugin/compare/master...skriss:fix-stderr-race?expand=1. Happy to submit it as a PR if you like it as-is, otherwise I can change the approach as needed. |
Oh yes, thanks for tracking that down! The logger might not have entered the I think the solution here seems reasonable. I would just suggest that we re-organize the code here a bit so that the |
This makes sense -- the only issue I see here is that we want to call |
@skriss: yes, that's the minor reorganization I had in mind. The order the goroutines are declared shouldn't matter, as long as the waitgroup is setup correctly and we're not changing anything in the closure. |
…#386) * 1 - attempt to fix stream closing issues in kelpos.Blocking() * 2 - only close manually created pipes, not Std* pipes * 3 - the actual fix, Wait() after Read() see: hashicorp/go-plugin#116 (comment) * 4 - remove IPC server and components since that is causing an explosion of pipes/files and is also unused closes #385
… (#386) * 1 - attempt to fix stream closing issues in kelpos.Blocking() * 2 - only close manually created pipes, not Std* pipes * 3 - the actual fix, Wait() after Read() see: hashicorp/go-plugin#116 (comment) * 4 - remove IPC server and components since that is causing an explosion of pipes/files and is also unused closes #385
… (#386) * 1 - attempt to fix stream closing issues in kelpos.Blocking() * 2 - only close manually created pipes, not Std* pipes * 3 - the actual fix, Wait() after Read() see: hashicorp/go-plugin#116 (comment) * 4 - remove IPC server and components since that is causing an explosion of pipes/files and is also unused closes #385
I'm seeing intermittent errors being logged from go-plugin with a message of "reading plugin stderr", and an error of "read |0: file already closed". This seems to happen when plugins are terminated.
In looking at https://golang.org/src/os/exec/exec.go?s=17633:17682#L601, and
client.go
, it seems like there may be a race condition where the reader side of the pipe is closed after the command exits, but the go-plugin client is still trying to read from it. The client'slogStderr
is looking for anio.EOF
error, but in this case I think it's actually getting some variant of theos.ErrClosed
error. I'm not familiar enough with the code to understand exactly what's going on/what should be changed, though. Hope this can point someone in the right direction!The text was updated successfully, but these errors were encountered: