reading plugin stderr read |0: file already closed #116

Closed
skriss opened this issue May 2, 2019 · 10 comments · Fixed by #118

Comments

skriss (Contributor) commented May 2, 2019

I'm seeing intermittent errors being logged from go-plugin with a message of "reading plugin stderr", and an error of "read |0: file already closed". This seems to happen when plugins are terminated.

In looking at https://golang.org/src/os/exec/exec.go?s=17633:17682#L601, and client.go, it seems like there may be a race condition where the reader side of the pipe is closed after the command exits, but the go-plugin client is still trying to read from it. The client's logStderr is looking for an io.EOF error, but in this case I think it's actually getting some variant of the os.ErrClosed error. I'm not familiar enough with the code to understand exactly what's going on/what should be changed, though. Hope this can point someone in the right direction!
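
For illustration, here is a small standalone program (not from go-plugin; it assumes a Unix-like shell, and the errors.Is checks need Go 1.13+) showing the error class involved: once Wait() has closed the read end of the stderr pipe, a later read returns an error wrapping os.ErrClosed rather than io.EOF.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"os/exec"
)

func main() {
	cmd := exec.Command("sh", "-c", "echo from stderr >&2")
	stderr, err := cmd.StderrPipe()
	if err != nil {
		panic(err)
	}
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// Wait() closes the read end of the stderr pipe once the process exits.
	_ = cmd.Wait()

	buf := make([]byte, 64)
	_, err = stderr.Read(buf)
	fmt.Println(err)                          // e.g. "read |0: file already closed"
	fmt.Println(errors.Is(err, io.EOF))       // false
	fmt.Println(errors.Is(err, os.ErrClosed)) // true: the error wraps os.ErrClosed
}
```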

skriss (Contributor, Author) commented May 6, 2019

Anyone had a chance to look at this? cc @jbardin

jbardin (Member) commented May 6, 2019

Hi @skriss,

I haven't seen this myself yet, and it does appear that Wait should not be called until the logger is done reading. It's not a fatal error (we're only logging it, there's no action taken), but is it causing a problem for you in other ways?

Do you have a way to replicate the error, or any other log messages from the go-plugin package?

skriss (Contributor, Author) commented May 6, 2019

@jbardin thanks for taking a look. It's not causing any functional issues other than writing a bunch of errors to the log.

It's pretty easily replicated in our project which makes use of go-plugin, though as I said it's sporadic, which implies to me it's probably a race condition. There aren't any other useful logs emitted around the same time. Let me see if I can come up with a simpler repro for you that doesn't involve deploying a Kubernetes cluster :)

skriss (Contributor, Author) commented May 6, 2019

My thought was that the switch statement at https://github.com/hashicorp/go-plugin/blob/master/client.go#L947 should actually be checking for os.ErrClosed instead of io.EOF. It looks like in a previous version of the code, only the write end of the pipe being used was closed, which resulted in io.EOF being returned; but now, with the use of os/exec's StderrPipe, the read end is closed, so you get os.ErrClosed (well, I think it's actually a wrapped version of that).
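
For illustration, a check of that shape might look something like the following. This is just a sketch, not the actual logStderr (whose structure differs); errors.Is needs Go 1.13+, and on older toolchains you would unwrap the *os.PathError instead.

```go
package pluginlog

import (
	"bufio"
	"errors"
	"io"
	"os"
	"strings"
)

// drainStderr is illustrative only. It treats both io.EOF (writer end closed)
// and a pipe whose read end was closed underneath it (os.ErrClosed) as normal
// termination, and only logs other errors.
func drainStderr(r io.Reader, logLine func(string)) {
	br := bufio.NewReader(r)
	for {
		line, err := br.ReadString('\n')
		if line != "" {
			logLine(strings.TrimRight(line, "\n"))
		}
		if err == nil {
			continue
		}
		switch {
		case err == io.EOF:
			return // plugin exited and its stderr is fully drained
		case errors.Is(err, os.ErrClosed):
			return // read end closed by cmd.Wait(); also treat as shutdown
		default:
			logLine("reading plugin stderr: " + err.Error())
			return
		}
	}
}
```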

jbardin (Member) commented May 6, 2019

Yes, that switch is where we would need to add the check for os.ErrClosed, but only if that's actually an error we should be expecting there (and it's more likely a PathError we'd have to check for first).

I would at least like to understand why we're seeing this before ignoring the error completely, however. A repro would be great, but even a description of how you're configuring and using the plugin might help.
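
For concreteness, a PathError-first check might look like the sketch below. This is illustrative only, not a proposed client.go change; it is written in the pre-Go-1.13 style, since errors.Is did not exist when this thread was written.

```go
package pluginlog

import "os"

// isClosedPipe reports whether err is the "file already closed" error returned
// by a read on the pipe after cmd.Wait() has closed its read end.
// Illustrative only; checks the *os.PathError before the bare sentinel.
func isClosedPipe(err error) bool {
	if perr, ok := err.(*os.PathError); ok {
		return perr.Err == os.ErrClosed
	}
	return err == os.ErrClosed
}
```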

abhinavdahiya added a commit to abhinavdahiya/terraform-provider-ignition that referenced this issue May 13, 2019
…een invocations

With errors like issue 116 [1], terraform restarts the provider server. And with terraform turning
on `AutoMTLS` for all plugins [2], the client cannot reattach to the already-running provider server [3].

The provider's memory-only cache then creates errors like `invalid file "49e2cc7ba023d87070d9b49a0c8f02ff0f63b7f61b390fbce8f64b011bd42ed7", unknown file id`
because the ignition_config resource tried to retrieve a file id stored in the cache before the provider server was restarted.

This stores the objects in a `diskv` store in the user's cache directory, retrieving them when the memory cache is not populated.
The in-memory cache keeps providing no-overhead access to the objects.

[1]: hashicorp/go-plugin#116
[2]: hashicorp/terraform#19560
[3]: https://github.com/hashicorp/go-plugin/blob/5692942914bbdbc03558fde936b1f0bc2af365be/client.go#L203-L204
skriss (Contributor, Author) commented May 20, 2019

@abhinavdahiya are you seeing this error as well?

skriss (Contributor, Author) commented May 20, 2019

@jbardin it looks like if you add a time.Sleep(time.Second) just above https://github.com/hashicorp/go-plugin/blob/master/client.go#L946, you should be able to reproduce this pretty easily. There's nothing preventing https://github.com/hashicorp/go-plugin/blob/master/client.go#L606 from executing before logStderr has finished reading everything and gotten an io.EOF; if .Wait() gets called before that happens, the read half of the stderr pipe gets closed by Wait(), and the next read in logStderr gets the "file already closed" error.

I implemented a super-simple fix in the following branch that seems to prevent the issue from happening (and hopefully clearly demonstrates the race): https://github.com/hashicorp/go-plugin/compare/master...skriss:fix-stderr-race?expand=1. Happy to submit it as a PR if you like it as-is; otherwise I can change the approach as needed.
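
For reference, here is a minimal standalone sketch of that ordering (illustrative names, not the code from the branch above; assumes a Unix-like shell): cmd.Wait() is only called after the stderr-reading goroutine has drained the pipe and returned, so Wait() can never close the pipe out from under the logger.

```go
package main

import (
	"bufio"
	"log"
	"os/exec"
	"sync"
)

func main() {
	cmd := exec.Command("sh", "-c", "echo plugin log line >&2")
	stderr, err := cmd.StderrPipe()
	if err != nil {
		log.Fatal(err)
	}
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}

	var logDone sync.WaitGroup
	logDone.Add(1) // registered right where the goroutine is dispatched
	go func() {
		defer logDone.Done()
		sc := bufio.NewScanner(stderr)
		for sc.Scan() {
			log.Printf("plugin stderr: %s", sc.Text())
		}
		// Scan returns false once the child exits and the pipe hits EOF.
	}()

	// Only call Wait() (which closes the read end of the pipe) after the
	// logger has drained stderr and returned.
	logDone.Wait()
	if err := cmd.Wait(); err != nil {
		log.Fatal(err)
	}
}
```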

jbardin (Member) commented May 20, 2019

Oh yes, thanks for tracking that down! The logger might not have entered the Read before Wait, and could be executing in any part of that loop when the process exits. While the error here isn't critical, it would be best to have the termination steps properly ordered for correctness.

I think the solution here seems reasonable. I would just suggest that we re-organize the code a bit so that the .Add(1) call is local to where the goroutine is dispatched. It doesn't look like that would be too hard in this case.

skriss (Contributor, Author) commented May 20, 2019

I would just suggest that we re-organize the code a bit so that the .Add(1) call is local to where the goroutine is dispatched. It doesn't look like that would be too hard in this case.

This makes sense -- the only issue I see here is that we want to call .Add(1) before starting the goroutine that calls cmd.Wait(), whereas currently go c.logStderr(cmdStderr) happens after. Is there any issue with moving the block that runs go c.logStderr(cmdStderr) to just above the goroutine that calls cmd.Wait()? It looks like it should be OK.

jbardin (Member) commented May 21, 2019

@skriss: yes, that's the minor reorganization I had in mind. The order the goroutines are declared shouldn't matter, as long as the waitgroup is set up correctly and we're not changing anything in the closure.

nikhilsaraf added a commit to nikhilsaraf/kelp that referenced this issue Feb 28, 2020
nikhilsaraf added a commit to stellar-deprecated/kelp that referenced this issue Mar 2, 2020
…#386)

* 1 - attempt to fix stream closing issues in kelpos.Blocking()

* 2 - only close manually created pipes, not Std* pipes

* 3 - the actual fix, Wait() after Read()

see: hashicorp/go-plugin#116 (comment)

* 4 - remove IPC server and components since that is causing an explosion of pipes/files and is also unused

closes #385