
[WIP] log: Add asynchronous wrappers. #273

Closed
wants to merge 9 commits into from

Conversation

ChrisHines
Member

Various implementations as outlined in #246.

Please review and comment on new APIs.

// AsyncLogger is stopping, Log will return ErrAsyncLoggerStopping.
func (l *AsyncLogger) Log(keyvals ...interface{}) error {
	l.mu.Lock()
	defer l.mu.Unlock()
Member

This is interesting. At first I thought it was an oversight, but now I see it's to prevent a race. Still, it seems a shame to require a mutex op with every Log; my intuition is that there's a clever way to avoid it...

Member

Ah, no. If AsyncLogger's contract is that all successful Log invocations make it to the writer, then you do indeed need some explicit synchronization.
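
A minimal sketch of the synchronization in question, using hypothetical field names (mu, stopping, buf) rather than the PR's actual code: holding the same mutex in Log and Stop means a Log call that returns nil has already placed its event in the buffer before Stop can close it.

import (
	"errors"
	"sync"
)

var (
	ErrAsyncLoggerStopping = errors.New("async logger: stopping")
	ErrAsyncLoggerFull     = errors.New("async logger: buffer full") // hypothetical overflow error
)

// AsyncLogger is a stand-in for illustration; the buffer is drained by a
// background goroutine (omitted here) that writes to the wrapped logger.
type AsyncLogger struct {
	mu       sync.Mutex
	stopping bool
	buf      chan []interface{}
}

// Log holds mu across the stopping check and the send, so Stop cannot close
// buf in between; a nil return guarantees the event reached the buffer.
func (l *AsyncLogger) Log(keyvals ...interface{}) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.stopping {
		return ErrAsyncLoggerStopping
	}
	select {
	case l.buf <- keyvals:
		return nil
	default:
		return ErrAsyncLoggerFull
	}
}

// Stop flips the flag and closes the buffer under the same mutex; this is
// the explicit synchronization referred to above.
func (l *AsyncLogger) Stop() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if !l.stopping {
		l.stopping = true
		close(l.buf)
	}
}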

@peterbourgon
Member

Very happy with this. My only quasi-suggestion would be to place some kind of pithy comparison of the 3 new types somewhere, for easy review by new readers. Perhaps in the package-level documentation?

As an aside, it's a unique pleasure to read your code. The comments guide me precisely through the logic, and the code feels light to the touch; as much as any code I read these days, like a quiet conversation between programmer and mechanical colleague. Thanks for that.

@ChrisHines
Member Author

... to place some kind of pithy comparison of the 3 new types somewhere, for easy review by new readers.

Indeed. I started on that but wanted to wait until we decided what to keep. That is one of the reasons this PR is marked WIP.

The other reason I consider this still a WIP is that I'd like to take some time to consider the contract of AsyncLogger very carefully.

Although AsyncLogger is rather complicated, it is valuable if you don't want latencies in the logging pipeline to impact clients. The cost we pay is the difficulty of signaling the application about errors encountered by the wrapped logger. In addition, we introduce a new failure mode: buffer overflow.

Error Handling

I am pretty happy with the decision that AsyncLogger treats errors from the wrapped logger as fatal. I think this approach is far better than ignoring the errors as done by log15.BufferedHandler. An application retains the option of introducing a more sophisticated error handling strategy in the wrapped logger so that non-fatal errors do not reach AsyncLogger.
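
As a sketch of that wrapping strategy, assuming the package's Logger interface (retryableLogger and isTransient are hypothetical names, not part of this PR), an application could filter errors before they ever reach AsyncLogger:

// retryableLogger swallows errors the application considers transient, so
// only errors it treats as fatal propagate up and stop the AsyncLogger.
type retryableLogger struct {
	next Logger // the real sink, e.g. a logfmt logger over a file
}

func (r retryableLogger) Log(keyvals ...interface{}) error {
	err := r.next.Log(keyvals...)
	if err != nil && isTransient(err) {
		return nil // handled here; AsyncLogger never sees it
	}
	return err
}

// isTransient is a placeholder for an application-specific classification.
func isTransient(err error) bool { return false }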

AsyncLogger will stop due to either an explicit call to Stop or because of an error returned from the wrapped logger. Currently Log returns ErrAsyncLoggerStopping in either situation.

Should Log return the error from the wrapped logger if it is not nil?

Buffer Overflow

Regarding the buffer overflow semantics, I think there are three choices (sketched below): return an error as this PR does, provide back pressure by blocking, or block for a maximum duration and then time out with an error.

What do you think of the alternatives?
Should this behavior be configurable?
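
For concreteness, the three choices might look roughly like this over a buffered channel; ErrBufferFull and the function names are illustrative, not this PR's API.

import (
	"errors"
	"time"
)

// ErrBufferFull is a hypothetical sentinel for the overflow case.
var ErrBufferFull = errors.New("async logger: buffer full")

// errorMode fails immediately when the buffer is full (what this PR does today).
func errorMode(buf chan<- []interface{}, keyvals []interface{}) error {
	select {
	case buf <- keyvals:
		return nil
	default:
		return ErrBufferFull
	}
}

// backPressureMode blocks the caller until the consumer drains an event.
func backPressureMode(buf chan<- []interface{}, keyvals []interface{}) error {
	buf <- keyvals
	return nil
}

// dualMode blocks for at most maxWait, then times out with an error.
func dualMode(buf chan<- []interface{}, keyvals []interface{}, maxWait time.Duration) error {
	select {
	case buf <- keyvals:
		return nil
	case <-time.After(maxWait):
		return ErrBufferFull
	}
}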

@peterbourgon
Member

peterbourgon commented May 28, 2016

At first I struggled to understand the use cases for which AsyncLogger is intended. My intuition is that the strict error semantics and emphasis on throughput mean it is aimed at applications where logging is critical, e.g. event sourcing rather than application logging. Is that right? If so, it might be worth a little usage text.

In that light the current API design, especially making the error available via Err, makes a lot of sense. It apes bufio.Scanner. So,

Should Log return the error from the wrapped logger if it is not nil?

Seems to me the current design is better, as it gives you more information about the state of the logger.
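
To make the bufio.Scanner analogy concrete, a hypothetical caller might look like this, assuming Log only reports that the logger has stopped and Err() error exposes the underlying cause:

// sendAll treats a Log failure as "the logger stopped" and consults Err for
// the wrapped logger's error, much like checking bufio.Scanner.Err after
// Scan returns false.
func sendAll(l *AsyncLogger, events [][]interface{}) error {
	for _, keyvals := range events {
		if err := l.Log(keyvals...); err != nil {
			if cause := l.Err(); cause != nil {
				return cause // the wrapped logger failed
			}
			return err // stopped deliberately, e.g. via Stop
		}
	}
	return nil
}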

What do you think of the alternatives? [backpressure-mode, dual-mode]
Should this behavior be configurable?

Regarding backpressure-mode, I think it is a good idea and (without more context about likely usage) probably a better default than erroring out. But it seems like it would require a separate implementation. That is, it would be strange to have the Err method semantics on an instance that would never return an error; just looking at the type wouldn't be enough to tell you whether you had to write different logic in your calling code to manage errors, you'd have to look at the constructor [options]. So I guess it would be better to be a different constructor + logger type altogether? Is that also your thinking?

I don't see huge value in the dual-mode implementation. My intuition is that, practically, it would just delay inevitable problems and reduce to the error-mode implementation. That is, I have trouble imagining a class of logger errors that are transient enough to be solved in a suitably short timeout window. So if clients use it, they will still have to write their code to manage errors anyway; might as well use the more predictable version.

@ChrisHines
Member Author

@peterbourgon Thanks for the feedback.

My intuition is that the strict error semantics and emphasis on throughput mean it is aimed at applications where logging is critical, e.g. event sourcing rather than application logging.

That is partially right. I would say the emphasis is on low latency of the Log method rather than throughput. I think the hard-failure semantics for buffer overflow and the heavyweight synchronization make it a poor fit for high-throughput use cases.

I have used something like AsyncLogger for critical application logs when logs were of moderate volume and delays caused by I/O latency were painful. The application was the node agent for a batch processing cluster orchestration system, so it often had noisy neighbors that could be saturating local system I/O and we didn't want the app to block while writing logs. At the same time we wanted logs flushed to disk ASAP because they were sometimes the best source of information to understand what led to a particular issue on a node. We needed strict error and overflow semantics because we preferred that the application abort when it encountered unanticipated errors (a.k.a. crash-only software).

Regarding backpressure-mode, I think it is a good idea and (without more context about likely usage) probably a better default than erroring out. But it seems like it would require a separate implementation.

I agree, but I'll take it a step further. I think it must be a separate implementation. I made an attempt to add it as an option to the existing implementation yesterday and could not figure out how to easily support back pressure along with the Stop method and the ability to drain the buffer without introducing possible deadlocks.

I don't see huge value in the dual-mode implementation. My intuition is that, practically, it would just delay inevitable problems, and reduce to the error-mode implementation.

I agree with you here as well. I was already leaning toward that assessment, so thanks for confirming.

At this point I think the current semantics of AsyncLogger are probably too specialized for inclusion in package log. Do you think there is a variant that would be more widely useful to Go kit users, or should we just drop it altogether?

@basvanbeek
Member

It might be too specialized for inclusion in package log but it would also be a waste not to have it available to potential users.

Maybe we should have a place where we can store or refer to "plugins" and "extensions" which have a smaller use case / audience.

@ChrisHines
Member Author

Maybe we should have a place where we can store or refer to "plugins" and "extensions" which have a smaller use case / audience.

Maybe, but log15 has an ext package along those lines and the choice of what went into that package and the main package just feels arbitrary, IMO. I would rather only create packages that have a well defined focus, or serve to isolate external dependencies.

So, my vote is that if we add an AsyncLogger it should be useful enough that we feel comfortable putting it in package log next to SyncLogger.

@ChrisHines
Member Author

ChrisHines commented May 31, 2016

I have renamed the original AsyncLogger to NonblockingLogger which I think clarifies its purpose quite a bit. I have also added a new simpler AsyncLogger implementation that blocks when the buffer is full and ignores errors from the wrapped logger.

The next step is to consider changing NonblockingLogger to have the same error handling semantics as the new AsyncLogger with the hope that it simplifies the code.
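
A minimal sketch of that shape, assuming the package's Logger interface (names and the omission of Stop are simplifications, not the PR's exact API):

// asyncLogger buffers events on a channel and writes them from a single
// background goroutine.
type asyncLogger struct {
	logger Logger
	buf    chan []interface{}
}

func newAsyncLogger(logger Logger, size int) *asyncLogger {
	l := &asyncLogger{logger: logger, buf: make(chan []interface{}, size)}
	go l.run()
	return l
}

// Log never fails; it simply blocks when the buffer is full (back pressure).
func (l *asyncLogger) Log(keyvals ...interface{}) error {
	l.buf <- keyvals
	return nil
}

// run drains the buffer; errors from the wrapped logger are ignored.
func (l *asyncLogger) run() {
	for keyvals := range l.buf {
		_ = l.logger.Log(keyvals...)
	}
}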

@ChrisHines
Member Author

I think it makes sense to force the wrapped logger to handle errors for both AsyncLogger and NonblockingLogger so I've removed the error trapping logic from the latter.

The remaining question is whether the ability to stop these loggers and wait for the buffer to drain is worthwhile. It has been important to me in the past to make a best effort that all log events were saved during a graceful shutdown. Note: A graceful shutdown may occur due to an unrecoverable error.

@peterbourgon @basvanbeek PTAL.

	l.logger.Store(loggerStruct{logger})
}

// SyncWriter synchronizes concurrent writes to an io.Writer.
Member

In what circumstances would you want to

logger := log.NewLogfmtLogger(log.NewSyncWriter(w))

instead of

logger := log.NewSyncLogger(log.NewLogfmtLogger(w))

? Could a little bit of guidance make its way into the comments on the types?

Member Author

I plan to add docs like that before removing the WIP tag. Generally speaking you want to do the least amount of work you can within a mutex while making sure that each log event is handled completely while the lock is held. So the first form is preferred in this case because NewLogfmtLogger makes a single call to Write for each log event.

The second form is needed for loggers that perform multiple writes per log event. NewJSONLogger currently does that, and external implementations of Logger could as well.

So, crafting the above into some nice docs is still on the todo list for this PR.
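
For illustration, the distinction might be sketched like this (not the PR's exact code): SyncWriter only needs to serialize individual Write calls, which suffices when each log event arrives as exactly one Write.

import (
	"io"
	"sync"
)

// syncWriter serializes calls to Write on the underlying io.Writer.
type syncWriter struct {
	mu sync.Mutex
	w  io.Writer
}

func (s *syncWriter) Write(p []byte) (n int, err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.w.Write(p)
}

A logger that performs several Writes per event could still interleave output from concurrent goroutines under a synchronized writer alone, which is why such loggers need SyncLogger to hold a lock around the whole Log call.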

Member

Gotcha.

@ChrisHines
Member Author

Status update: #327 extracted SyncLogger and SyncWriter into a separate contribution. This PR remains as a place to discuss the merits and design of AsyncLogger and NonblockingLogger until we are happy with them or decide to abandon them.

@ChrisHines changed the title from "[WIP] log: Add concurrent safe wrappers." to "[WIP] log: Add asynchronous wrappers." on Jul 25, 2016
@peterbourgon
Member

So I'm walking through the outstanding PRs with an eye to either merging or abandoning, and I'm curious what the state of this one could be. I'm completely happy with the two implementations here, modulo final tweaks and usage documentation, but I'm also alright with abandoning them. WDYT?

@ChrisHines
Member Author

I think the SyncLogger and SyncWriter we've already added to the package are sufficient and the additional channel based approaches here are not worth the trouble in this package. My vote is to abandon them.

@peterbourgon
Member

Great. Thanks for the [historical] work!
