This repository has been archived by the owner on Dec 12, 2024. It is now read-only.

Can Logback be used for logging instead of JUL? #217

Closed
dansomething opened this issue Dec 5, 2017 · 14 comments

@dansomething

This relates to issue #68

Should it be possible to substitute JUL with Logback for logging by using google-cloud-logging-logback?
https://github.com/GoogleCloudPlatform/google-cloud-java/tree/master/google-cloud-contrib/google-cloud-logging-logback

As far as I can tell, the traceId of the request is always set to the TraceLoggingEnhancer in Jetty. It looks like the Logback LoggingHandler also uses the TraceLoggingEnhancer via the MonitoredResourceUtil. Therefore, I'd expect any log entries sent through the Logback LoggingHandler to be decorated with the traceId label. In my testing that doesn't appear to be the case. I can see all of the code executing as expected in the Stackdriver debugger, but the traceId always ends up being null in the TraceLoggingEnhancer when the LoggingHandler logs the entry.

Perhaps I'm missing something or doing something wrong?

Here's a simple Spring Boot project that demonstrates the issue. It correctly creates log entries in Stackdriver, but they never include the traceId of the request.
https://github.com/powerpro/stackdriver-logger

@meltsufin
Member

I suspect that the gcp module is somehow not enabled. @gregw Can you take a look please? I think this should work.

@dansomething
Author

I was able to get it working in a branch where I take the x-cloud-trace-context header from the request and set it on the TraceLoggingEnhancer myself, but it's not ideal.

https://github.com/powerpro/stackdriver-logger/blob/traceid-handler-interceptor/src/main/java/com/powerpro/stackdriverlogger/TraceIdLoggingHandlerInterceptor.java
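The linked interceptor essentially boils down to parsing the header and handing the trace ID to `TraceLoggingEnhancer.setCurrentTraceId`. A minimal sketch of just the parsing step, assuming the documented `TRACE_ID/SPAN_ID;o=TRACE_TRUE` header format (class and method names here are hypothetical, not taken from the linked branch):

```java
// Hypothetical helper: extract the trace ID portion of an
// x-cloud-trace-context header value ("TRACE_ID/SPAN_ID;o=TRACE_TRUE").
// Only the TRACE_ID part is needed for log correlation.
class TraceIdParser {

    /** Returns the trace ID portion of the header value, or null if absent. */
    static String extractTraceId(String header) {
        if (header == null || header.isEmpty()) {
            return null;
        }
        int slash = header.indexOf('/');
        return slash >= 0 ? header.substring(0, slash) : header;
    }
}
```

In a Spring `HandlerInterceptor`, the `preHandle` method would then call something like `TraceLoggingEnhancer.setCurrentTraceId(TraceIdParser.extractTraceId(request.getHeader("x-cloud-trace-context")))` before the request is dispatched.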

@gregw gregw self-assigned this Dec 6, 2017
@gregw
Contributor

gregw commented Dec 11, 2017 via email

@gregw
Contributor

gregw commented Dec 11, 2017

@dansomething, I believe the issue is that you have set up two logging systems: one is provided by the Flex image and is on the container classpath, essentially hidden from the application except via the JUL logging API. You have then installed a Logback logger within the web application, but that duplicates all the Google Cloud mechanisms, and the container-managed interceptors for traceId are hidden from it.

Thus the traceId is being set by the container, but only on its own logging mechanism. Your handler is also taking the traceId and setting it on your logging mechanism.

It may be OK to run like this.

But if you want only one logging system, then the container's logger needs to be replaced with the Logback jars; any container components that use JUL have to have handlers added to send JUL logs to Logback; and finally the context's classloader has to be configured to expose the Logback APIs to the application and to prevent the application from overriding it with its own implementation. The application should then remove Logback from WEB-INF/lib.
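The middle step above (adding handlers so that JUL logs reach the other system) can be illustrated with JDK classes alone. This is a dependency-free sketch of the forwarding mechanism, not the actual setup; in a real deployment the sink would be Logback via something like jul-to-slf4j's `SLF4JBridgeHandler`, whereas here it is just a list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch: replace the JUL root logger's handlers with one that forwards
// every LogRecord to another sink (here a list, standing in for Logback).
class JulForwarding {
    static final List<String> forwarded = new ArrayList<>();

    static void installForwarder() {
        Logger root = Logger.getLogger("");
        // Drop the default console handler(s) installed by the container/JVM.
        for (Handler h : root.getHandlers()) {
            root.removeHandler(h);
        }
        // Forward everything that reaches the root logger.
        root.addHandler(new Handler() {
            @Override public void publish(LogRecord r) {
                forwarded.add(r.getLevel() + " " + r.getMessage());
            }
            @Override public void flush() {}
            @Override public void close() {}
        });
    }
}
```

After `installForwarder()` runs, any `Logger.getLogger("some.component").info(...)` call propagates up to the root logger and lands in the forwarding handler instead of the console.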

@dansomething
Author

Ok, thanks for all the info! I guess for now I'll manage with the two loggers until we have time to set up our own container. Since that answers all of my questions, I'll close this out.

As a side note, is it possible the Logback feature could eventually be added to this project?

@gregw
Contributor

gregw commented Dec 12, 2017

@dansomething There is no technical hurdle preventing the use of Logback rather than JUL. However, the benefit of using JUL is that the image does not need to decide whether the logging is just for the container or for the webapp as well, because the JUL API is part of the JVM and exposed to both.

If Logback were used, then it would have to be decided whether that logging is just for the container or for both container and webapp. Sharing with the webapp raises the possibility of version clashes with webapps that provide their own logging mechanism (as yours does now). It is very difficult to find a one-size-fits-all solution, so JUL is the simpler default. Note that it is not very difficult to change the configuration of the server to almost any logging setup and/or exposure, so a simple default with flexible configuration possibilities is probably best.

@meltsufin
Member

@gregw Would it make sense in this situation to use a Logback over SLF4J bridge (log4j-over-slf4j) and an SLF4J to JUL (slf4j-jdk14) binding to go from Logback to the JUL-based logging system configured in the runtime?

@gregw
Contributor

gregw commented Dec 13, 2017

@meltsufin it depends entirely on why they are using Logback. If they are just using it because they have code instrumented with Logback, then yes, providing bridges (probably via SLF4J) would allow those logs to be routed to the JUL log and thus to the container-configured log.

But if they wish to use Logback because they want the Logback log handlers/configuration, then bridges will obviously not help.

If they wanted both, bizarrely enough, it might even be a reasonable deployment for the application to bridge logback through slf4j to JUL, then for the container to bridge JUL back via another instance of slf4j to logback. That way JUL can be used to decouple the versions of the logging mechanisms used by application and container.
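On the webapp side, the first half of this arrangement (routing the application's SLF4J calls to JUL rather than bundling Logback) might look like the following dependency fragment. This is a hypothetical sketch, not a configuration from the project under discussion; the version is a placeholder:

```xml
<!-- Route the webapp's SLF4J calls to java.util.logging.
     slf4j-jdk14 must be the ONLY SLF4J binding on the webapp classpath,
     so ch.qos.logback:logback-classic is removed from WEB-INF/lib. -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-jdk14</artifactId>
  <version>1.7.25</version>
</dependency>
```

The container side would then independently bridge JUL back out to its own Logback instance, keeping the two logging stacks version-decoupled as described above.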

@dansomething
Author

dansomething commented Dec 14, 2017

Some background on our situation might help here. We don't really want both loggers (and we are using SLF4J), but we've had issues with the JUL log handler that we can't explain.

We want our request logs to correlate with application logs in Stackdriver using the traceId. Unfortunately with that setup, we have issues where some HTTP requests inexplicably block until Jetty times them out after 3600s. During those requests, when there are two log statements in a row, we only see the first one in Stackdriver. As soon as we disable that appender and go back to the default JUL configuration in the container we no longer have any issues.

But that leaves us in a lousy situation with our logs. Not having the traceId makes them almost useless. I have opened a "Google Enterprise Support" case. We've had little luck there so far.

That led us down the road of trying other Stackdriver appenders, which led us here. At this point we have proven that using the Logback appender does not reproduce the same issue in our application. That solves our problem of getting application logs into Stackdriver correctly, but sadly leaves us with a separate logger from the container.

@gregw
Contributor

gregw commented Dec 14, 2017

@dansomething So that suggests that there is a bug in the gcloud JUL appender, and that if that were fixed you'd be happy to continue using JUL? We should spend some effort to see what is causing the blocking.

Your workaround is probably OK for now, but if we need to apply it long term, then it may be worthwhile to prepare an image that uses Stackdriver for the container and exposes that to the application. Note that this will tie your application to that image, as it will have to assume a logging API is provided.

@meltsufin
Member

+1 to what @gregw said. It sounds like there could be a problem in the Google Cloud Logging client library itself. If you have a way to reproduce the issue, or some container logs, it would be worthwhile to investigate further and fix the root cause, not only for you but also for other developers who might be facing the same issue.

@dansomething
Author

Yes, it does appear to be a bug in the JUL appender, but we've yet to be able to prove it beyond doubt. The only way we've reproduced it reliably is in our production environment under load, which makes it difficult to track down. And we can't run our production app like that for long, for obvious reasons. We'd be fine continuing with JUL if it weren't for this problem.

I'm reaching out on the GCP enterprise support ticket to see if they can provide any information from the snapshot of the incident that was captured. I can provide the case number if that helps.

@meltsufin
Member

Thanks for the clarification. I think it's best to go through the standard channel of GCP support in this instance because they may need to dig into the server logs, etc.

@dansomething
Author

There is a related ticket open now for the com.google.cloud.logging.LoggingHandler issue.

googleapis/google-cloud-java#2796
