-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add awslogs multiline support #30891
Conversation
221fb60
to
ba6fa44
Compare
7b96e58
to
a525b3b
Compare
d02b268
to
d30da02
Compare
If we were to take this functionality, I'd much prefer to implement this in a way that works on any log driver as there is nothing driver specific about the implementation. I also find the date/time format vs "specify any regex" kind of weird from a UX point of view. Implementation should take into account a misbehaving application that either sends really long messages (too big to hold in a single message) or that just does not send the message delimiter. I also expect that the regex matching will impose a significant performance cost for logging. Some benchmarks here would be helpful, and notes in the documentation on this. |
@cpuguy83 - I would tend to agree with a more generic approach, however as per #22920 this proposal was rejected and advice was for individual logging drivers to implement this functionality themselves. I would suggest there is a reasonably strong argument for this in the AWS CloudWatch logs driver, given the native agent supports multiline processing and the date time format processing I have implemented. The date/time format reflects the native CloudWatch Logs agent implementation (see The implementation leverages the existing implementation in terms of dealing with log messages that are too long (CloudWatch logs have published limits and the existing implementation adheres to these limitations), and specifically detects if a buffered message has exceeded these limits and hands off processing to the existing implementation. The implementation also will only buffer for the batch publishing frequency (currently 5 seconds as per the existing implementation), so I think all questions around unusual log event processing are covered. WRT benchmarking, agree there may be a performance hit and will benchmark to quantify this. Ultimately this is an opt-in feature and has been designed such that the same existing processing logic is applied if the multiline feature has not been activated, but will be useful to provide guidance on when this feature may introduce performance issues. |
👏 thanks for doing this! It's a huge addition for anyone running Java applications in Docker with Amazon CloudWatch Logs. I look forward to seeing it upstream. |
d30da02
to
2dd776f
Compare
@cpuguy83 - I have added some basic benchmarks, they just simulate pushing 10 multiline logs with 100 lines per multiline log, with a reasonably real-world multiline pattern of Master branch benchmark (see https://github.com/mixja/docker/tree/awslogs-benchmark):
Pull request benchmark:
So yes as would be expected there is a performance hit with multiline pattern matching, but no performance hit with the new code base if you don't enable multiline pattern matching. |
ping @cpuguy83 @thaJeztah @tonistiigi @icecrime @tiborvass what is the status for this one ? |
+1 for the design. I agree with phemmer in his comment about #22920 (comment) so I wouldn't want this to be generalized in docker, but in the awslogs driver it makes sense. |
Design OK, moving to code review. |
@mixja compilation errors
|
@mixja thanks, fixing. |
a91b345
to
539c4ec
Compare
@vdemeester - can we re-run the windows tests - looks like an unrelated failure in the Jenkins job output:
|
I just tested it out, and it works well, here are my test files incase anyone else wants to try it out.
I created a |
@crosbymichael - yes I have tested both. Attached is a screen shot of CloudWatch Logs console showing Java stack traces with the multiline support enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐯
/cc @cpuguy83 @tiborvass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// batch-calculations. | ||
func (l *logStream) processEvent(events []wrappedEvent, unprocessedLine []byte, timestamp int64) []wrappedEvent { | ||
bytes := 0 | ||
for len(unprocessedLine) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit on the style here, we could flip this and return early to reduce indentation.
} | ||
line := unprocessedLine[:lineBytes] | ||
unprocessedLine = unprocessedLine[lineBytes:] | ||
if (len(events) >= maximumLogEventsPerPut) || (bytes+lineBytes+perEventBytes > maximumBytesPerPut) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parens are not needed here
Thank you @mixja! Given that this adds new features to the AWS logging driver, can you also open a pull request in the documentation repository (probably the |
Sorry for jumping in, but when can we expect this to be released? (we've been looking forward to this feature for a long time!) |
See the milestone; it's gonna be part of the upcoming 17.06 release, targeted start of June (and release candidate 1 later this week for testing) |
@thaJeztah - done docker/docs#3319 |
@mixja thanks! |
Thanks @thaJeztah :) |
This adds bash completion for moby/moby#30891. Signed-off-by: Harald Albers <github@albersweb.de>
This adds bash completion for moby/moby#30891. Signed-off-by: Harald Albers <github@albersweb.de> (cherry picked from commit 1d21a3dd7c4ffff80c093c005b910a131e7fc05a) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds bash completion for moby/moby#30891. Signed-off-by: Harald Albers <github@albersweb.de>
This adds bash completion for moby/moby#30891. Signed-off-by: Harald Albers <github@albersweb.de>
This adds bash completion for moby/moby#30891. Signed-off-by: Harald Albers <github@albersweb.de> (cherry picked from commit 1d21a3dd7c4ffff80c093c005b910a131e7fc05a) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds bash completion for moby/moby#30891. Signed-off-by: Harald Albers <github@albersweb.de> (cherry picked from commit 1d21a3d) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Justin Menga justin.menga@gmail.com
- What I did
This PR adds multiline processing to the AWS CloudWatch logs driver, providing functionality similar to the native AWS CloudWatch logs agent.
The feature allows users to optionally specify multiline pattern matching logic using either a regular expression or a Python strftime expression, meaning application events that span multiple lines (e.g. application stack traces) can be published to CloudWatch logs as a single event.
The
awslogs-multiline-pattern
andawslogs-datetime-format
log options have been added:awslogs-datetime-format
- defines a multiline start pattern in Python strftime format. This option always takes precedence if bothawslogs-datetime-format
andawslogs-multiline-pattern
are configured.awslogs-multiline-pattern
- defines a multiline start pattern using a regular expression. This option is ignored ifawslogs-datetime-format
is also configured.- How I did it
Implemented an event buffer that buffers log messages until a new multiline start pattern is matched. As soon as a new event is detected, the event buffer is appended/flushed to the pre-existing events slice for normal processing using the pre-existing batch publishing mechanism.
The implementation will immediately flush the event buffer if the maximum CloudWatch logs event size is reached, appending the buffer to the events slice up to the maximum event size.
The implementation also will only buffer up to the batch publishing frequency (currently 5 seconds). If a multiline event has been buffered for longer, it will be flushed to the events slice for normal processing at the next batch processing ticker. This ensures a multiline message will not be buffered for a long period of time in the scenario where no further messages are sent by the application for an extended period of time (the maximum amount of time a message can be buffered is 2 * batch publishing frequency or 10 seconds)
- How to verify it
Assuming you have AWS access key ID and secret access key with permissions to a pre-existing CloudWatch logs group called
test
, in the Docker development container first start up the Docker daemon with AWS permissions:Next run the examples below...
Example using
awslogs-multline-pattern
docker run -it --rm --log-driver=awslogs --log-opt awslogs-group=test --log-opt awslogs-multiline-pattern='^ABCD' my-container
With the following container stdout:
Two CloudWatch log events will be generated:
Event 1
Event 2
Example using
awslogs-datetime-format
docker run -it --rm --log-driver=awslogs --log-opt awslogs-group=test --log-opt awslogs-datetime-format='%Y-%m-%d' my-container
With the following container stdout:
Three CloudWatch log events will be generated:
Event 1
Event 2
Event 3
- Description for the changelog
Add AWS CloudWatch Logs multiline processing
- A picture of a cute animal (not mandatory but encouraged)