-
Notifications
You must be signed in to change notification settings - Fork 43
Add possibility to force flush logs after certain period of time #216
Add possibility to force flush logs after certain period of time #216
Conversation
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Co-authored-by: Patryk Małek <pmalek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Codecov Report
@@ Coverage Diff @@
## main #216 +/- ##
=======================================
+ Coverage 75.6% 75.8% +0.1%
=======================================
Files 95 95
Lines 4408 4464 +56
=======================================
+ Hits 3334 3384 +50
- Misses 747 751 +4
- Partials 327 329 +2
|
After this change there is an issue only with double following force flushes IMO if this occurs, there is issue with regexes configuration, because if two following flushes from application to file doesn't match the regex, something seems to be invalid Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
@@ -156,7 +157,7 @@ func (c InputConfig) Build(context operator.BuildContext) ([]operator.Operator, | |||
InputOperator: inputOperator, | |||
Include: c.Include, | |||
Exclude: c.Exclude, | |||
SplitFunc: splitFunc, | |||
Multiline: c.Multiline, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this is just a naming issue, but I'll have a difficult time accepting that the splitfunc is being delegated to a Multiline
struct. Put another way, we need a split func regardless of whether or not we are using multiline splitting.
I suspect that we can accomplish the same functionality as you are implementing, but by composing things in a better way.
Would it make sense to create a new struct which would pair the concept of a split func with the concept of a flush? Maybe something like:
type Splitter struct {
splitFunc // can be simple like '\n' or multiline config
flusher // forces a flush based on timing
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at this further, I'm not convinced we even need to do what I've suggested here. Please see my related comment on reader.go
. I'll leave this comment in place as a possible alternative, but I think we need to start by justifying the need to alter splitFunc behavior in any way.
docs/operators/file_input.md
Outdated
set `force_flush_period` option to [duration string](https://golang.org/pkg/time/#ParseDuration), | ||
eg: `5s`, `1m`. It's by default `0s` which means, that no force flushing will be performed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, I think we should use helper.Duration
instead of time.Duration
. We can then link to https://github.com/open-telemetry/opentelemetry-log-collection/blob/main/docs/types/duration.md
, which explains formatting sufficiently.
This is basically the same thing, but with a default unit of seconds, since nanoseconds is basically never reasonable in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated documentation
@@ -116,7 +120,7 @@ func (r *Reader) ReadToEnd(ctx context.Context) { | |||
return | |||
} | |||
|
|||
scanner := NewPositionalScanner(r, r.fileInput.MaxLogSize, r.Offset, r.fileInput.SplitFunc) | |||
scanner := NewPositionalScanner(r, r.fileInput.MaxLogSize, r.Offset, r.multiline.SplitFunc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is the only place we are using the split func, so I'm not sure why it needs to be part of the multiline config.
We have two concepts here:
- How to split lines
- When to force a flush
I'm not convinced these are dependent on each other, or that either is necessarily a concern of multiline behavior. (Of course a split func may implement multiline splitting, but it does not always.)
Can we leave splitFunc as it was and just add a new config that controls flushing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some relations I was unsure how to resolve, so I put both functionality into one struct:
- splitFunc needs to be aware when to force a flush (so new structure is created and pass to the splitFunc)
- every file should be flushed separately (so splitFunc is created for every unique reader)
Regarding line splitting. Of course it is not always a multiline behavior, but looking at it from user perspective it is intuitive to use multline
config for it, so Multiline
struct underneath, than mixing Splitter
struct with multiline
config, or using splitter
config.
I will exclude force a flush to separate config, as logic of it is mostly out of the splitFunc, but not sure how to resolve other dependencies in nice way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes in requested direction, but after all I think Splitter
struct seems to be reasonable consensus.
You can see current implementation without Splitter
. There are two dependent entities: splitFunc
and ForceFlush
and I feel that they should be connected together via additional struct. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think consolidating into one struct makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the latest code, please
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
…ltiline as Splitter eventually Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Thanks @sumo-drosiek, nice improvement! |
Fixes #215
This is work in progress PR. General idea is to keep information about when last log has been properly flushed and if it was long ago, buffered line should be flushed as we do not except more lines
I'm going to make minor refactor, add unit tests and more comments to code