aws_s3 source should log what it's ingesting #21128
Labels
source: aws_s3
Anything `aws_s3` source related
type: feature
A value-adding code addition that introduce new functionality.
Use Cases
When troubleshooting an aws_s3 source for missing data, it would help to have a log of which S3 buckets and keys were retrieved. For the aws_sqs source such logging would be excessive, but an aws_s3 source can read many, many events from a single S3 object, so logging each object retrieved is far less noisy.
Attempted Solutions
We tried using the metadata along with suppression to log in a remap, similar to:
However, rate limiting appears to be keyed on the call site rather than on the particular message being logged, so this message appears only once every 5 minutes no matter how many different buckets and keys were read.
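To illustrate the behaviour we think we are seeing, here is a toy model in Python. It is not Vector's implementation; `RateLimitedLogger`, `key_fn`, the call-site string, and the bucket and key names are all hypothetical. It only shows how the choice of rate-limit key changes which messages get through.

```python
import time


class RateLimitedLogger:
    """Toy rate limiter; key_fn decides which log calls count as 'the same'."""

    def __init__(self, window_secs, key_fn, clock=time.monotonic):
        self.window_secs = window_secs
        self.key_fn = key_fn
        self.clock = clock
        self.last_emitted = {}  # key -> time the key last produced output
        self.emitted = []       # messages that were not suppressed

    def log(self, call_site, message):
        key = self.key_fn(call_site, message)
        now = self.clock()
        last = self.last_emitted.get(key)
        if last is not None and now - last < self.window_secs:
            return False  # suppressed: same key seen inside the window
        self.last_emitted[key] = now
        self.emitted.append(message)
        return True


# Hypothetical bucket and keys, all logged from the same line of a remap.
messages = [f"read s3://example-bucket/file-{i}.log" for i in range(3)]

# Keyed on the call site only (what we appear to observe).
by_call_site = RateLimitedLogger(300, key_fn=lambda site, msg: site)
# Keyed on call site plus message (what we expected).
by_message = RateLimitedLogger(300, key_fn=lambda site, msg: (site, msg))

for msg in messages:
    by_call_site.log("remap.vrl:1", msg)
    by_message.log("remap.vrl:1", msg)

print(len(by_call_site.emitted))  # 1
print(len(by_message.emitted))    # 3
```

With the call-site key, only the first bucket and key is ever logged inside the 5 minute window, which matches what we see.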
Proposal
Add logging to the aws_s3 source that records which S3 URLs (bucket and key) are being ingested.
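As a rough sketch of the kind of output we have in mind, the Python below logs one line per object listed in an S3 event notification. It assumes the standard notification JSON layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys); the function names, logger name, and example bucket are hypothetical, and this is not how Vector itself is written.

```python
import json
import logging
from urllib.parse import unquote_plus

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("s3_ingest_sketch")  # hypothetical logger name


def s3_urls_from_notification(body):
    """Return an s3:// URL for every object named in a notification body."""
    payload = json.loads(body)
    urls = []
    for record in payload.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event notifications are URL-encoded ("+" for spaces).
        key = unquote_plus(record["s3"]["object"]["key"])
        urls.append(f"s3://{bucket}/{key}")
    return urls


def log_ingested_objects(body):
    """Log once per object, not once per event contained in the object."""
    urls = s3_urls_from_notification(body)
    for url in urls:
        logger.info("Ingesting S3 object: %s", url)
    return urls


# Hypothetical notification with a single object.
example_body = json.dumps({
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "logs/2024/app+log.gz"},
            }
        }
    ]
})

ingested = log_ingested_objects(example_body)
```

Logging at the object level keeps the volume proportional to the number of files read rather than the number of events they contain.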
References
No response
Version
vector 0.38.0 (x86_64-unknown-linux-gnu ea0ec6f 2024-05-07 14:34:39.794027186)