Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent logs intermittently missing around the time of log harvest #1156

Closed
JcolemanNR opened this issue Jun 16, 2022 · 0 comments · Fixed by #1201
Closed

Agent logs intermittently missing around the time of log harvest #1156

JcolemanNR opened this issue Jun 16, 2022 · 0 comments · Fixed by #1201
Assignees
Labels
bug Something isn't working

Comments

@JcolemanNR
Copy link
Contributor

JcolemanNR commented Jun 16, 2022

Description
We have observed several CI failures where 38 log events are expected, but only 37 are captured/forwarded. Initially we thought that this must be due to a timing issue with a log failing to be captured at the beginning of the test run, or a log failing to be sent up due to timing at the end of a test run. After investigating, the missing log is usually from the middle of the test run and logs before/after the missing log were forwarded without issue.

This indicates that there may be an intermittent timing issue with log events being lost (potentially related to harvest cycle).

Failures like this typically show up in the form of:

NewRelic.Agent.IntegrationTests.Logging.MetricsAndForwarding.log4net.Log4netMetricsDisabledAndForwardingEnabledTestsFW471Tests.Test [FAIL]
      Assert.Equal() Failure
      Expected: 38
      Actual:   37

Here is an excerpt of the ConsoleMF test app showing 5 log messages of the form AsyncNoAwaitInTransaction{Level}LogMessage being requested, and output to the console by log4net:

8:46:03 PM :EXECUTING: 'LoggingTester CreateSingleLogMessageInTransactionAsyncNoAwait AsyncNoAwaitInTransactionDebugLogMessage DEBUG'
8:46:03 PM >
8:46:03 PM :EXECUTING: 'LoggingTester CreateSingleLogMessageInTransactionAsyncNoAwait AsyncNoAwaitInTransactionInfoLogMessage INFO'
10653 [6] DEBUG MultiFunctionApplicationHelpers.NetStandardLibraries.LogInstrumentation.LoggingTester (null) - AsyncNoAwaitInTransactionDebugLogMessage
8:46:03 PM >
8:46:03 PM :EXECUTING: 'LoggingTester CreateSingleLogMessageInTransactionAsyncNoAwait AsyncNoAwaitInTransactionWarningLogMessage WARN'
10773 [6] INFO MultiFunctionApplicationHelpers.NetStandardLibraries.LogInstrumentation.LoggingTester (null) - AsyncNoAwaitInTransactionInfoLogMessage
8:46:03 PM >
8:46:03 PM :EXECUTING: 'LoggingTester CreateSingleLogMessageInTransactionAsyncNoAwait AsyncNoAwaitInTransactionErrorLogMessage ERROR'
10901 [12] WARN MultiFunctionApplicationHelpers.NetStandardLibraries.LogInstrumentation.LoggingTester (null) - AsyncNoAwaitInTransactionWarningLogMessage
8:46:03 PM >
8:46:03 PM :EXECUTING: 'LoggingTester CreateSingleLogMessageInTransactionAsyncNoAwait AsyncNoAwaitInTransactionFatalLogMessage FATAL'
11037 [8] ERROR MultiFunctionApplicationHelpers.NetStandardLibraries.LogInstrumentation.LoggingTester (null) - AsyncNoAwaitInTransactionErrorLogMessage
11177 [6] FATAL MultiFunctionApplicationHelpers.NetStandardLibraries.LogInstrumentation.LoggingTester (null) - AsyncNoAwaitInTransactionFatalLogMessage

Here is the log_event_data payloads sent to the collector:

2022-06-14 20:46:04,381 NewRelic  DEBUG: [pid: 4548, tid: 6] Invoking "log_event_data" with : [***"common":***"attributes":***"entity.name":"IntegrationTestAppName","entity.guid":"MjczMDcwfEFQTXxBUFBMSUNBVElPTnwyMTc5NTg","hostname":"fv-az171-169.ktbnycnmqcfudbavfw4edtkmzc.xx.internal.cloudapp.net"***,"logs":[***"timestamp":1655239562173,"message":"InTransactionInfoLogMessage","level":"INFO","span.id":"b9ab0c91756f874c","trace.id":"dd4d7f7603f86c0632154b3790e052c2"***,***"timestamp":1655239562957,"message":"AsyncInTransactionWarningLogMessage","level":"WARN","span.id":"87a72624fcdc4411","trace.id":"9d423c48108aa2f64bae4043a204466b"***,***"timestamp":1655239562420,"message":"InTransactionErrorLogMessage","level":"ERROR","span.id":"1eaaa9b73f578ccd","trace.id":"d8011f6dc42ace7fcff38c777a948583"***,***"timestamp":1655239561780,"message":"InTransactionDebugLogMessage","level":"DEBUG","span.id":"f43db5098336b6c7","trace.id":"5dce101e0cdee42b45d59fd071f84821"***,***"timestamp":1655239563090,"message":"AsyncInTransactionErrorLogMessage","level":"ERROR","span.id":"e9ae0590d4556ede","trace.id":"d871d33625cedcf17ae217ac6ea09e62"***,***"timestamp":1655239563215,"message":"AsyncInTransactionFatalLogMessage","level":"FATAL","span.id":"a29f1aa652407f72","trace.id":"64fa0fb8a86e296a2df63ee53c090441"***,***"timestamp":1655239562699,"message":"AsyncInTransactionDebugLogMessage","level":"DEBUG","span.id":"b9220ef0ba3bca8a","trace.id":"5d036a447a443795668363459ddebdc4"***,***"timestamp":1655239562292,"message":"InTransactionWarningLogMessage","level":"WARN","span.id":"67d017e1790d63f4","trace.id":"3430b4905a3554d9c89b3cab0a3a53e2"***,***"timestamp":1655239562829,"message":"AsyncInTransactionInfoLogMessage","level":"INFO","span.id":"ef71360edff26274","trace.id":"016b6859ca1d0f6a87c56af27f79176d"***,***"timestamp":1655239562554,"message":"InTransactionFatalLogMessage","level":"FATAL","span.id":"a7d21bd6ac67ccc2","trace.id":"4887b8ade164867378d3016421a017e9"***,***"timestamp":1655239561498,"message":"AsyncNoAwaitOutsideTransactionErrorLogMessage","level":"ERROR"***,***"timestamp":1655239560454,"message":"AsyncOutsideTransactionDebugLogMessage","level":"DEBUG"***,***"timestamp":1655239553755,"message":"OutsideTransactionDebugLogMessage","level":"DEBUG"***,***"timestamp":1655239563587,"message":"AsyncNoAwaitInTransactionWarningLogMessage","level":"WARN"***,***"timestamp":1655239561273,"message":"AsyncNoAwaitOutsideTransactionInfoLogMessage","level":"INFO"***,***"timestamp":1655239563459,"message":"AsyncNoAwaitInTransactionInfoLogMessage","level":"INFO"***,***"timestamp":1655239560147,"message":"OutsideTransactionErrorLogMessage","level":"ERROR"***,***"timestamp":1655239561196,"message":"AsyncNoAwaitOutsideTransactionDebugLogMessage","level":"DEBUG"***,***"timestamp":1655239559848,"message":"OutsideTransactionInfoLogMessage","level":"INFO"***,***"timestamp":1655239560977,"message":"AsyncOutsideTransactionFatalLogMessage","level":"FATAL"***,***"timestamp":1655239563339,"message":"AsyncNoAwaitInTransactionDebugLogMessage","level":"DEBUG"***,***"timestamp":1655239560851,"message":"AsyncOutsideTransactionErrorLogMessage","level":"ERROR"***,***"timestamp":1655239563863,"message":"AsyncNoAwaitInTransactionFatalLogMessage","level":"FATAL"***,***"timestamp":1655239560721,"message":"AsyncOutsideTransactionWarningLogMessage","level":"WARN"***,***"timestamp":1655239559999,"message":"OutsideTransactionWarningLogMessage","level":"WARN"***,***"timestamp":1655239561376,"message":"AsyncNoAwaitOutsideTransactionWarningLogMessage","level":"WARN"***,***"timestamp":1655239560282,"message":"OutsideTransactionFatalLogMessage","level":"FATAL"***,***"timestamp":1655239561615,"message":"AsyncNoAwaitOutsideTransactionFatalLogMessage","level":"FATAL"***,***"timestamp":1655239560591,"message":"AsyncOutsideTransactionInfoLogMessage","level":"INFO"***]***]
...
2022-06-14 20:46:09,520 NewRelic  DEBUG: [pid: 4548, tid: 5] Invoking "log_event_data" with : [***"common":***"attributes":***"entity.name":"IntegrationTestAppName","entity.guid":"MjczMDcwfEFQTXxBUFBMSUNBVElPTnwyMTc5NTg","hostname":"fv-az171-169.ktbnycnmqcfudbavfw4edtkmzc.xx.internal.cloudapp.net"***,"logs":[***"timestamp":1655239566268,"message":"AsyncNoAwaitWithDelayInTransactionWarningLogMessage","level":"WARN","span.id":"9ec8fb3cc29315a3","trace.id":"c2afe904a9f9dcad8568299345278469"***,***"timestamp":1655239567411,"message":"AsyncNoAwaitWithDelayInTransactionErrorLogMessage","level":"ERROR","span.id":"b4ee375ed1c24a33","trace.id":"4076d8351bef2b97c25c38eafc095748"***,***"timestamp":1655239563991,"message":"AsyncNoAwaitWithDelayInTransactionDebugLogMessage","level":"DEBUG","span.id":"98d23ac257fa3aea","trace.id":"a096882ed81fba73ed0dbe7b9dbe28e6"***,***"timestamp":1655239565142,"message":"AsyncNoAwaitWithDelayInTransactionInfoLogMessage","level":"INFO","span.id":"6af54ed861d64291","trace.id":"faca3641751493d4cb2cc02dcedbdbc1"***]***]
...
2022-06-14 20:46:14,638 NewRelic  DEBUG: [pid: 4548, tid: 12] Invoking "log_event_data" with : [***"common":***"attributes":***"entity.name":"IntegrationTestAppName","entity.guid":"MjczMDcwfEFQTXxBUFBMSUNBVElPTnwyMTc5NTg","hostname":"fv-az171-169.ktbnycnmqcfudbavfw4edtkmzc.xx.internal.cloudapp.net"***,"logs":[***"timestamp":1655239568546,"message":"AsyncNoAwaitWithDelayInTransactionFatalLogMessage","level":"FATAL","span.id":"39f736ee710331af","trace.id":"aa7afd192e2e24a4e21a3dc0243c5fd5"***,***"timestamp":1655239569809,"message":"DifferentTraceAttributesInsideTransactionLogMessage","level":"INFO","span.id":"eaa4bf44e05f75de","trace.id":"f37a8c9d008c4068f11ea96a540461e6"***,***"timestamp":1655239569811,"message":"DifferentTraceAttributesInsideTransactionLogMessage","level":"INFO","span.id":"575291b9284c1b96","trace.id":"f37a8c9d008c4068f11ea96a540461e6"***,***"timestamp":1655239569680,"message":"TraceAttributeOutsideTransactionLogMessage","level":"INFO"***]***]

If you search the sent JSON payload you will not find AsyncNoAwaitInTransactionErrorLogMessage, but all of the surrounding logs (before and after) are in the JSON payload. Interestingly, the seen metric is missing this log as well:

[***"name":"Supportability/Logging/Forwarding/Seen"***,[37,0.0,0.0,0.0,0.0,0.0]]

Initially I thought this might be a race condition during log event harvest, but after investigating more, I am not sure that is what is happening. The counts should be incremented regardless of any race condition.

This warrants more investigation and appears to be an intermittent agent bug instead of a CI flicker.

Expected Behavior
The agent captures and forwards all logs within the sampling limits.

For Maintainers Only or Hero Triaging this bug
Suggested Priority (P1,P2,P3,P4,P5): P4
Suggested T-Shirt size (S, M, L, XL, Unknown): S

@JcolemanNR JcolemanNR added the bug Something isn't working label Jun 16, 2022
@JcolemanNR JcolemanNR added this to the Major Release 10.0 milestone Jun 24, 2022
@angelatan2 angelatan2 removed this from the Major Release 10.0 milestone Jun 27, 2022
@angelatan2 angelatan2 added this to the .NET 7 Preview milestone Jul 5, 2022
@JcolemanNR JcolemanNR self-assigned this Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants