Use cache instead of map in the plugin promwrapper #14800
Below is an analysis created by an LLM. Be mindful of hallucinations and verify accuracy.
WF: CI Core#88cc1ec1
HTTP 503 Service Temporarily Unavailable error during re-run of flakey tests: [Flakey Test Detection]
Source of Error:
Why: The error occurred because the service used to re-run flakey tests was temporarily unavailable, likely due to server-side issues such as maintenance or overload.
Suggested fix: Implement retry logic with exponential backoff in the test runner script to handle temporary service unavailability gracefully. Alternatively, check the service status or contact the service provider for more details.
AER Report: CI Core ran successfully ✅
AER Report: Operator UI CI
aer_workflow, commit, Breaking Changes GQL Check
1. Workflow triggered but failed to complete successfully: [Breaking Changes GQL Check]
Source of Error:
Why: The triggered workflow did not complete successfully. The status check returned a conclusion of "failure" instead of "success," causing the upstream job to propagate the failure.
Suggested fix: Investigate the logs of the downstream workflow (ID: 11664801213) to identify the specific cause of failure and address the underlying issue. Ensure all dependencies and configurations are correct.
Looks good. Just one question about the approach.
Quality Gate passed
Context:
A memory leak was detected in CCIP nodes, traced to the `reportEndTimes` map in `promwrapper/plugin.go`. This map stores timestamps during the `Report` method but relies on `ShouldAcceptFinalizedReport` to remove entries. However, in OCR2, if `Report` returns false on some nodes, `ShouldAcceptFinalizedReport` may not always run, leaving stale entries that accumulate over time.

Solution:
We replaced the `reportEndTimes` map with a cache that supports expiration. As the data is only used for logging, this change avoids memory leaks without affecting performance or critical paths. Entries are now automatically cleaned up, preventing unbounded memory growth.

Testing:
This fix was tested in the `beta-testnet` environment with the following results:
With fix: over ~5 days, memory usage stabilized between 217 and 242 MB.
Without fix: over ~4 days, memory usage increased from 200 to 347 MB.
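
To illustrate the idea behind the fix, here is a minimal sketch of an expiring cache standing in for a plain map. It is not the code from this PR: `reportKey`, `expiringTimes`, `newExpiringTimes`, and `runDemo` are hypothetical names, the one-minute TTL is arbitrary, and the eviction strategy (a sweep on every `Set`) is an assumption chosen for brevity. The description above does not say which cache implementation is used, and a real cache library would more likely evict from a background goroutine.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reportKey is a hypothetical identifier for a report round.
type reportKey struct {
	Epoch uint32
	Round uint8
}

// entry pairs the stored timestamp with its expiry deadline.
type entry struct {
	endTime   time.Time
	expiresAt time.Time
}

// expiringTimes is a small TTL cache. Entries that are never popped
// are dropped once their TTL has elapsed, so the map cannot grow forever.
type expiringTimes struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock so expiry is testable
	entries map[reportKey]entry
}

func newExpiringTimes(ttl time.Duration, now func() time.Time) *expiringTimes {
	return &expiringTimes{ttl: ttl, now: now, entries: make(map[reportKey]entry)}
}

// Set stores endTime under k and sweeps out expired entries.
// The O(n) sweep keeps the sketch short; it is not tuned for large caches.
func (c *expiringTimes) Set(k reportKey, endTime time.Time) {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := c.now()
	for key, e := range c.entries {
		if !now.Before(e.expiresAt) {
			delete(c.entries, key)
		}
	}
	c.entries[k] = entry{endTime: endTime, expiresAt: now.Add(c.ttl)}
}

// Pop removes k and returns its timestamp if it was present and not expired.
func (c *expiringTimes) Pop(k reportKey) (time.Time, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[k]
	if !ok {
		return time.Time{}, false
	}
	delete(c.entries, k)
	if !c.now().Before(e.expiresAt) {
		return time.Time{}, false
	}
	return e.endTime, true
}

// Len reports how many entries are currently held.
func (c *expiringTimes) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.entries)
}

// runDemo records three rounds, pops only one of them (simulating the
// accept step being skipped for the others), then advances the clock.
func runDemo() (beforeExpiry, afterExpiry int) {
	clock := time.Unix(0, 0)
	cache := newExpiringTimes(time.Minute, func() time.Time { return clock })

	for r := uint8(1); r <= 3; r++ {
		cache.Set(reportKey{Epoch: 1, Round: r}, clock)
	}
	cache.Pop(reportKey{Epoch: 1, Round: 2})
	beforeExpiry = cache.Len() // rounds 1 and 3 are still held

	clock = clock.Add(2 * time.Minute)
	cache.Set(reportKey{Epoch: 2, Round: 1}, clock) // sweep drops stale rounds
	afterExpiry = cache.Len()
	return beforeExpiry, afterExpiry
}

func main() {
	before, after := runDemo()
	fmt.Println(before, after)
}
```

With a plain map, the two entries that were never popped would stay forever. Here they disappear on the first write after their TTL has passed, which is acceptable when the stored values only feed logging.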