refactor logcollector memory usage #2637
self.join()
try:
    self.join()
except RuntimeError:
A RuntimeError is raised when the agent tries to stop the thread in order to exit gracefully. This looks like an expected error, so it is ignored here:
File "bin/WALinuxAgent-9.9.9.9-py3.8.egg/azurelinuxagent/ga/collect_logs.py", line 123, in join
self.event_thread.join()
File "/usr/lib/python3.6/threading.py", line 1053, in join
raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread
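For reference, the error is easy to reproduce outside the agent: a thread that calls join() on itself always gets this RuntimeError. The snippet below is a minimal sketch, not the agent's code; `safe_join` and `worker` are hypothetical names used only for illustration.

```python
import threading


def safe_join(thread):
    """Join `thread`; return False if Python refuses the join."""
    try:
        thread.join()
        return True
    except RuntimeError:
        # Raised, for example, when a thread tries to join itself.
        return False


result = {}


def worker():
    # The worker attempts to join the very thread it is running on.
    result["joined"] = safe_join(threading.current_thread())


t = threading.Thread(target=worker)
t.start()
t.join()  # joining from the main thread is fine
```

An alternative to catching the exception is to compare the target thread with `threading.current_thread()` before calling join(), which skips the call only in the self-join case.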
    return LogCollectorMonitorHandler(cgroups)


class LogCollectorMonitorHandler(ThreadHandlerInterface):
Monitor thread
max_usage = metric.value

current_max = max(current_usage, max_usage)
if current_max > LOGCOLLECTOR_MEMORY_LIMIT:
This block checks memory usage against the limit and triggers the exit.
Codecov Report
@@ Coverage Diff @@
## develop #2637 +/- ##
===========================================
+ Coverage 71.91% 72.02% +0.11%
===========================================
Files 103 103
Lines 15654 15745 +91
Branches 2494 2501 +7
===========================================
+ Hits 11258 11341 +83
- Misses 3880 3887 +7
- Partials 516 517 +1
Overall LGTM, just a few minor comments/questions
Description
Problem: The log collector process runs with a 30 MB cgroup memory limit. Some VMs reported OOM kills of the log collector process when it reaches that limit.
Solution: The agent monitors memory usage every 2 seconds in a separate thread and gracefully exits the log collector process when it reaches the 30 MB limit. That way we avoid forced kills by the OOM killer.
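The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `MemoryMonitor`, `read_usage`, and `on_limit` are hypothetical names, and the real handler reads its values from cgroup metrics rather than from a callback.

```python
import threading

MEMORY_LIMIT_BYTES = 30 * 1024 * 1024  # 30 MB limit from the description
POLL_INTERVAL_SECONDS = 2              # polling period from the description


class MemoryMonitor:
    """Polls memory usage on a background thread (hypothetical sketch)."""

    def __init__(self, read_usage, on_limit,
                 limit=MEMORY_LIMIT_BYTES, interval=POLL_INTERVAL_SECONDS):
        self._read_usage = read_usage  # returns (current_usage, max_usage)
        self._on_limit = on_limit      # called once when the limit is exceeded
        self._limit = limit
        self._interval = interval
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        # A thread cannot join itself, so skip the join in that case.
        if self._thread is not threading.current_thread():
            self._thread.join()

    def _run(self):
        while not self._stop_event.is_set():
            current_usage, max_usage = self._read_usage()
            if max(current_usage, max_usage) > self._limit:
                self._on_limit()
                return
            # Sleep for the interval, waking early if stop() is called.
            self._stop_event.wait(self._interval)
```

Waiting on the event instead of calling time.sleep() lets stop() interrupt the 2 second pause, so shutdown does not have to wait for the next poll.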