fix(system_monitor): multithreading support for boost::process #1714


@v-nakayama7440-esol v-nakayama7440-esol commented Aug 29, 2022

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>

Description

boost::process::pipe creates its file descriptors internally. These file descriptors are not safe to use from multiple threads because O_CLOEXEC is not set on them, so they stay open across exec in any child process created while they exist. Because of this, when the threads of SystemMonitor each use boost::process to create a child process, the file descriptors of boost::process::pipe conflict between the child processes. As a result, a child process waits for the file descriptor to be released.

In this PR, the SystemMonitor application creates the file descriptors itself with O_CLOEXEC set and passes them to boost::process::pipe, which eliminates the wait for a child process to release a file descriptor.
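
The approach described above can be sketched as follows. This is a minimal illustration and not the code from this PR: `CloexecPipe`, `make_cloexec_pipe`, and `has_cloexec` are hypothetical names, and the sketch uses only POSIX/Linux calls (`pipe2`, `fcntl`) so that it stays independent of Boost.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <stdexcept>

// Hypothetical helper type (not from the PR): the two ends of a pipe.
struct CloexecPipe
{
  int read_fd;
  int write_fd;
};

// Create a pipe whose descriptors are closed automatically on exec().
// pipe2() sets O_CLOEXEC atomically at creation time, so there is no window
// in which a child forked by another thread can inherit the descriptors
// without the flag.
CloexecPipe make_cloexec_pipe()
{
  int fds[2];
  if (pipe2(fds, O_CLOEXEC) != 0) {
    throw std::runtime_error("pipe2 failed");
  }
  return CloexecPipe{fds[0], fds[1]};
}

// Return true if the close-on-exec flag is set on the descriptor.
bool has_cloexec(int fd)
{
  const int flags = fcntl(fd, F_GETFD);
  return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

The two descriptors would then be handed to boost::process::pipe instead of letting it create its own pair. On POSIX this is presumably done through the constructor that takes a source and a sink handle; treat that detail as an assumption and check it against the Boost.Process version in use.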

Related links

Tests performed

The timestamp interval of Process Monitor's messages on the /diagnostics topic is normally about 1 second (the diagnostic_updater event execution cycle is 1 Hz). Before the fix, however, the interval was occasionally about 2 seconds.

Graph before fix (vertical axis unit is msec, horizontal axis unit is message order):
image

After the fix, we confirmed that the timestamp interval of Process Monitor's messages on the /diagnostics topic no longer reached 2 seconds. Monitors other than Process Monitor were also checked in the same way.

Graph after fix (vertical axis unit is msec, horizontal axis unit is message order):
image
image
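
The interval check behind these graphs could be reproduced with something like the sketch below. It is an assumption about how such a check might be written, not the tooling used for this PR: `intervals_ms` and `count_over` are hypothetical names, and the timestamps are expected as seconds in message order.

```cpp
#include <cstddef>
#include <vector>

// Differences between consecutive message timestamps, in milliseconds.
// Input timestamps are in seconds, in message order.
std::vector<double> intervals_ms(const std::vector<double> & stamps_sec)
{
  std::vector<double> out;
  for (std::size_t i = 1; i < stamps_sec.size(); ++i) {
    out.push_back((stamps_sec[i] - stamps_sec[i - 1]) * 1000.0);
  }
  return out;
}

// Number of intervals longer than the given threshold in milliseconds.
std::size_t count_over(const std::vector<double> & intervals, double threshold_ms)
{
  std::size_t n = 0;
  for (const double v : intervals) {
    if (v > threshold_ms) {
      ++n;
    }
  }
  return n;
}
```

With a 1 Hz cycle, a threshold of around 1500 msec would separate the normal intervals of about 1 second from the occasional intervals of about 2 seconds; the threshold value is illustrative.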

Additional tests performed

To check that this fix has no side effects, verified that all /diagnostics topics are published periodically and report OK during normal operation.

Notes for reviewers

Pre-review checklist for the PR author

The PR author must check the checkboxes below when creating the PR.

In-review checklist for the PR reviewers

The PR reviewers must check the checkboxes below before approval.

  • The PR follows the pull request guidelines.
  • The PR has been properly tested.
  • The PR has been reviewed by the code owners.

Post-review checklist for the PR author

The PR author must check the checkboxes below before merging.

  • There are no open discussions or they are tracked via tickets.
  • The PR is ready for merge.

After all checkboxes are checked, anyone who has write access can merge the PR.

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
@ito-san ito-san self-requested a review August 29, 2022 05:53

ito-san commented Aug 29, 2022

@v-nakayama7440-esol Could you please add some measurements, such as a visualization of the interval before and after the fix, to "Tests performed" above?

@v-nakayama7440-esol

@ito-san
The graphs for Process Monitor before the fix, and for Process Monitor and CPU Monitor after the fix, are now attached under "Tests performed".

@ito-san ito-san added the type:bug (Software flaws or errors.) and component:system (System design and integration. (auto-assigned)) labels Aug 30, 2022

codecov bot commented Aug 30, 2022

Codecov Report

Merging #1714 (9420251) into main (ae56406) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main    #1714      +/-   ##
==========================================
- Coverage   10.67%   10.61%   -0.06%     
==========================================
  Files        1109     1109              
  Lines       78619    79029     +410     
  Branches    18587    18587              
==========================================
  Hits         8392     8392              
- Misses      61312    61722     +410     
  Partials     8915     8915              
Flag           Coverage Δ           *Carryforward flag
differential   0.00% <0.00%> (?)
total          10.65% <0.00%> (ø)   Carriedforward from ae56406

*This pull request uses carry forward flags.

Impacted Files                                           Coverage Δ
...ystem_monitor/src/cpu_monitor/cpu_monitor_base.cpp    0.00% <0.00%> (ø)
...tem/system_monitor/src/hdd_monitor/hdd_monitor.cpp    0.00% <0.00%> (ø)
...tem/system_monitor/src/mem_monitor/mem_monitor.cpp    0.00% <0.00%> (ø)
...tem/system_monitor/src/ntp_monitor/ntp_monitor.cpp    0.00% <0.00%> (ø)
...em_monitor/src/process_monitor/process_monitor.cpp    0.00% <0.00%> (ø)



ito-san commented Aug 31, 2022

Verified that the issue was gone after a 1-hour test in the cargo project.
image


ito-san commented Aug 31, 2022

Verified in the bus project with Humble.
image


@ito-san ito-san left a comment

LGTM

@ito-san ito-san merged commit 8db3351 into autowarefoundation:main Aug 31, 2022
boyali referenced this pull request in boyali/autoware.universe Sep 28, 2022
…#1714)

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
boyali referenced this pull request in boyali/autoware.universe Oct 3, 2022
…#1714)

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
boyali referenced this pull request in boyali/autoware.universe Oct 3, 2022
…#1714)

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
yukke42 pushed a commit to tzhong518/autoware.universe that referenced this pull request Oct 14, 2022
…arefoundation#1714)

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
boyali referenced this pull request in boyali/autoware.universe Oct 19, 2022
…#1714)

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
yn-mrse referenced this pull request in tier4/autoware.universe Nov 3, 2022
Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
yn-mrse referenced this pull request in tier4/autoware.universe Nov 4, 2022
#170)

Signed-off-by: v-nakayama7440-esol <v-nakayama7440@esol.co.jp>
Co-authored-by: v-nakayama7440-esol <97144416+v-nakayama7440-esol@users.noreply.github.com>
technolojin pushed a commit to technolojin/autoware.universe that referenced this pull request Dec 26, 2024
…ainst goal position (autowarefoundation#1714)

po

Signed-off-by: yuki-takagi-66 <yuki.takagi@tier4.jp>