Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronize console writes #227

Merged
merged 9 commits into from
Nov 18, 2021
Merged

Synchronize console writes #227

merged 9 commits into from
Nov 18, 2021

Conversation

mjcarroll
Copy link
Contributor

@mjcarroll mjcarroll commented Jun 22, 2021

Signed-off-by: Michael Carroll michael@openrobotics.org

🎉 New feature

Summary

Provide an internal mechanism for console writes. This will prevent messages from being interleaved.

Test it

Checklist

  • Signed all commits for DCO
  • Added tests
  • Added example and/or tutorial
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge

@mjcarroll mjcarroll requested a review from azeey June 22, 2021 19:48
@github-actions github-actions bot added 🏢 edifice Ignition Edifice 🏯 fortress Ignition Fortress labels Jun 22, 2021
@codecov
Copy link

codecov bot commented Jun 22, 2021

Codecov Report

Merging #227 (2dd8bdd) into ign-common4 (b337f4e) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@               Coverage Diff               @@
##           ign-common4     #227      +/-   ##
===============================================
+ Coverage        77.07%   77.10%   +0.03%     
===============================================
  Files               75       75              
  Lines            10679    10694      +15     
===============================================
+ Hits              8231     8246      +15     
  Misses            2448     2448              
Impacted Files Coverage Δ
src/Console.cc 86.13% <100.00%> (+1.33%) ⬆️
src/Util.cc 84.00% <100.00%> (+0.24%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b337f4e...2dd8bdd. Read the comment docs.

Copy link
Contributor

@chapulina chapulina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested in knowing how this affects performance.

Also, this could be targeted at ign-common3, right?

src/Console.cc Outdated
Console::log << outstr;
Console::log.flush();
{
std::lock_guard<std::mutex> lk(syncMutex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider having separate mutexes for logging to disk and logging to the console? This way different loggers could write to disk and to console at the same time. It may be a small optimization, but when logging a lot it could make a difference?

@mjcarroll
Copy link
Contributor Author

I'd be interested in knowing how this affects performance.

I agree. We don't have any metric for console performance right now, so it's a bit handwavy.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
@mjcarroll
Copy link
Contributor Author

I copied this latency test from a spdlog test I found: https://gist.github.com/KjellKod/4d88035f9af513eed69e

All times are in microseconds:

Without the mutex:

Num Threads: 1, Min: 3.0, Max: 742230.0, Mean: 6.500559, Median: 5.0
Num Threads: 2, Min: 2.0, Max: 134537.0, Mean: 15.518016, Median: 15.0
Num Threads: 4, Min: 2.0, Max: 103224.0, Mean: 32.359236, Median: 43.0
Num Threads: 8, Min: 2.0, Max: 59725.0, Mean: 68.890648, Median: 107.0
Num Threads: 16, Min: 2.0, Max: 57508.0, Mean: 140.911656, Median: 249.0
Num Threads: 32, Min: 2.0, Max: 56070.0, Mean: 285.820097, Median: 536.0

With the mutex:

Num Threads: 1, Min: 3.0, Max: 836221.0, Mean: 6.156619, Median: 4.0
Num Threads: 2, Min: 3.0, Max: 183572.0, Mean: 18.738199, Median: 17.0
Num Threads: 4, Min: 2.0, Max: 64967.0, Mean: 38.969061, Median: 52.0
Num Threads: 8, Min: 2.0, Max: 55175.0, Mean: 85.129123, Median: 137.0
Num Threads: 16, Min: 2.0, Max: 52507.0, Mean: 169.010793, Median: 292.0
Num Threads: 32, Min: 2.0, Max: 50190.0, Mean: 336.777897, Median: 602.0

@chapulina chapulina self-requested a review July 26, 2021 18:09
@azeey
Copy link
Contributor

azeey commented Aug 6, 2021

I found more data races with TSAN and opened a PR suggesting chagnes: #237

* More data race fixes

Signed-off-by: Addisu Z. Taddese <addisu@openrobotics.org>
Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
Copy link
Contributor

@azeey azeey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Since I wrote part of this PR, another pair of eyes might be good.

test/performance/logging.cc Show resolved Hide resolved
@azeey
Copy link
Contributor

azeey commented Aug 23, 2021

The ABI check is not happy because of the new Logger::Buffer::xsputn override and the Logger::Buffer::syncMutex member. Technically, this is true, but Logger::Buffer is an internal class and should not be instantiated by users, so I don't think it's a true ABI break.

@chapulina
Copy link
Contributor

Logger::Buffer is an internal class and should not be instantiated by users

I'm not sure if that's true. The Logger class has virtual members, which implies that it's meant to be extended. Derived classes would have access to Buffer, right?

@azeey
Copy link
Contributor

azeey commented Aug 23, 2021

I'm not sure if that's true. The Logger class has virtual members, which implies that it's meant to be extended. Derived classes would have access to Buffer, right?

So...it looks like Logger::rdbuf returns a pointer the Buffer object, so yeah, any type of user, not just derived classes, would have access to Buffer. Adding a new override would break ABI 😢. But if we can't override xsputn, I don't know how we can prevent a race condition on the internal std::string buffer.

@chapulina chapulina added the beta Targeting beta release of upcoming collection label Sep 8, 2021
@chapulina
Copy link
Contributor

But if we can't override xsputn, I don't know how we can prevent a race condition on the internal std::string buffer.

Looking at the ABI report again, it looks like the only issue is the new mutex. Can we do the static map trick to avoid adding the member variable?

@chapulina chapulina removed the beta Targeting beta release of upcoming collection label Sep 20, 2021
Signed-off-by: Nate Koenig <nate@openrobotics.org>
@nkoenig
Copy link
Contributor

nkoenig commented Oct 15, 2021

Static map used in b08f53a

@azeey
Copy link
Contributor

azeey commented Oct 15, 2021

Thanks @nkoenig for adding the map. My concern with using a global variable is that it violates the GSG guideline about objects of static duration that have non-trivial destructors. I would vote for making the map a function-local static variable that gets allocated on the heap and never gets deleted. I know this would cause errors when running under a memory profiler, but I think it's worth it to avoid the potential bugs that can be caused by having the map be a global variable.

@nkoenig
Copy link
Contributor

nkoenig commented Oct 15, 2021

Thanks @nkoenig for adding the map. My concern with using a global variable is that it violates the GSG guideline about objects of static duration that have non-trivial destructors. I would vote for making the map a function-local static variable that gets allocated on the heap and never gets deleted. I know this would cause errors when running under a memory profiler, but I think it's worth it to avoid the potential bugs that can be caused by having the map be a global variable.

Sounds good. I'll step aside, and let you or someone else take the wheel.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
@mjcarroll
Copy link
Contributor Author

@azeey ready for another look, switched to function local static map pointer.

Copy link
Contributor

@azeey azeey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Checked with TSAN. Just a few nits

src/Console.cc Outdated
@@ -14,6 +14,7 @@
* limitations under the License.
*
*/
#include <map>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be unordered_map since that's what we're using in BufferLock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/Console.cc Outdated Show resolved Hide resolved
@azeey
Copy link
Contributor

azeey commented Nov 11, 2021

Are we planning to backport this to citadel?

Signed-off-by: Michael Carroll <michael@openrobotics.org>
@azeey
Copy link
Contributor

azeey commented Nov 15, 2021

Probably need to increase the timeout on the test. I think that's why windows CI is failing.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
@mjcarroll
Copy link
Contributor Author

Probably need to increase the timeout on the test. I think that's why windows CI is failing.

Done, it looks like we are only failing in the plugin specialization test, which is known flaky.

@mjcarroll mjcarroll merged commit d93da52 into ign-common4 Nov 18, 2021
@mjcarroll mjcarroll deleted the lock_console branch November 18, 2021 22:52
scpeters pushed a commit to scpeters/ign-common that referenced this pull request Jan 19, 2022
Add PBR material class

Signed-off-by: Michael Carroll <michael@openrobotics.org>
@osrf-triage
Copy link

This pull request has been mentioned on Gazebo Community. There might be relevant details there:

https://community.gazebosim.org/t/new-ignition-releases-2022-01-24-citadel-edifice-fortress/1241/1

@azeey azeey mentioned this pull request Jun 24, 2022
8 tasks
azeey added a commit to azeey/gz-common that referenced this pull request Jun 24, 2022
Provide an internal mechanism for console writes. This will prevent messages from being interleaved.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
Co-authored-by: Addisu Z. Taddese <addisu@openrobotics.org>
Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
Co-authored-by: Nate Koenig <nate@openrobotics.org>
azeey added a commit that referenced this pull request Jun 24, 2022
Provide an internal mechanism for console writes. This will prevent messages from being interleaved.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
Co-authored-by: Addisu Z. Taddese <addisu@openrobotics.org>
Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
Co-authored-by: Nate Koenig <nate@openrobotics.org>
luca-della-vedova pushed a commit that referenced this pull request Jul 27, 2022
Provide an internal mechanism for console writes. This will prevent messages from being interleaved.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
Co-authored-by: Addisu Z. Taddese <addisu@openrobotics.org>
Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
Co-authored-by: Nate Koenig <nate@openrobotics.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏢 edifice Ignition Edifice 🏯 fortress Ignition Fortress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants