Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable memory tools to work with address sanitizer #25

Closed
wants to merge 3 commits into from

Conversation

bhatsach
Copy link

@bhatsach bhatsach commented Feb 27, 2019

Add libasan path as a prefix to LD_PRELOAD before appending memory
tools lib path to it. This is required to enable address sanitizer to work
properly as it needs libasan to be first in the initial library load list.

Signed-off-by: Sachin Suresh Bhat bhatsach@amazon.com

Connects to ros2/ci#245

Add libasan path as a prefix to LD_PRELOAD before appending memory
tools lib path to it. This is required to enable address sanitizer to work
properly as it needs libasan to be first in the initial library load list.

Signed-off-by: Sachin Suresh Bhat <bhatsach@amazon.com>
@bhatsach
Copy link
Author

This PR is needed for enabling address sanitizer nightly job in ROS2 CI. PR link: ros2/ci#245

LD_PRELOAD=$<TARGET_FILE:memory_tools_interpose>)
set(memory_tools_path $<TARGET_FILE:memory_tools_interpose>)
if(OPTIONAL_ASAN_LD_PRELOAD_OVERRIDE)
execute_process(COMMAND bash "-c" "ldconfig -p | grep -e "libasan.*x86_64" | awk '{print $NF}'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an improvement over an ls-based filesystem approach but still not foolproof. I would really like to find a way to get the same shared object the linker would.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me explore if there is anything else we can use here and get back on this :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, the method would also work on macOS (as clang is the default compiler there).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other approach I am able to find online is to perform ldd on an executable. So it will be on the lines of running ldd on the "test_runner" executable inside the build directory for osrf_testing_tools. Trying to see a good way to do this or any other alternative way.

@wjwwood, I limited the above logic to linux mainly because ldconfig is present only on linux

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ldd approach will be safer for systems which have more than one copy of clang/asan installed.

You could use otool -L on macOS instead of ldd.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO the best way is to use -print-file-name from GCC which can be used to get the location of GCC controlled files.
My understanding is that this option is recognized by both clang and GCC.
It's use e.g. here: https://github.com/gimli-org/gimli/blob/master/cmake/FindCHOLMOD.cmake to find gfortran.

Ultimately, what we want is to keep the compiler instrumentation in sync with the supporting library. To achieve that, querying the compiler is probably the safest path forward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomas-moulard that sounds ideal, but I'm not sure how it is supposed to work, on my macOS laptop it just seems to always return what you give it:

william@kisra ~/ros2_ws
% clang++ -print-file-name=libasan
libasan

william@kisra ~/ros2_ws
% clang++ -print-file-name=libasan.so
libasan.so

william@kisra ~/ros2_ws
% clang++ -print-file-name=libasan.dylib
libasan.dylib

william@kisra ~/ros2_ws
% gcc -print-file-name=libasan.dylib
libasan.dylib

william@kisra ~/ros2_ws
% gcc -print-file-name=libasan.so
libasan.so

william@kisra ~/ros2_ws
% gcc -print-file-name=libasan.so.3
libasan.so.3

william@kisra ~/ros2_ws
% gcc -print-file-name=libgfortran.so
libgfortran.so

william@kisra ~/ros2_ws
% gcc -print-file-name=libgfortran.dylib
libgfortran.dylib

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on @thomas-moulard 's suggestion. Will test the changes with gcc --print-file-name once and then update here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I use a newer gcc it works, but is kind of fragile:

william@kisra ~/ros2_ws
% /usr/local/Cellar/gcc/8.2.0/bin/gcc-8 -print-file-name=libasan.dylib
/usr/local/Cellar/gcc/8.2.0/lib/gcc/8/gcc/x86_64-apple-darwin18.2.0/8.2.0/../../../libasan.dylib

william@kisra ~/ros2_ws
% /usr/local/Cellar/gcc/8.2.0/bin/gcc-8 -print-file-name=libasan
libasan

william@kisra ~/ros2_ws
% /usr/local/Cellar/gcc/8.2.0/bin/gcc-8 -print-file-name=libasan.so
libasan.so

So use it with care.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wjwwood - Updated the PR with the above changes. I am still carrying out the testing for the same.

@nuclearsandwich
Copy link
Member

This PR mitigates problems enabling both memory_testing_tools and asan by ensuring that LD_PRELOAD includes libasan first.

If the memory testing tools override the same functions as asan, I don't believe that they can function as expected with libasan present and taking precedence in LD_PRELOAD (@wjwwood do you know definitively?) and so disabling the memory tests might still be the preferable long-term solution.

One further concern I have with this approach is that it doesn't cover the other sanitizers on the roadmap that would also be affected. Expanding this approach naively would require N cmake flags for N sanitizers.

@bhatsach
Copy link
Author

If the memory testing tools override the same functions as asan, I don't believe that they can function as expected with libasan present and taking precedence in LD_PRELOAD (@wjwwood do you know definitively?) and so disabling the memory tests might still be the preferable long-term solution.

I can attempt a job run without the asan mixins but the cmake option set to true. Let me get back on the results for the same.

One further concern I have with this approach is that it doesn't cover the other sanitizers on the roadmap that would also be affected. Expanding this approach naively would require N cmake flags for N sanitizers.

Good point! One thing to note is that all sanitizers do not need their library to be the first in the load list for e.g. TSan does not need this. I think it is mainly the sanitizers which override malloc()

@wjwwood
Copy link
Member

wjwwood commented Feb 27, 2019

If the memory testing tools override the same functions as asan, I don't believe that they can function as expected with libasan present and taking precedence in LD_PRELOAD (@wjwwood do you know definitively?) and so disabling the memory tests might still be the preferable long-term solution.

I don't know, it might work, as memory tools just stores a pointer to malloc, et al which would probably point to the malloc defined in libasan.

Signed-off-by: Sachin Suresh Bhat <bhatsach@amazon.com>
@bhatsach
Copy link
Author

If the memory testing tools override the same functions as asan, I don't believe that they can function as expected with libasan present and taking precedence in LD_PRELOAD (@wjwwood do you know definitively?) and so disabling the memory tests might still be the preferable long-term solution.

I can attempt a job run without the asan mixins but the cmake option set to true. Let me get back on the results for the same.

A few of the memory tool tests are failing. I agree that the long term solution should be to see how/if at all these two can work well together. We can try to enhance memory tool tests to cover some/all of the use cases of address sanitizer as well.

@wjwwood
Copy link
Member

wjwwood commented Feb 28, 2019

A few of the memory tool tests are failing.

Can you share the failures? Do you have any idea why it might be failing based on the output?

We can try to enhance memory tool tests to cover some/all of the use cases of address sanitizer as well.

Maybe I misunderstand, but if you're suggesting we try to eliminate the need for asan with memory tools, then I would say we shouldn't try to do that. In fact, if asan let us get a callback when memory methods are called, I'd consider using it as the solution by default (though there are still some corner cases regarding static initialization that I'd have to see working).

@bhatsach
Copy link
Author

bhatsach commented Feb 28, 2019

A few of the memory tool tests are failing.

Can you share the failures? Do you have any idea why it might be failing based on the output?

Here's a sample CI job which ran by prefixing LD_PRELOAD with libasan path: http://ec2-18-144-12-181.us-west-1.compute.amazonaws.com/job/nightly_linux_sanitizer_mixin/28

I can see a few failures which I am assuming are due to the conflict between memory tools and sanitizer methods:

A few passing tests:

We can try to enhance memory tool tests to cover some/all of the use cases of address sanitizer as well.

Maybe I misunderstand, but if you're suggesting we try to eliminate the need for asan with memory tools, then I would say we shouldn't try to do that. In fact, if asan let us get a callback when memory methods are called, I'd consider using it as the solution by default (though there are still some corner cases regarding static initialization that I'd have to see working).

I am sorry I am not too familiar with memory tools, hence proposed the idea. I totally agree that we should use such a callback feature if it exists in asan. For the short term, having asan lib linked in LD_PRELOAD will help capture all or most of the sanitizer detected issues during address sanitizer run. Keeping the cmake option turned off will allow memory tool tests to run and pass as expected. I am not sure about the long term strategy here as it will require more thinking and a clear understanding of both sanitizer and memory tools, and their future planned changes.

@nuclearsandwich
Copy link
Member

The test_example_memory_tools report has but one test "is_working" which comes back false. I am not sure this is because libasan's instrumented functions are being loaded instead of memory_testing_tools or if there's some implementation issue between them. I would have guessed at the former but I haven't actually determined whether all LD_PRELOAD libraries are loaded or if there's a first come first serve thing happening.


/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/test_osrf_testing_tools_cpp/test/test_example_memory_tools.cpp:62: Failure
Value of: osrf_testing_tools_cpp::memory_tools::is_working()
  Actual: false
Expected: true
[ RUN      ] TestMemoryTools.test_allocation_checking_tools
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:95: Failure
      Expected: 1u
      Which is: 1
To be equal to: unexpected_mallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:96: Failure
      Expected: 1u
      Which is: 1
To be equal to: unexpected_reallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:97: Failure
      Expected: 1u
      Which is: 1
To be equal to: unexpected_callocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:98: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_frees
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:104: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_mallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:105: Failure
      Expected: 1u
      Which is: 1
To be equal to: unexpected_reallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:106: Failure
      Expected: 1u
      Which is: 1
To be equal to: unexpected_callocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:107: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_frees
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:113: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_mallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:114: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_reallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:115: Failure
      Expected: 1u
      Which is: 1
To be equal to: unexpected_callocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:116: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_frees
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:122: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_mallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:123: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_reallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:124: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_callocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:125: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_frees
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:131: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_mallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:132: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_reallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:133: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_callocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:134: Failure
      Expected: 4u
      Which is: 4
To be equal to: unexpected_frees
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:138: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_mallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:139: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_reallocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:140: Failure
      Expected: 2u
      Which is: 2
.... elided several other tests all which had nonzero expected values and an actual value of zero ....
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:148: Failure
      Expected: 2u
      Which is: 2
To be equal to: unexpected_callocs
      Which is: 0
/home/rosbuild/ci_scripts/ws/src/osrf/osrf_testing_tools_cpp/osrf_testing_tools_cpp/test/memory_tools/test_memory_tools.cpp:149: Failure
      Expected: 4u
      Which is: 4
To be equal to: unexpected_frees
      Which is: 0
[  FAILED  ] TestMemoryTools.test_allocation_checking_tools (11 ms)
[ RUN      ] TestMemoryTools.test_example
/home/rosbuild/ci_scripts/ws/build/osrf_testing_tools_cpp/googletest-1.8.0-extracted/googletest-1.8.0-src/googletest/src/gtest.cc:683: Failure
Expected: 1 non-fatal failure
  Actual: 0 failures
/home/rosbuild/ci_scripts/ws/build/osrf_testing_tools_cpp/googletest-1.8.0-extracted/googletest-1.8.0-src/googletest/src/gtest.cc:683: Failure
Expected: 1 non-fatal failure
  Actual: 0 failures
[  FAILED  ] TestMemoryTools.test_example (26 ms)
[----------] 2 tests from TestMemoryTools (37 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (38 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] TestMemoryTools.test_allocation_checking_tools
[  FAILED  ] TestMemoryTools.test_example

@nuclearsandwich
Copy link
Member

@wjwwood is there enough information there for you to tell whether getting both ASan and memory testing tools to interoperate is a reasonable endeavor? If this isn't going to work we need to go back to trying to disable memory_testing_tools when ASan is invoked.

@wjwwood
Copy link
Member

wjwwood commented Mar 5, 2019

@wjwwood is there enough information there for you to tell whether getting both ASan and memory testing tools to interoperate is a reasonable endeavor?

It just tells me it isn't working right now. I don't know why based on the output, and so I can't really say if it's a reasonable endeavor or not.

@thomas-moulard
Copy link

Here is my take on it: both ASAN and osrf_memory_testing_tool are using the same approach. I.e. using dlsym to capture malloc address, define their own malloc version and call the original malloc implementation to "wrap" its behavior.

As ASAN is also instrumenting the code, it's not using the osrf_memory_testing_tool black-box approach where injecting the lib with LD_PRELOAD is enough though.

The "right" behavior would be:

  1. ASAN static init runs, real malloc address is capture, ASAN malloc replaces real malloc
  2. osrf_memory_testing_tool init runs, ASAN malloc is captured (as original_malloc) and osrf_memory_testing_tool replaces ASAN malloc.

What I don't really understand is that given we'll have 3 versions of malloc which will all be IIUC defined as weak symbols, which one will be chosen when the process starts to resolve malloc calls.

My feeling is that the issue we're seeing is due to the original or TSAN malloc being chosen instead of the osrf_memory_testing_tool one. Maybe it's an ordering thing? TSAN lib is passed with LD_PRELOAD last so it wins?

@nuclearsandwich
Copy link
Member

This description of the "right" behavior matches what I'd expect as well. I'm setting up a workspace now to try and figure out why the second step doesn't appear to be functioning correctly (as indicated by the build output)

@wjwwood
Copy link
Member

wjwwood commented Mar 5, 2019

are using the same approach. I.e. using dlsym to capture malloc address, define their own malloc version and call the original malloc implementation to "wrap" its behavior.

This package is specifically using the RTLD_NEXT option to dlsym, which will get the address of the function from the next shared library in the list, so if asan is before memory tools in the preload list, and it uses the same method, then it should get the address of malloc in the memory tools implementation, and memory tools should then get malloc from the system library (which comes later because it is linked against it rather than having it in the ld preload list).

If instead, it is looking up the malloc symbol directly from the system library it would be circumventing memory tools.

@wjwwood
Copy link
Member

wjwwood commented Mar 5, 2019

Thanks for the links @thomas-moulard, I never actually looked at their implementation (it didn't dawn on me that they were doing the same thing) and it's super interesting. There's all the same PTSD inducing things like static memory allocations for dlsym calls as I have in memory tools. :D

@nuclearsandwich
Copy link
Member

Further investigation would be required to figure out exactly why both tools can't coexist. Rather than pursue that at this time, #28 will workaround the issue by improving the control over when memory_tools will be used, allowing us to disable it when running with ASan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants