Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ASan][libc++] std::basic_string annotations #72677

Merged
merged 10 commits into from
Dec 13, 2023

Commits on Dec 12, 2023

  1. [ASan][libc++] std::basic_string annotations

    This commit turns on basic annotations for std::basic_string, similar to those in std::vector and std::deque.
    
    Originally proposed here: https://reviews.llvm.org/D132769
    
    Related patches:
    Turning on annotations for short strings: https://reviews.llvm.org/D147680
    
    Turning on annotations for all allocators: https://reviews.llvm.org/D146214
    
    This revision is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in std::vector, to std::string and std::deque collections. These changes allow ASan to detect cases when the instrumented program accesses memory which is internally allocated by the collection but is still not in-use (accesses before or after the stored elements for std::deque, or between the size and capacity bounds for std::string).
    
    The motivation for the research and those changes was a bug, found by Trail of Bits, in a real code where an out-of-bounds read could happen as two strings were compared via a std::equals function that took iter1_begin, iter1_end, iter2_begin iterators (with a custom comparison function). When object iter1 was longer than iter2, read out-of-bounds on iter2 could happen. Container sanitization would detect it.
    
    This Pull Request adds annotations for `std::basic_string`. Long string is very similar to std::vector, and therefore works well with `__sanitizer_annotate_contiguous_container` from LLVM 15 and therefore annotations works if that implementation is compiled with previous LLVM.
    However, only the standard allocator is supported.
    
    As D132522 extended possibilities of `__sanitizer_annotate_contiguous_container`, now annotations may be supported with all allocators, but that support will be added in a next PR. Only strings with default allocator are annotated with that commit.
    
    If you have any questions, please email:
    - advenam.tacet@trailofbits.com
    - disconnect3d@trailofbits.com
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    3d48196 View commit details
    Browse the repository at this point in the history
  2. Refactor: remove duplicated code with [[__maybe_unused__]]

    This commit refactors the ASan annotation function to reduce unnecessary code duplication.
    
    Suggested here: llvm#73043
    Related PR: llvm#74023
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    a3ecbdb View commit details
    Browse the repository at this point in the history
  3. Address code review feedback for existing code

    Requested changes for existing code.
    Additional changes related to dylibs in the next commit.
    
    This commit extends a few more tests to cover long strings in more situations.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    002ce2d View commit details
    Browse the repository at this point in the history
  4. Add flag informing about dylib ASan instrumentation and guard string …

    …annotations
    
    * CMake files are updated to automatically add the `_LIBCPP_INSTRUMENTED_WITH_ASAN`
    macro to `__config_site`. This macro is used to inform the compiler (in header files)
    that the library has been built with ASan and is instrumented for memory safety checks.
    
    * The `-D_LIBCPP_INSTRUMENTED_WITH_ASAN` flag is added to the libc++ compilation
    command when building with ASan. This flag ensures that basic_string annotations
    are included in the binary, which are needed for ASan to track memory usage
    correctly.
    
    * The string file is updated to use string annotations from headers if and only
    if the code is compiled with `-fsanitize=address` and the dylib is compiled
    with ASan (so, `_LIBCPP_INSTRUMENTED_WITH_ASAN` is defined). This ensures that
    the string annotations are only included when they are supported.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    70dcc37 View commit details
    Browse the repository at this point in the history
  5. Add missing annotations checks

    Some files are missing annotations checks, this commit adds those checks.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    2fd41a5 View commit details
    Browse the repository at this point in the history
  6. Additional test cases

    This commit adds a few test cases to check ASan annotations for SSO/no-SSO.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    2031dfb View commit details
    Browse the repository at this point in the history
  7. [ASan] Annotations fix

    Resolved an issue with ASan annotations that was causing failure of `substr_rvalue` tests.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    c1c7af6 View commit details
    Browse the repository at this point in the history
  8. [ASan] SSO annotations fixes and refactor

    This commit fixes issues related to annotations and instrumentation, causing problems only with SSO annotations.
    
    First, it fixes an issue where `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS` doesn't work in `basic_string<_CharT, _Traits, _Allocator>::__move_assign(basic_string& __str, true_type)``.
    
    Second, it adds missing SSO annotations and refactors annotations around `__uninitialized_size_tag`.
    
    To reproduce the errors in this commit, change `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED` to true. This flag will be changed to true (and probably deleted) soon.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    1dd7e85 View commit details
    Browse the repository at this point in the history
  9. [ASan] Cleaning _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS

    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    83865c0 View commit details
    Browse the repository at this point in the history
  10. Updating CMake files related to __config_site

    Based on code review feedback.
    Advenam Tacet committed Dec 12, 2023
    Configuration menu
    Copy the full SHA
    8bb077f View commit details
    Browse the repository at this point in the history