Commit
[ASan][libc++] std::basic_string annotations (#72677)
This commit introduces basic annotations for `std::basic_string`, mirroring the approach used in `std::vector` and `std::deque`. Initially, only long strings with the default allocator will be annotated. Short strings (_SSO - short string optimization_) and strings with non-default allocators will be annotated in the near future, with separate commits dedicated to enabling them. The process will be similar to the workflow employed for enabling annotations in `std::deque`.

**Please note**: these annotations function effectively only when the libc++ and libc++abi dylibs are instrumented (with ASan). This aligns with the prevailing behavior of Memory Sanitizer.

To avoid breaking everything, this commit also appends `_LIBCPP_INSTRUMENTED_WITH_ASAN` to `__config_site` whenever libc++ is compiled with ASan. If this macro is not defined, string annotations are not enabled. However, linking a binary that does **not** annotate strings with a dynamic library that annotates strings is not permitted.

Originally proposed here: https://reviews.llvm.org/D132769

Related patches on Phabricator:
- Turning on annotations for short strings: https://reviews.llvm.org/D147680
- Turning on annotations for all allocators: https://reviews.llvm.org/D146214

This PR is part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in the `std::vector` and `std::deque` collections. These enhancements allow ASan to detect cases where the instrumented program accesses memory that lies within a collection's internal allocation but is currently unused. This includes accesses before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and its capacity bounds.
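To make the idea of annotating the unused part of an allocation concrete, here is a minimal sketch of a manually annotated buffer. `TinyBuffer` and `SKETCH_HAS_ASAN` are hypothetical names invented for illustration and are not part of libc++; the only real API used is `__sanitizer_annotate_contiguous_container` from `<sanitizer/common_interface_defs.h>`. The sketch assumes the same layout described above: the addressable part ends one past the null terminator, and everything up to the end of the allocation is poisoned. Without ASan the annotation calls compile to nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Detect ASan for both GCC (__SANITIZE_ADDRESS__) and Clang (__has_feature).
#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_ASAN 1
#  endif
#endif
#ifndef SKETCH_HAS_ASAN
#  define SKETCH_HAS_ASAN 0
#endif

#if SKETCH_HAS_ASAN
#  include <sanitizer/common_interface_defs.h>
#endif

// Hypothetical fixed-capacity character buffer, for illustration only.
class TinyBuffer {
public:
  explicit TinyBuffer(std::size_t capacity)
      : data_(new char[capacity + 1]), size_(0), capacity_(capacity) {
    data_[0] = '\0';
    // A fresh allocation is fully addressable; keep only the terminator slot.
    annotate(data_ + capacity_ + 1, data_ + 1);
  }

  ~TinyBuffer() {
    // Make the whole allocation addressable again before releasing it.
    annotate(data_ + size_ + 1, data_ + capacity_ + 1);
    delete[] data_;
  }

  TinyBuffer(const TinyBuffer&) = delete;
  TinyBuffer& operator=(const TinyBuffer&) = delete;

  // Appends one character; returns false when the buffer is full.
  bool push_back(char c) {
    if (size_ == capacity_)
      return false;
    // Grow the addressable region by one slot before writing into it.
    annotate(data_ + size_ + 1, data_ + size_ + 2);
    data_[size_++] = c;
    data_[size_] = '\0';
    return true;
  }

  void clear() {
    const char* old_mid = data_ + size_ + 1;
    size_ = 0;
    data_[0] = '\0';
    // Poison everything after the terminator again.
    annotate(old_mid, data_ + 1);
  }

  std::size_t size() const { return size_; }
  const char* c_str() const { return data_; }

private:
  // Moves the boundary between addressable and poisoned memory.
  void annotate(const void* old_mid, const void* new_mid) const {
#if SKETCH_HAS_ASAN
    __sanitizer_annotate_contiguous_container(
        data_, data_ + capacity_ + 1, old_mid, new_mid);
#else
    (void)old_mid;
    (void)new_mid;
#endif
  }

  char* data_;
  std::size_t size_;
  std::size_t capacity_;
};
```

When built with `-fsanitize=address`, reading `c_str()[size() + 1]` on such a buffer should be reported as a container overflow, even though the byte belongs to a live heap allocation.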
The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equal` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` range was longer than the `iter2` range, an out-of-bounds read could occur on the `iter2` object. With these annotations enabled, container sanitization would identify and flag this vulnerability.

This Pull Request introduces basic annotations for `std::basic_string`. Long strings exhibit structural similarities to `std::vector` and will be annotated accordingly. Annotations for short strings are already implemented, but will be turned on separately in a forthcoming commit. See [a comment](llvm/llvm-project#72677 (comment)) below to read about the current SSO issues.

Due to the functionality introduced in [D132522](llvm/llvm-project@dd1b7b7), the `__sanitizer_annotate_contiguous_container` function now offers compatibility with all allocators. However, enabling this support will be done in a subsequent commit. For the time being, only strings with the default allocator will be annotated.

If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com

NOKEYCHECK=True
GitOrigin-RevId: 9ed20568e7de53dce85f1631d7d8c1415e7930ae
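The comparison bug mentioned in the commit message can be illustrated with a small sketch. The helper names `same_char_ignoring_case` and `equal_ignoring_case` are hypothetical, and the case-insensitive predicate is only an assumed stand-in for the custom comparison function, which the commit message does not show. The three-iterator form of `std::equal` appears only in a comment, because calling it with a longer first range is undefined behavior. The four-iterator overload (available since C++14) checks both ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical custom comparison: case-insensitive character equality.
inline bool same_char_ignoring_case(char a, char b) {
  return std::tolower(static_cast<unsigned char>(a)) ==
         std::tolower(static_cast<unsigned char>(b));
}

// Risky pattern (kept as a comment because it can read out of bounds):
//
//   std::equal(a.begin(), a.end(), b.begin(), same_char_ignoring_case);
//
// If `a` is longer than `b`, the algorithm keeps reading past the end of `b`.
// When `b` is a long string with spare capacity, those reads can stay inside
// the heap allocation, so only container annotations let ASan notice them.

// Safer variant: the four-iterator overload stops at the end of either range
// and returns false when the lengths differ.
inline bool equal_ignoring_case(const std::string& a, const std::string& b) {
  return std::equal(a.begin(), a.end(), b.begin(), b.end(),
                    same_char_ignoring_case);
}
```

For example, `equal_ignoring_case("hello world", "hello")` returns `false` without touching memory beyond the second string.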
1 parent 1f70899, commit e7db482
Showing 92 changed files with 862 additions and 64 deletions.
63 changes: 63 additions & 0 deletions
test/std/strings/basic.string/string.capacity/reserve_size.asan.pass.cpp
    @@ -0,0 +1,63 @@
    //===----------------------------------------------------------------------===//
    //
    // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
    // See https://llvm.org/LICENSE.txt for license information.
    // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
    //
    //===----------------------------------------------------------------------===//

    // <string>

    // This test verifies that the ASan annotations for basic_string objects remain accurate
    // after invoking basic_string::reserve(size_type __requested_capacity).
    // Different types are used to confirm that ASan works correctly with types of different sizes.
    #include <string>
    #include <cassert>

    #include "test_macros.h"
    #include "asan_testing.h"

    template <class S>
    void test() {
      S short_s1(3, 'a'), long_s1(100, 'c');
      short_s1.reserve(0x1337);
      long_s1.reserve(0x1337);

      LIBCPP_ASSERT(is_string_asan_correct(short_s1));
      LIBCPP_ASSERT(is_string_asan_correct(long_s1));

      short_s1.clear();
      long_s1.clear();

      LIBCPP_ASSERT(is_string_asan_correct(short_s1));
      LIBCPP_ASSERT(is_string_asan_correct(long_s1));

      short_s1.reserve(0x1);
      long_s1.reserve(0x1);

      LIBCPP_ASSERT(is_string_asan_correct(short_s1));
      LIBCPP_ASSERT(is_string_asan_correct(long_s1));

      S short_s2(3, 'a'), long_s2(100, 'c');
      short_s2.reserve(0x1);
      long_s2.reserve(0x1);

      LIBCPP_ASSERT(is_string_asan_correct(short_s2));
      LIBCPP_ASSERT(is_string_asan_correct(long_s2));
    }

    int main(int, char**) {
      test<std::string>();
    #ifndef TEST_HAS_NO_WIDE_CHARACTERS
      test<std::wstring>();
    #endif
    #if TEST_STD_VER >= 11
      test<std::u16string>();
      test<std::u32string>();
    #endif
    #if TEST_STD_VER >= 20
      test<std::u8string>();
    #endif

      return 0;
    }
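The test above relies on `is_string_asan_correct` from `asan_testing.h`, whose implementation is not shown in this excerpt. As a rough illustration of the memory regions such a check is concerned with, the sketch below computes the boundaries described in the commit message: the used part ends one past the null terminator, and the unused tail runs up to the capacity bound. `StringRegions`, `regions_of`, and `unused_chars` are hypothetical names, not libc++ or test-suite APIs, and the sketch assumes that the string's storage holds at least `capacity() + 1` characters, which is true of common implementations but is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical description of a string's storage, for illustration only.
template <class S>
struct StringRegions {
  const typename S::value_type* begin;          // first character
  const typename S::value_type* end_of_used;    // one past the null terminator
  const typename S::value_type* end_of_storage; // one past the capacity bound
};

// Computes the region boundaries from the public string interface.
// Assumes storage for capacity() + 1 characters (an assumption of this sketch).
template <class S>
StringRegions<S> regions_of(const S& s) {
  return {s.data(), s.data() + s.size() + 1, s.data() + s.capacity() + 1};
}

// Number of characters in the tail that annotations would mark as unused.
template <class S>
std::size_t unused_chars(const S& s) {
  const StringRegions<S> r = regions_of(s);
  return static_cast<std::size_t>(r.end_of_storage - r.end_of_used);
}
```

After `reserve(0x1337)` on a 100-character string, `unused_chars` is at least `0x1337 - 100`, and after `clear()` it equals the full capacity. These are the transitions that the annotations have to track.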