This commit implements conditional compilation for ASan helper code.
As convey to me by @EricWF, string benchmarks with UBSan have been
experiencing significant performance hit after the commit with ASan
string annotations. This is likely due to the fact that no-op ASan code
is not optimized out with UBSan. To address this issue, this commit
conditionalizes the inclusion of ASan helper function bodies using
`#ifdef` directives. This approach allows us to selectively include only
the ASan code when it's actually required, thereby enhancing
optimizations and improving performance.
While issue was noticed in string benchmarks, I expect same overhead
(just less noticeable) in other containers, therefore `std::vector` and
`std::deque` have same changes.
To see impact of that change run `string.libcxx.out` with UBSan and
`--benchmark_filter=BM_StringAssign` or
`--benchmark_filter=BM_StringConstruct`.
std::midpoint is specified by having a pointer overload in
[numeric.ops.midpoint].
With the way the pointer overload is specified, users can expect that
calling
std::midpoint as `std::midpoint<T>(a, b)` should work, but it didn't in
libc++
due to the way the pointer overload was specified.
Fixes#67046
Notable things in this commit:
* refactors `__indirect_binary_left_foldable`, making it slightly
different (but equivalent) to _`indirect-binary-left-foldable`_, which
improves readability (a [patch to the Working Paper][patch] was made)
* omits `__cpo` namespace, since it is not required for implementing
niebloids (a cleanup should happen in 2024)
* puts tests ensuring invocable robustness and dangling correctness
inside the correctness testing to ensure that the algorithms' results
are still correct
[patch]: https://github.com/cplusplus/draft/pull/6734
Finishes implementation of
- P2093R14 Formatted output
- P2539R4 Should the output of std::print to a terminal be synchronized
with the underlying stream?
Differential Revision: https://reviews.llvm.org/D156609
Currently, when libc++'s views::take specially handles an iota_view, the
addition is done after dereferencing the beginning iterator. However, in
[range.take.overview]/2.3, the addition is done before the dereferencing,
which means that the standard requires the returned iota_view to have
the same W and Bound type in such cases.
This patch fixes that, and also fixes a test that was testing the
incorrect behavior.
Fixes#75611
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.
This patch was generated with:
find libcxx/include libcxx/src -type f \
| grep -v 'module.modulemap.in' \
| grep -v 'CMakeLists.txt' \
| grep -v 'README.txt' \
| grep -v 'libcxx.imp' \
| grep -v '__config_site.in' \
| xargs clang-format -i
A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.
[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
This patch removes assumptions that std::array's iterators are raw
pointers in the source code and in our test suite. While this is true
right now, this doesn't have to be true and ion the future we might want
to enable bounded iterators in std::array, which would require this
change.
This is a pre-requisite for landing #74482
It's not that I have much love for C++03, but we should ensure that it
works. Some recent changes broke this configuration because slightly
older Clang versions don't support attribute syntax in C++03 mode.
Even though constexpr implicitly makes functions inline, we try not to
rely on this implicit effect in the code base. We are mostly consistent
about using `inline` on non-template free-functions to make it clear
that we don't have an ODR violation.
This patch simply fixes a few places where we didn't explicitly use
inline on non-template free functions, presumably because they were
constexpr.
Fixes#75227
Fixes#75002. Found while running libc++'s tests with MSVC's STL.
This is a superset of #74961 that also fixes the product code
and adds a regression test. Thanks again, @cpplearner!
To summarize: `views::split` and `views::lazy_split` aren't unary,
aren't range adaptor **closure** objects, and can't be piped. However,
\[range.adaptor.object\]/8 says that `views::split(pattern)` and
`views::lazy_split(pattern)` produce unary, pipeable, range adaptor
closure objects.
This PR adjusts the test coverage accordingly, allowing it to portably
pass for libc++ and MSVC's STL.
This commit introduces basic annotations for `std::basic_string`,
mirroring the approach used in `std::vector` and `std::deque`.
Initially, only long strings with the default allocator will be
annotated. Short strings (_SSO - short string optimization_) and strings
with non-default allocators will be annotated in the near future, with
separate commits dedicated to enabling them. The process will be similar
to the workflow employed for enabling annotations in `std::deque`.
**Please note**: these annotations function effectively only when libc++
and libc++abi dylibs are instrumented (with ASan). This aligns with the
prevailing behavior of Memory Sanitizer.
To avoid breaking everything, this commit also appends
`_LIBCPP_INSTRUMENTED_WITH_ASAN` to `__config_site` whenever libc++ is
compiled with ASan. If this macro is not defined, string annotations are
not enabled. However, linking a binary that does **not** annotate
strings with a dynamic library that annotates strings, is not permitted.
Originally proposed here: https://reviews.llvm.org/D132769
Related patches on Phabricator:
- Turning on annotations for short strings:
https://reviews.llvm.org/D147680
- Turning on annotations for all allocators:
https://reviews.llvm.org/D146214
This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.
The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.
This Pull Request introduces basic annotations for `std::basic_string`.
Long strings exhibit structural similarities to `std::vector` and will
be annotated accordingly. Short strings are already implemented, but
will be turned on separately in a forthcoming commit. Look at [a
comment](https://github.com/llvm/llvm-project/pull/72677#issuecomment-1850554465)
below to read about SSO issues at current moment.
Due to the functionality introduced in
[D132522](dd1b7b797a),
the `__sanitizer_annotate_contiguous_container` function now offers
compatibility with all allocators. However, enabling this support will
be done in a subsequent commit. For the time being, only strings with
the default allocator will be annotated.
If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
Adding months to a year_month should wrap the year when the month
becomes greater than twelve or less than one.
This fixes the issue for year_month. Other classes with a year and month
do not have this issue. This has been verified and tests are added to
avoid possible regressions.
Also fixes some variable copy-paste errors in the tests.
Fixes https://github.com/llvm/llvm-project/issues/73162
Found while running libc++'s test suite with MSVC's STL.
* I've filed [LWG-4021](https://cplusplus.github.io/LWG/issue4021)
"`mdspan::is_always_meow()` should be `noexcept`" and implemented this
in libc++'s product and test code.
* Use `LIBCPP_STATIC_ASSERT` to avoid issues with `noexcept`
strengthening in MSVC's STL.
+ As permitted by the Standard, MSVC's STL conditionally strengthens
`mdspan` construction/`is_meow`/`stride` and `elements_view` iterator
`base() &&`, and always strengthens `basic_stringbuf` `swap`.
+ In `mdspan/properties.pass.cpp`, this also upgrades runtime `assert`s
to `static_assert`s.
* Improvement: Upgrade `assert` to `static_assert` when inspecting the
`noexcept`ness of `std::ranges::iter_move`. (These `!noexcept` tests
weren't causing issues for MSVC's STL, so I didn't change them to be
libc++-specific.)
This change requires quite a number of changes in the tests; this is not
code I expect people to use in the wild. So I don't expect breakage for
users.
Implements:
- P2905R2 Runtime format strings, as a Defect Report
'isctype' fails in arm64-big-endian because the __regex_word involved
in mask operation is not changed based on the platform endianness, while
the character mask does change.
This makes libc++'s <filesystem> tests compatible with MSVC's STL.
In msvc_stdlib_force_include.h, we need to define 3 more macros:
- _CRT_DECLARE_NONSTDC_NAMES activates the POSIX names of
`getcwd` etc. As the comment explains, we need this because
we test with Clang `-fno-ms-compatibility`, which defines
`__STDC__` to `1`, which causes the UCRT headers to disable
the POSIX names by default.
- Then we need _CRT_NONSTDC_NO_WARNINGS to avoid emitting
deprecation warnings about the POSIX names.
- Finally, we need `NOMINMAX` to seal away the ancient evil.
These macros are documented in https://learn.microsoft.com/en-us/cpp/c-runtime-library/compatibility?view=msvc-170.
As a drive-by change, the patch adds a "simulated" macro for
__has_feature(hwaddress_sanitizer). It also clang-formats all
of msvc_stdlib_force_include.h and removes guards for
__has_builtin(__builtin_source_location) in <source_location>,
since those are not needed anymore.
Found while running libc++'s tests with MSVC's STL.
`ranges::rotate_copy` takes `forward_iterator`s as this test's comment
banner correctly depicts. However, this test had bogus assertions
expecting that `ranges::rotate_copy` would be constrained away for
not-quite-**bidi** iterators. @philnik777 confirmed that these were
copy-paste relics from the `ranges::reverse_copy` test.
I fixed this by replacing the assertions with the test types that aren't
quite **forward** iterators/ranges. Additionally, I noticed that the
top-level `test()` function was missing coverage with the weakest
possible `forward_iterator<int*>`.
This revealed that the product code in `ranges_rotate_copy.h` was
similarly damaged. In addition to fixing it by taking `forward_iterator`
and `forward_range` as depicted in the Standard, this drops the
inclusion of `<__iterator/reverse_iterator.h>` as this algorithm doesn't
need `std::__reverse_range`.
This commit refactors the ASan annotation functions in libc++ to reduce
unnecessary code duplication. Additionally it adds a small optimization.
- Eliminates two redundant function versions by utilizing the
`[[maybe_unused]]` attribute and guarding function bodies with `#ifndef
_LIBCPP_HAS_NO_ASAN`.
- Introduces an additional guard to an auxiliary function, allowing the
removal of a no-ops function body. This approach avoids relying on the
optimizer for code elimination.
Fixes#73043
This is in preparation for clang-formatting the whole code base. These
annotations are required either to avoid clang-format bugs or because
the manually formatted code is significantly more readable than the
clang-formatted alternative. All in all, it seems like very few
annotations are required, which means that clang-format is doing a very
good job in most cases.
In preparation for running clang-format on the whole code base, we are
also removing mentions of the legacy _LIBCPP_INLINE_VISIBILITY macro in
favor of the newer _LIBCPP_HIDE_FROM_ABI.
We're still leaving the definition of _LIBCPP_INLINE_VISIBILITY to avoid
creating needless breakage in case some older patches are checked-in
with mentions of the old macro. After we branch for LLVM 18, we can do
another pass to clean up remaining uses of the macro that might have
gotten introduced by mistake (if any) and remove the macro itself at the
same time. This is just a minor convenience to smooth out the transition
as much as possible.
See
https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
for the clang-format proposal.
The header is used in the dylib, this is not an issue at the moment
since the dylib is built using C++23 but it would be when building
with C++26.
Post release comments in #72496 seem to indicate this removal is an
issue for Fuchsia.
This paper was voted in as a DR, so it's retroactively enabled back to
C++20; the C++ version that introduced std::format.
Implements:
- P2909R4 Fix formatting of code units as integers (Dude, where’s my
``char``?)
Locale objects use atomic reference counting, which may be very
expensive in parallel applications. The classic locale is used by
default by all streams and can be very contended. But it's never
destroyed, so the reference counting is also completely pointless on the
classic locale. Currently ~70% of time in the parallel stringstream
benchmarks is spent in locale ctor/dtor. And the execution radically
slows down with more threads.
Avoid reference counting on the classic locale. With this change
parallel benchmarks start to scale with threads.
This is a re-application of f8afc53d64 (aka PR #72112) which was
reverted in 4e0c48b907 because it broke the sanitizer builds due
to an initialization order fiasco. This issue has now been fixed by
ensuring that the locale is constinit'ed.
Co-authored-by: Dmitry Vyukov <dvyukov@google.com>
<filesystem> is a C++17 addition. In C++11 and C++14 modes, we actually
have all the code for <filesystem> but it is hidden behind a non-inline
namespace __fs so it is not accessible. Instead of doing this unusual
dance, just guard the code for filesystem behind a classic C++17 check
like we normally do.
The intent of these particular functions, since their introduction, was
to NOT be inlinable.
However, the mechanism by which this was accomplished was non-obvious,
and stopped working when string is compiled for C++20.
A longstanding behavior specified by the C++ standard is that
instantiation of the body of a template function is suppressed by an
extern template declaration -- unless the function is explicitly marked
either constexpr or inline. Of course, if the body is not instantiated,
then it cannot possibly be inlined, and thus all the functions listed in
libcxx/include/__string/extern_template_lists.h were uninlineable.
But, in C++20 mode, string functions were annotated constexpr, which
means they _are_ instantiated, and do become inlineable. And, in fact,
they do get inlined, which has caused noticeable binary-size growth for
users.
For example, in C++17,
`std::string f(std::string *in) { return *in; }`
does not inline the copy-constructor call, and instead generates a call
to the exported function defined in the libc++ shared library.
I think we probably don't want to mark all functions that are currently
in the extern template list as noinline, as many of them really are
reasonable inlining candidates. Thus, I've restricted this change to
only the few functions that were clearly intended to be outlined.
See commits like b019c5c037 (and some
others like it) for background, in which functions were removed from the
extern template list in the unstable ABI in order to allow the
short-string case to be inlined, while moving the long-string case to a
separate function, added to the extern template list.
Several experimental headers around std::pmr have been slated for
removal for a while now. This patch actually performs the removal and
cleanups from the code base.
I don't know when, but at some point we lost test coverage to ensue that
all the headers are in the modulemap. This adds a test to make sure all
the headers (excluding a few which shouldn't be part of the modulemap)
are at least mentioned. This also fixes a few headers which bit-rotted
while we were missing the coverage.