1949 Commits

Author SHA1 Message Date
Jon Chesterfield
65a4ce09f8 [libc] Can build amdgpu libc even if rocm is missing
Clang defaults to failing to build if it can't find rocm device libs

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D153581
2023-06-22 21:18:44 +01:00
Jon Chesterfield
578d229a1a [libc] Move fences into outbox/wait-for-ownership test
Also moves the wait-until-inbox-changes test into a shared method.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D153573
2023-06-22 18:14:41 +01:00
Jun Zhang
ce378fcb9e [libc][NFC] Simplify return value logic in set_thread_ptr()
Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D153572
2023-06-23 00:47:48 +08:00
Jon Chesterfield
ba01a2c608 [libc] Add memory fences to device-local locking calls
This makes the interface less error prone. The acquire was previously
forgotten. Release is currently missing if recv() is the last operation made
before close.
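
As an illustration of why putting the fences inside the locking calls is less error prone, here is a minimal sketch in portable C++. `DeviceLockSketch` and its members are hypothetical names, and `std::atomic` stands in for whatever GPU primitives the real code uses; this is not the code from the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical lock type for illustration only.
struct DeviceLockSketch {
  std::atomic<std::uint32_t> word{0};

  // Returns true if this caller took the lock.
  bool try_lock() {
    const bool acquired = word.exchange(1, std::memory_order_relaxed) == 0;
    // The acquire fence lives inside the call, so a caller cannot forget it.
    std::atomic_thread_fence(std::memory_order_acquire);
    return acquired;
  }

  void unlock() {
    assert(word.load(std::memory_order_relaxed) == 1 && "unlock without lock");
    // The release fence orders earlier writes before the lock is dropped.
    std::atomic_thread_fence(std::memory_order_release);
    word.store(0, std::memory_order_relaxed);
  }
};
```

With this shape, callers only see `try_lock()` and `unlock()`; the ordering requirements travel with the calls.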

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D153571
2023-06-22 17:46:09 +01:00
Petr Hosek
f3b64887de [libc] Place headers in the right include directory
When LLVM_ENABLE_PER_TARGET_RUNTIME_DIR is enabled, place headers
in `include/<target>` directory, otherwise use `include/`.

Differential Revision: https://reviews.llvm.org/D152592
2023-06-22 06:22:32 +00:00
Joseph Huber
e0b487bfc0 [libc] Rename and install the RPC server interface
This patch prepares the RPC interface to be installed. We place this in
the existing `llvm-gpu-none` directory as it will also give us access to
the generated `libc` headers for the opcodes.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D153040
2023-06-21 11:26:24 -05:00
Joseph Huber
4272d09196 [libc][NFC] Cleanup the RPC server implementation prior to installing
This does some simple cleanup prior to landing the patch to install
these.

Differential Revision: https://reviews.llvm.org/D153439
2023-06-21 11:14:20 -05:00
Joseph Huber
1f99526d9d [libc][NFC] Move __has_builtin to LIBC_HAS_BUILTIN
Summary:
These should use the common `LIBC_HAS_BUILTIN` even if we will only
compile this with `clang`.
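
A common wrapper of this kind might look like the sketch below. `SKETCH_HAS_BUILTIN` and `count_leading_zeros_sketch` are hypothetical names chosen for illustration; the actual definition of `LIBC_HAS_BUILTIN` may differ. The sketch assumes `unsigned int` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for LIBC_HAS_BUILTIN. The check is split in two
// because `defined(__has_builtin) && __has_builtin(x)` in a single #if does
// not preprocess on compilers that lack __has_builtin.
#if defined(__has_builtin)
#define SKETCH_HAS_BUILTIN(x) __has_builtin(x)
#else
#define SKETCH_HAS_BUILTIN(x) 0
#endif

// Counts the leading zero bits of a non-zero 32-bit value.
unsigned count_leading_zeros_sketch(std::uint32_t value) {
  assert(value != 0 && "the builtin's result is undefined for zero");
#if SKETCH_HAS_BUILTIN(__builtin_clz)
  return static_cast<unsigned>(__builtin_clz(value));
#else
  // Portable fallback: scan from the most significant bit.
  unsigned count = 0;
  for (std::uint32_t bit = 0x80000000u; (value & bit) == 0; bit >>= 1)
    ++count;
  return count;
#endif
}
```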
2023-06-21 09:50:40 -05:00
Guillaume Chatelet
bd1cba9f4f Revert D148717 "[libc] Improve memcmp latency and codegen"
Once integrated in our codebase the patch triggered a bunch of failing
tests. We do not yet understand where the bug is but we revert it to
move forward with integration.
This reverts commit 5e32765c15ab8df3d2635a2bb5078c5b1d5714d5.
2023-06-21 12:37:14 +00:00
Petr Hosek
9fa7998555 [libc] Support for riscv32
This change adds basic support for baremetal riscv32 configuration.

Differential Revision: https://reviews.llvm.org/D152563
2023-06-21 07:11:22 +00:00
Siva Chandra Reddy
75d70b7306 [libc] Make close function of the internal File class cleanup the file object.
Before this change, a separate static method named cleanup was used to
cleanup the file. Instead, now the close method cleans up the full file
object using the platform's close function.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D153377
2023-06-21 05:05:04 +00:00
Joseph Huber
ee6ace27e0 [libc] Remove disabled pass after performance improvement
This pass used to cause huge compile time regressions. That has been
addressed and the pass can now be re-added.

Differential Revision: https://reviews.llvm.org/D153374
2023-06-20 15:48:02 -05:00
Joseph Huber
964a535bfa [libc] Remove flexible array and replace with a template
Currently the implementation of the RPC interface requires a flexible
struct. This caused problems when compiling the RPC server with GCC as
would be required if trying to export the RPC server interface. This
required that we either move to the `x[1]` workaround or make it a
template parameter. While just using `x[1]` would be much less noisy,
this is technically undefined behavior. For this reason I elected to use
templates.

The downside to using templates is that the server code must now be able
to handle multiple different types at runtime. I was unable to find a
good solution that didn't rely on type erasure so I simply branch off of
the given value.
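
A rough sketch of the trade-off described above, with hypothetical names (`BufferSketch`, `PacketSketch`, `packet_bytes_for`) and assumed lane sizes of 1, 32 and 64; the real RPC types and supported sizes may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical per-lane payload.
struct BufferSketch {
  std::uint64_t data[8];
};

// The array length is a template parameter instead of a flexible array
// member (`BufferSketch slots[];`), which is not standard C++.
template <std::uint32_t lane_size> struct PacketSketch {
  std::uint64_t mask;
  BufferSketch slots[lane_size];
};

template <std::uint32_t lane_size> std::size_t packet_bytes() {
  return sizeof(PacketSketch<lane_size>);
}

// The server only learns the lane size at runtime, so it branches on the
// value and forwards to the matching instantiation.
std::size_t packet_bytes_for(std::uint32_t lane_size) {
  switch (lane_size) {
  case 1:
    return packet_bytes<1>();
  case 32:
    return packet_bytes<32>();
  case 64:
    return packet_bytes<64>();
  default:
    return 0; // Unsupported lane size in this sketch.
  }
}
```

Each supported size costs one extra instantiation, which is the noise the message refers to.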

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D153304
2023-06-20 15:22:37 -05:00
Mikhail R. Gadelha
a2df87c2b0 [libc] Fix libmath test compilation when using UInt<T>
This patch:
(1) adds the add_with_carry_const and sub_with_borrow_const constexpr calls
to add and sub, respectively. Both add and sub are constexpr calls and
were calling the non-constexpr versions of add_with_carry/sub_with_borrow.
(2) adds explicit UIntType construct calls in some fp tests.
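
To illustrate point (1): a constexpr function can only be evaluated at compile time if every helper it calls is constexpr too. The sketch below uses hypothetical names (`SumCarrySketch`, `UInt128Sketch`, `add_sketch`) and assumes the incoming carry is 0 or 1; it is not the code from the patch.

```cpp
#include <cstdint>

struct SumCarrySketch {
  std::uint64_t sum;
  std::uint64_t carry;
};

// constexpr helper: adds two words plus an incoming carry (0 or 1).
constexpr SumCarrySketch add_with_carry_const_sketch(std::uint64_t a,
                                                     std::uint64_t b,
                                                     std::uint64_t carry_in) {
  const std::uint64_t partial = a + b;
  const std::uint64_t carry1 = partial < a ? 1 : 0;
  const std::uint64_t sum = partial + carry_in;
  const std::uint64_t carry2 = sum < partial ? 1 : 0;
  return {sum, carry1 + carry2};
}

struct UInt128Sketch {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Usable in constant expressions only because the helper is constexpr.
constexpr UInt128Sketch add_sketch(UInt128Sketch a, UInt128Sketch b) {
  const SumCarrySketch low = add_with_carry_const_sketch(a.lo, b.lo, 0);
  const SumCarrySketch high = add_with_carry_const_sketch(a.hi, b.hi, low.carry);
  return {low.sum, high.sum};
}

// Compile-time check: the carry out of the low word reaches the high word.
static_assert(add_sketch({~0ull, 0}, {1, 0}).hi == 1, "carry must propagate");
```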

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D150223
2023-06-20 15:41:18 -03:00
Tue Ly
46aa659a32 [libc][math] Improve exp2f performance.
Re-organize special cases and add a special case when `|x| < 2^-5`.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153134
2023-06-20 09:34:20 -04:00
Tue Ly
0ae409c0d7 [libc][math] Slightly improve sinhf and coshf performance.
Re-order exceptional branches and slightly adjust the evaluation.
Depends on https://reviews.llvm.org/D153026 .

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153062
2023-06-20 09:27:28 -04:00
Tue Ly
5dbd5118ec [libc][math] Improve tanhf performance.
Re-order exceptional branches and slightly adjust the evaluation.

Performance tested with the CORE-MATH project on AMD EPYC 7B12 (clocks/op)

Reciprocal throughputs:
```
--- BEFORE ---

$ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 7.794 + 0.102 clc/call; Median-Min = 0.066 clc/call; Max = 8.267 clc/call;
[####################] 100 %. (with -msse4.2)
Ntrial = 20 ; Min = 10.783 + 0.172 clc/call; Median-Min = 0.144 clc/call; Max = 11.446 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 18.926 + 0.381 clc/call; Median-Min = 0.342 clc/call; Max = 19.623 clc/call;

--- AFTER ---

$ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 6.598 + 0.085 clc/call; Median-Min = 0.052 clc/call; Max = 6.868 clc/call;
[####################] 100 %  (with -msse4.2)
Ntrial = 20 ; Min = 9.245 + 0.304 clc/call; Median-Min = 0.248 clc/call; Max = 10.675 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 11.724 + 0.440 clc/call; Median-Min = 0.444 clc/call; Max = 12.262 clc/call;
```

Latency:
```
--- BEFORE ---

$ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 38.821 + 0.157 clc/call; Median-Min = 0.122 clc/call; Max = 39.539 clc/call;
[####################] 100 %. (with -msse4.2)
Ntrial = 20 ; Min = 44.767 + 0.766 clc/call; Median-Min = 0.681 clc/call; Max = 45.951 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 55.055 + 1.512 clc/call; Median-Min = 1.571 clc/call; Max = 57.039 clc/call;

--- AFTER ---

$ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 36.147 + 0.194 clc/call; Median-Min = 0.181 clc/call; Max = 36.536 clc/call;
[####################] 100 %  (with -msse4.2)
Ntrial = 20 ; Min = 40.904 + 0.728 clc/call; Median-Min = 0.557 clc/call; Max = 42.231 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 55.776 + 0.557 clc/call; Median-Min = 0.542 clc/call; Max = 56.551 clc/call;
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153026
2023-06-20 09:25:07 -04:00
Siva Chandra Reddy
21e1651c0c [libc] Remove the requirement of a platform-flush operation in File abstraction.
The libc flush operation is not supposed to trigger a platform level
flush operation. See "Notes" on this Linux man page:
    https://man7.org/linux/man-pages/man3/fflush.3.html

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153182
2023-06-19 18:38:29 +00:00
Joseph Huber
5a8fc41937 [libc] Disable atomic optimizations for libc AMDGPU builds
Recently the AMDGPU backend automatically enables a pass to optimize
atomics. This results in the LTO build taking about 10x longer in all
cases. For now we disable this by default as was the case before the
patch in D152649.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D153232
2023-06-19 03:25:51 -05:00
Alfred Persson Forsberg
c32ba7d5e0 [libc] [NFC] malloc.h: fix include guard typo
Differential Revision: https://reviews.llvm.org/D153231
2023-06-18 23:08:25 +01:00
Joseph Huber
70b1c3999c [libc][Docs] Add some motivation for the GPU libc
This provides some basic motivation behind the GPU libc. Suggestions are welcome.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D152028
2023-06-16 15:19:45 -05:00
Joseph Huber
d663da07e3 [libc][Obvious] Fix problem with the variable used for the jobs
Summary:
There was an issue with the variable we were using to conditionally set
the job number for the GPU.
2023-06-16 14:11:53 -05:00
Joseph Huber
27f326334f [libc] Add an option to use a job pool for GPU tests
Currently the GPU has restrictions on how many tests can be run in
parallel due to resource constraints. However, building these tests can
take a long time so we want to be able to build them in parallel. This
patch introduces the option `LIBC_GPU_TEST_JOBS` which is set to the
number of threads to run in parallel.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D153157
2023-06-16 14:06:16 -05:00
Joseph Huber
485e2de6d5 [libc][nfc] Silence two warnings in tests
These currently give warnings for unused variables or a default case
where everything is covered.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D153137
2023-06-16 12:52:06 -05:00
Joseph Huber
ed34cb2cd7 [libc] Add a test for fputs to check using stdout and stderr
This patch adds a test directly for the `fputs` function similar to the
existing `puts` test. This lets us know that the default file pointers
are functional and the `fputs` interface works.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152288
2023-06-16 11:01:55 -05:00
Alex Brachet
61c9052cec [libc] Add LIBC_INLINE_VAR for inline variables
These are the only variables I could find that use LIBC_INLINE. Note, these are namespace scoped constexpr so local linkage is implied. inline is useful here to silence clang's unused-const-variable warning. For Fuchsia, the distinction between LIBC_INLINE and LIBC_INLINE_VAR is helpful because we define LIBC_INLINE as `[[gnu::always_inline]] inline` when building with gcc. This isn't meaningful on variables.

Alternatively, we could make these variables simply constexpr and also add `[[maybe_unused]]`
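
A minimal sketch of the distinction, using hypothetical `SKETCH_INLINE` and `SKETCH_INLINE_VAR` macros rather than the real LIBC_ ones. Applying the attribute for every GCC build is an assumption made here for brevity, and inline variables require C++17.

```cpp
// Function macro: may carry function-only attributes.
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_INLINE [[gnu::always_inline]] inline
#else
#define SKETCH_INLINE inline
#endif

// Variable macro: plain `inline`, since always_inline means nothing here.
#define SKETCH_INLINE_VAR inline

namespace sketch {

// Namespace-scope constexpr variable marked with the variable macro.
SKETCH_INLINE_VAR constexpr unsigned MANTISSA_WIDTH = 23;

// Function marked with the function macro.
SKETCH_INLINE unsigned mantissa_mask() { return (1u << MANTISSA_WIDTH) - 1u; }

} // namespace sketch
```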

Reviewed By: sivachandra, mcgrathr

Differential Revision: https://reviews.llvm.org/D152951
2023-06-16 15:46:32 +00:00
Joseph Huber
490958b9ea [libc][obvious] Actually return the value from malloc for NVPTX
Switching to this interface we neglected to actually write the output
from the malloc call to the RPC buffer. Fix this so the tests pass
again.

Differential Revision: https://reviews.llvm.org/D153069
2023-06-15 15:13:11 -05:00
Joseph Huber
7e8b0c27f2 [libc] Disable the strtod and strtold tests on NVPTX
These tests have a single line that fails with a value off-by-one, see
https://lab.llvm.org/buildbot/#/builders/46/builds/50055/steps/12/logs/stdio .
Disable these for now so we can figure out what the error is later.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D153056
2023-06-15 13:29:42 -05:00
Joseph Huber
dcdfc963d7 [libc] Export GPU extensions to libc for external use
The GPU port of the LLVM C library needs to export a few extensions to
the interface such that users can interface with it. This patch adds the
necessary logic to define a GPU extension. Currently, this only exports
a `rpc_reset_client` function. This allows us to use the server in
D147054 to set up the RPC interface outside of `libc`.

Depends on https://reviews.llvm.org/D147054

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152283
2023-06-15 11:02:24 -05:00
Joseph Huber
719d77ed28 [libc] Begin implementing a library for the RPC server
This patch begins providing a generic static library that wraps around
the raw `rpc.h` interface. As discussed in the corresponding RFC,
https://discourse.llvm.org/t/rfc-libc-exporting-the-rpc-interface-for-the-gpu-libc/71030,
we want to begin exporting RPC services to external users. In order to
do this we decided to not expose the `rpc.h` header by wrapping around
its functionality. This is done with a C-interface as we make heavy use
of callbacks and allows us to provide a predictable interface.

Reviewed By: JonChesterfield, sivachandra

Differential Revision: https://reviews.llvm.org/D147054
2023-06-15 11:02:23 -05:00
Joseph Huber
fd14f7adbe [libc] Enable conversion functions on the GPU
These functions were previously removed due to problems running the
tests with `errno` in them. This was resolved previously by making the
internal implementation of these functions use a global `errno` so that
tests can still use `errno` functionality as long as they are run with a
single thread. This allows us to re-enable these tests as a previous
patch has also resolved the issue where the `stdlib` tests could not be
hermetic due to the dependence on system rounding functions.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D153016
2023-06-15 09:38:12 -05:00
Joseph Huber
a09bec6459 [libc] Move the definitions of the standard IO streams to the platform
This patch moves the definitions of the standard IO streams to the
platform file definition. This is necessary because previously we had a
level of indirection where the stream's `FILE *` was initialized based
on the pointer to the internal `__llvm_libc` version. This cannot be
resolved ahead of time by the linker because the address will not be
known until runtime. This caused the previous implementation to emit a
global constructor to initialize the pointer to the actual `FILE *`. By
moving these definitions so that we can bind their address to the
original file type we can avoid this global constructor.

This file keeps the entrypoints, but makes them empty files only
containing an external reference. This is so they still appear as
entrypoints and get emitted as declarations in the generated headers.

Reviewed By: lntue, sivachandra

Differential Revision: https://reviews.llvm.org/D152983
2023-06-15 07:06:43 -05:00
Joseph Huber
505829eacf [libc][obvious] Fix the FMA implementation on the GPU
Summary:
This doesn't include the type_traits to perform the indirection, nor
does it return the value.
2023-06-14 13:33:25 -05:00
Joseph Huber
f205fbbb01 [libc] Add support for FMA in the GPU utilities
This adds the generic FMA utilities for the GPU. We implement these
through the builtins which map to the FMA instructions in the ISA. These
may not have strict compliance with other assumptions in the `libc`
such as rounding modes. I've included the relevant information on how
the GPU vendors map the behaviour. This should help make it easier to
implement some future generic versions.

Depends on D152486

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152923
2023-06-14 12:59:18 -05:00
Joseph Huber
8060d96aed [libc] Begin implementing a 'libmgpu.a' for math on the GPU
This patch adds an outline to begin adding a `libmgpu.a` file for
providing math on the GPU. Currently, this is most likely going to be
wrapping around existing vendor libraries and placing them in a more
usable format. Long term, we would like to provide our own
implementations of math functions that can be used instead.

This patch works by simply forwarding the calls to the standard C math
library calls like `sin` to the appropriate vendor call like `__nv_sin`.
Currently, we will use the vendor libraries directly and link them in
via `-mlink-builtin-bitcode`. This is necessary because of bizarre
interactions with the generic bitcode: `-mlink-builtin-bitcode`
internalizes and only links in the used symbols. Furthermore, it
propagates the target's default attributes, and it's the only "truly"
correct way to pull in these vendor bitcode libraries without error.

If the vendor libraries are not available at build time, we will still
create the `libmgpu.a`, but we will expect that the vendor library
definitions will be provided by the user's compilation as is made
possible by https://reviews.llvm.org/D152442.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152486
2023-06-14 12:59:15 -05:00
Tue Ly
53d4057622 [libc] Fix merging issue with test/src/math/exhaustive/expm1f_test
2023-06-14 11:00:13 -04:00
Tue Ly
055be3c30c [libc] Enable hermetic floating point tests again.
Fix an issue where LLVM libc's fenv.h defined rounding mode macros
differently from the system libc, making get_round() return different values
from fegetround().  Also let math tests skip rounding modes that cannot be
set.  This should allow math tests to be run on platforms on which fenv.h is
not implemented yet.

This allows us to re-enable hermetic floating point tests in
https://reviews.llvm.org/D151123 and revert https://reviews.llvm.org/D152742.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D152873
2023-06-14 10:53:35 -04:00
Alex Brachet
10e7b451ad [libc][NFC] Fix some issues with LIBC_INLINE
We define LIBC_INLINE to include [[clang::internal_linkage]], and these
must appear before other specifiers. Additionally, there was also a
missing cast that was causing warnings.

Differential Revision: https://reviews.llvm.org/D152865
2023-06-14 14:09:11 +00:00
Guillaume Chatelet
9902fc8dad [libc] Enable custom logging in LibcTest
This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.
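
One way such streaming can be made to work is the dangling-else pattern sketched below. `SKETCH_EXPECT_TRUE`, `FailureMessage` and `failure_log` are hypothetical names; this is not the LibcTest implementation and it relies on `<sstream>`, which the real framework may avoid.

```cpp
#include <sstream>
#include <string>

// Collected failure text; a real framework would print it instead.
std::string failure_log;

// Temporary that gathers the custom message and records it when destroyed.
class FailureMessage {
  std::ostringstream stream;

public:
  explicit FailureMessage(const char *expression) {
    stream << "FAILED: " << expression << " ";
  }
  ~FailureMessage() { failure_log += stream.str() + "\n"; }

  template <typename T> FailureMessage &operator<<(const T &value) {
    stream << value;
    return *this;
  }
};

// On success the else branch, including the streamed message, is skipped.
#define SKETCH_EXPECT_TRUE(cond)                                               \
  if (cond) {                                                                  \
  } else                                                                       \
    FailureMessage(#cond)
```

Usage then reads like Google Test: `SKETCH_EXPECT_TRUE(x == y) << "x was " << x;`.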

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630
2023-06-14 13:37:50 +00:00
Guillaume Chatelet
bdb07c98c4 Revert D152630 "[libc] Enable custom logging in LibcTest"
Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707
This reverts commit 9a7b4c934893d6bc571e1ce8efab2127ae5f4e45.
2023-06-14 10:31:49 +00:00
Guillaume Chatelet
9a7b4c9348 [libc] Enable custom logging in LibcTest
This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630
2023-06-14 10:26:18 +00:00
Guillaume Chatelet
2cfae7cdf4 [libc] Dispatch memmove to memcpy when buffers are disjoint
Most of the time `memmove` is called on buffers that are disjoint, in that case we can use `memcpy` which is faster.
The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip).
On x86 this patch adds a latency of 2 to 3 cycles.

Before
```
--------------------------------------------------------------------------------
Benchmark                      Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------
BM_Memmove/0/0_median       5.00 ns         5.00 ns           10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       6.21 ns         6.21 ns           10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       8.09 ns         8.09 ns           10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       5.95 ns         5.95 ns           10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       5.63 ns         5.63 ns           10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       5.68 ns         5.68 ns           10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       7.46 ns         7.46 ns           10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       5.40 ns         5.40 ns           10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       5.62 ns         5.62 ns           10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median        101 ns          101 ns           10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096
```
After
```
BM_Memmove/0/0_median       3.57 ns         3.57 ns           10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       4.52 ns         4.52 ns           10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       5.70 ns         5.70 ns           10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       4.47 ns         4.47 ns           10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       4.53 ns         4.53 ns           10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       4.19 ns         4.19 ns           10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       5.02 ns         5.02 ns           10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       4.03 ns         4.03 ns           10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       4.70 ns         4.70 ns           10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median       90.7 ns         90.7 ns           10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096
```
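
For reference, the disjointness dispatch described at the top of this message can be sketched as follows. `is_disjoint_sketch` and `memmove_sketch` are hypothetical names, the standard `memcpy`/`memmove` stand in for the internal implementations, and whether the check compiles to branch-free code depends on the target and compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Two ranges of `size` bytes are disjoint when their start addresses are at
// least `size` bytes apart.
inline bool is_disjoint_sketch(const void *a, const void *b, std::size_t size) {
  const auto pa = reinterpret_cast<std::uintptr_t>(a);
  const auto pb = reinterpret_cast<std::uintptr_t>(b);
  const std::uintptr_t distance = pa > pb ? pa - pb : pb - pa;
  return distance >= size;
}

inline void *memmove_sketch(void *dst, const void *src, std::size_t size) {
  // Fast path: no overlap, so the simpler forward copy is safe.
  if (is_disjoint_sketch(dst, src, size))
    return std::memcpy(dst, src, size);
  // Slow path: overlapping buffers need the overlap-aware copy.
  return std::memmove(dst, src, size);
}
```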

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D152811
2023-06-14 08:29:15 +00:00
Tue Ly
1557256ab0 [libc] Add Int<> type and fix (U)Int<128> compatibility issues.
Add Int<> and Int128 types to replace the usage of __int128_t in math
functions.  Clean up to make sure that (U)Int128 and __(u)int128_t are
interchangeable in the code base.

Reviewed By: sivachandra, mikhail.ramalho

Differential Revision: https://reviews.llvm.org/D152459
2023-06-13 09:40:48 -04:00
Joseph Huber
746e72910f [libc] Fix floating point test failing to build on the GPU
A patch enabled this test, which uses `add_fp_unittest`.
Unfortunately we do not support these on the GPU because it attempts to
link in the floating point utils which are not built supporting
hermetic tests. This was attempted to be fixed in D151123 but that had
to be reverted. For now disable these so the tests pass.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D152742
2023-06-12 15:06:33 -05:00
Michael Jones
99686c5ed1 [libc][docs] Add undefined behavior doc to site
This document is based on the RFC posted to discourse:
https://discourse.llvm.org/t/rfc-defining-undefined-behavior-in-libc/

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152588
2023-06-12 11:13:51 -07:00
Michael Jones
d3074f16a6 [libc] Add qsort_r
This patch adds the reentrant qsort entrypoint, qsort_r. This is done by
extending the qsort functionality and moving it to a shared utility
header. For this reason the qsort_r tests focus mostly on the places
where it differs from qsort, since they share the same sorting code.
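
The idea of sharing one sorting routine between both entrypoints can be sketched like this. All names ending in `_sketch`, the `ComparatorSketch` wrapper and the insertion sort are illustrative assumptions; the patch's shared header may be organized differently and use another algorithm.

```cpp
#include <cstddef>

using Compare = int (*)(const void *, const void *);
using CompareWithState = int (*)(const void *, const void *, void *);

// Wraps either comparator flavor so the sort itself is written once.
class ComparatorSketch {
  Compare plain = nullptr;
  CompareWithState with_state = nullptr;
  void *state = nullptr;

public:
  explicit ComparatorSketch(Compare c) : plain(c) {}
  ComparatorSketch(CompareWithState c, void *s) : with_state(c), state(s) {}

  int operator()(const void *a, const void *b) const {
    return plain ? plain(a, b) : with_state(a, b, state);
  }
};

static void swap_bytes(unsigned char *a, unsigned char *b, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    const unsigned char tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }
}

// Shared sorting code (a simple insertion sort keeps the sketch short).
static void sort_sketch(void *base, std::size_t count, std::size_t size,
                        const ComparatorSketch &cmp) {
  auto *bytes = static_cast<unsigned char *>(base);
  for (std::size_t i = 1; i < count; ++i)
    for (std::size_t j = i;
         j > 0 && cmp(bytes + (j - 1) * size, bytes + j * size) > 0; --j)
      swap_bytes(bytes + (j - 1) * size, bytes + j * size, size);
}

void qsort_sketch(void *base, std::size_t count, std::size_t size, Compare c) {
  sort_sketch(base, count, size, ComparatorSketch(c));
}

void qsort_r_sketch(void *base, std::size_t count, std::size_t size,
                    CompareWithState c, void *state) {
  sort_sketch(base, count, size, ComparatorSketch(c, state));
}

// Example stateful comparator: counts how often it is called.
int compare_ints_counting(const void *a, const void *b, void *state) {
  ++*static_cast<int *>(state);
  const int lhs = *static_cast<const int *>(a);
  const int rhs = *static_cast<const int *>(b);
  return (lhs > rhs) - (lhs < rhs);
}
```

The extra `void *` argument is what lets the comparator carry state without globals, which is the only place the two entrypoints differ.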

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152467
2023-06-12 11:12:17 -07:00
Alfred Persson Forsberg
08da9ceb64 [libc] Fix argument types for {f,}truncate specs
The argument types are currently switched around for ftruncate
and truncate. This currently passes tests because the internal definitions
inside the __llvm_libc namespace are fine.

Reviewed By: michaelrj, thesamesam, sivachandra

Differential Revision: https://reviews.llvm.org/D152664
2023-06-12 18:00:35 +01:00
Guillaume Chatelet
5e32765c15 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.
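
As a rough illustration of a branchless three-way compare on 'uint32_t', consider the sketch below. The names are hypothetical and the byte-wise big-endian load is a portable stand-in; the patch may load and compare differently.

```cpp
#include <cstdint>

// Loads four bytes as a big-endian integer so that integer order matches
// lexicographic byte order.
inline std::uint32_t load_be32_sketch(const unsigned char *p) {
  return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
         (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Returns -1, 0 or 1 without an explicit branch in the source.
inline int cmp_u32_sketch(const void *lhs, const void *rhs) {
  const std::uint32_t a =
      load_be32_sketch(static_cast<const unsigned char *>(lhs));
  const std::uint32_t b =
      load_be32_sketch(static_cast<const unsigned char *>(rhs));
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}
```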

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 13:47:16 +00:00
Tue Ly
a982431295 [libc] Add platform independent floating point rounding mode checks.
Many math functions need to check for floating point rounding modes to
return correct values.  Currently most of them use the internal implementation
of `fegetround`, which is platform-dependent and blocks math functions from being
enabled on platforms with unimplemented `fegetround`.  In this change, we add
platform independent rounding mode checks and switch math functions to use
them instead. https://github.com/llvm/llvm-project/issues/63016
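
One way to detect the rounding mode without `fegetround` is to observe how a few float operations round. The sketch below uses the hypothetical name `rounding_mode_sketch` and assumes IEEE-754 binary32 arithmetic evaluated in `float` precision; it is not necessarily the check added by this change.

```cpp
#include <cfenv>

// Infers the current rounding mode from arithmetic results alone.
int rounding_mode_sketch() {
  // volatile keeps the compiler from folding the arithmetic at build time.
  static volatile float tiny = 0x1.0p-24f;
  const float y = tiny;
  // Both sums below are exact ties between two floats, so each rounding
  // mode resolves them differently.
  const float z = (0x1.000002p0f + y) + (-1.0f - y);
  if (z == 0.0f)
    return FE_DOWNWARD;
  if (z == 0x1.0p-23f)
    return FE_TOWARDZERO;
  // Round-to-nearest drops the quarter-ulp increment; upward keeps it.
  return (2.0f + y == 2.0f) ? FE_TONEAREST : FE_UPWARD;
}
```

The result uses the `FE_*` macros from `<cfenv>` only for readability; a platform without `fenv.h` would return its own enumeration.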

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152280
2023-06-12 09:36:41 -04:00
Guillaume Chatelet
1ec995cc1c Revert D148717 "[libc] Improve memcmp latency and codegen"
This broke aarch64 debug buildbot https://lab.llvm.org/buildbot/#/builders/223/builds/21703
This reverts commit bd4f978754758d5ef29d1f10370f45362da3de37.
2023-06-12 08:32:00 +00:00