llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2025-02-12 04:43:48 +00:00

Author	SHA1	Message	Date
Jon Chesterfield	65a4ce09f8	[libc] Can build amdgpu libc even if rocm is missing Clang defaults to failing to build if it can't find rocm device libs Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D153581	2023-06-22 21:18:44 +01:00
Jon Chesterfield	578d229a1a	[libc] Move fences into outbox/wait-for-ownership test Also moves the wait-until-inbox-changes test into a shared method. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D153573	2023-06-22 18:14:41 +01:00
Jun Zhang	ce378fcb9e	[libc][NFC] Simplify return value logic in set_thread_ptr() Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D153572	2023-06-23 00:47:48 +08:00
Jon Chesterfield	ba01a2c608	[libc] Add memory fences to device-local locking calls This makes the interface less error prone. The acquire was previously forgotten. Release is currently missing if recv() is the last operation made before close. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D153571	2023-06-22 17:46:09 +01:00
Petr Hosek	f3b64887de	[libc] Place headers in the right include directory When LLVM_ENABLE_PER_TARGET_RUNTIME_DIR is enabled, place headers in `include/<target>` directory, otherwise use `include/`. Differential Revision: https://reviews.llvm.org/D152592	2023-06-22 06:22:32 +00:00
Joseph Huber	e0b487bfc0	[libc] Rename and install the RPC server interface This patch prepares the RPC interface to be installed. We place this in the existing `llvm-gpu-none` directory as it will also give us access to the generated `libc` headers for the opcodes. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D153040	2023-06-21 11:26:24 -05:00
Joseph Huber	4272d09196	[libc][NFC] Cleanup the RPC server implementation prior to installing This does some simple cleanup prior to landing the patch to install these. Differential Revision: https://reviews.llvm.org/D153439	2023-06-21 11:14:20 -05:00
Joseph Huber	1f99526d9d	[libc][NFC] Move `__has_builtin` to `LIBC_HAS_BUILTIN` Summary: These should use the common `LIBC_HAS_BUILTIN` even if we will only compile this with `clang`.	2023-06-21 09:50:40 -05:00
Guillaume Chatelet	bd1cba9f4f	Revert D148717 "[libc] Improve memcmp latency and codegen" Once integrated in our codebase the patch triggered a bunch of failing tests. We do not yet understand where the bug is but we revert it to move forward with integration. This reverts commit 5e32765c15ab8df3d2635a2bb5078c5b1d5714d5.	2023-06-21 12:37:14 +00:00
Petr Hosek	9fa7998555	[libc] Support for riscv32 This change adds basic support for baremetal riscv32 configuration. Differential Revision: https://reviews.llvm.org/D152563	2023-06-21 07:11:22 +00:00
Siva Chandra Reddy	75d70b7306	[libc] Make close function of the internal File class cleanup the file object. Before this change, a separate static method named cleanup was used to cleanup the file. Instead, now the close method cleans up the full file object using the platform's close function. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D153377	2023-06-21 05:05:04 +00:00
Joseph Huber	ee6ace27e0	[libc] Remove disabled pass after performance improvement This pass used to cause huge compile time regressions. That has been addressed and it can now be re-added. Differential Revision: https://reviews.llvm.org/D153374	2023-06-20 15:48:02 -05:00
Joseph Huber	964a535bfa	[libc] Remove flexible array and replace with a template Currently the implementation of the RPC interface requires a flexible struct. This caused problems when compiling the RPC server with GCC as would be required if trying to export the RPC server interface. This required that we either move to the `x[1]` workaround or make it a template parameter. While just using `x[1]` would be much less noisy, this is technically undefined behavior. For this reason I elected to use templates. The downside to using templates is that the server code must now be able to handle multiple different types at runtime. I was unable to find a good solution that didn't rely on type erasure so I simply branch off of the given value. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D153304	2023-06-20 15:22:37 -05:00
Mikhail R. Gadelha	a2df87c2b0	[libc] Fix libmath test compilation when using UInt<T> This patch: (1) adds the add_with_carry_const and sub_with_borrow_const constexpr calls to add and sub, respectively. Both add and sub are constexpr calls and were calling the non-constexpr version of add/sub_with_borrow. (2) adds explicit UIntType construct calls in some fp tests. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D150223	2023-06-20 15:41:18 -03:00
Tue Ly	46aa659a32	[libc][math] Improve exp2f performance. Re-organize special cases and add a special case when `|x| < 2^-5`. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153134	2023-06-20 09:34:20 -04:00
Tue Ly	0ae409c0d7	[libc][math] Slightly improve sinhf and coshf performance. Re-order exceptional branches and slightly adjust the evaluation. Depends on https://reviews.llvm.org/D153026 . Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153062	2023-06-20 09:27:28 -04:00
Tue Ly	5dbd5118ec	[libc][math] Improve tanhf performance. Re-order exceptional branches and slightly adjust the evaluation. Performance tested with the CORE-MATH project on AMD EPYC 7B12 (clocks/op) Reciprocal throughputs: ``` --- BEFORE --- $ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 7.794 + 0.102 clc/call; Median-Min = 0.066 clc/call; Max = 8.267 clc/call; [####################] 100 %. (with -msse4.2) Ntrial = 20 ; Min = 10.783 + 0.172 clc/call; Median-Min = 0.144 clc/call; Max = 11.446 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 18.926 + 0.381 clc/call; Median-Min = 0.342 clc/call; Max = 19.623 clc/call; --- AFTER --- $ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 6.598 + 0.085 clc/call; Median-Min = 0.052 clc/call; Max = 6.868 clc/call; [####################] 100 % (with -msse4.2) Ntrial = 20 ; Min = 9.245 + 0.304 clc/call; Median-Min = 0.248 clc/call; Max = 10.675 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 11.724 + 0.440 clc/call; Median-Min = 0.444 clc/call; Max = 12.262 clc/call; ``` Latency: ``` --- BEFORE --- $ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 38.821 + 0.157 clc/call; Median-Min = 0.122 clc/call; Max = 39.539 clc/call; [####################] 100 %. (with -msse4.2) Ntrial = 20 ; Min = 44.767 + 0.766 clc/call; Median-Min = 0.681 clc/call; Max = 45.951 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 55.055 + 1.512 clc/call; Median-Min = 1.571 clc/call; Max = 57.039 clc/call; --- AFTER --- $ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 36.147 + 0.194 clc/call; Median-Min = 0.181 clc/call; Max = 36.536 clc/call; [####################] 100 % (with -msse4.2) Ntrial = 20 ; Min = 40.904 + 0.728 clc/call; Median-Min = 0.557 clc/call; Max = 42.231 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 55.776 + 0.557 clc/call; Median-Min = 0.542 clc/call; Max = 56.551 clc/call; ``` Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153026	2023-06-20 09:25:07 -04:00
Siva Chandra Reddy	21e1651c0c	[libc] Remove the requirement of a platform-flush operation in File abstraction. The libc flush operation is not supposed to trigger a platform level flush operation. See "Notes" on this Linux man page: https://man7.org/linux/man-pages/man3/fflush.3.html Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153182	2023-06-19 18:38:29 +00:00
Joseph Huber	5a8fc41937	[libc] Disable atomic optimizations for `libc` AMDGPU builds Recently the AMDGPU backend automatically enables a pass to optimize atomics. This results in the LTO build taking about 10x longer in all cases. For now we disable this by default as was the case before the patch in D152649. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D153232	2023-06-19 03:25:51 -05:00
Alfred Persson Forsberg	c32ba7d5e0	[libc] [NFC] malloc.h: fix include guard typo Differential Revision: https://reviews.llvm.org/D153231	2023-06-18 23:08:25 +01:00
Joseph Huber	70b1c3999c	[libc][Docs] Add some motivation for the GPU libc This provides some basic motivation behind the GPU libc. Suggestions are welcome. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D152028	2023-06-16 15:19:45 -05:00
Joseph Huber	d663da07e3	[libc][Obvious] Fix problem with the variable used for the jobs Summary: There was an issue with the variable we were using to conditionally set the job number for the GPU.	2023-06-16 14:11:53 -05:00
Joseph Huber	27f326334f	[libc] Add an option to use a job pool for GPU tests Currently the GPU has restrictions on how many tests can be run in parallel due to resource constraints. However, building these tests can take a long time so we want to be able to build them in parallel. This patch introduces the option `LIBC_GPU_TEST_JOBS` which is set to the number of threads to run in parallel. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D153157	2023-06-16 14:06:16 -05:00
Joseph Huber	485e2de6d5	[libc][nfc] Silence two warnings in tests These currently give warnings for unused variables or a default case where everything is covered. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D153137	2023-06-16 12:52:06 -05:00
Joseph Huber	ed34cb2cd7	[libc] Add a test for `fputs` to check using `stdout` and `stderr` This patch adds a test directly for the `fputs` function similar to the existing `puts` test. This lets us know that the default file pointers are functional and the `fputs` interface works. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D152288	2023-06-16 11:01:55 -05:00
Alex Brachet	61c9052cec	[libc] Add LIBC_INLINE_VAR for inline variables These are the only variables I could find that use LIBC_INLINE. Note, these are namespace scoped constexpr so local linkage is implied. inline is useful here to silence clang's unused-const-variable warning. For Fuchsia, the distinction between LIBC_INLINE and LIBC_INLINE_VAR is helpful because we define LIBC_INLINE as `[[gnu::always_inline]] inline` when building with gcc. This isn't meaningful on variables. Alternatively, we could make these variables simply constexpr and also add `[[maybe_unused]]` Reviewed By: sivachandra, mcgrathr Differential Revision: https://reviews.llvm.org/D152951	2023-06-16 15:46:32 +00:00
Joseph Huber	490958b9ea	[libc][obvious] Actually return the value from `malloc` for NVPTX Switching to this interface we neglected to actually write the output from the malloc call to the RPC buffer. Fix this so the tests pass again. Differential Revision: https://reviews.llvm.org/D153069	2023-06-15 15:13:11 -05:00
Joseph Huber	7e8b0c27f2	[libc] Disable the strtod and strtold tests on NVPTX These tests have a single line that fails with a value off-by-one, see https://lab.llvm.org/buildbot/#/builders/46/builds/50055/steps/12/logs/stdio . Disable these for now so we can figure out what the error is later. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D153056	2023-06-15 13:29:42 -05:00
Joseph Huber	dcdfc963d7	[libc] Export GPU extensions to `libc` for external use The GPU port of the LLVM C library needs to export a few extensions to the interface such that users can interface with it. This patch adds the necessary logic to define a GPU extension. Currently, this only exports a `rpc_reset_client` function. This allows us to use the server in D147054 to set up the RPC interface outside of `libc`. Depends on https://reviews.llvm.org/D147054 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152283	2023-06-15 11:02:24 -05:00
Joseph Huber	719d77ed28	[libc] Begin implementing a library for the RPC server This patch begins providing a generic static library that wraps around the raw `rpc.h` interface. As discussed in the corresponding RFC, https://discourse.llvm.org/t/rfc-libc-exporting-the-rpc-interface-for-the-gpu-libc/71030, we want to begin exporting RPC services to external users. In order to do this we decided to not expose the `rpc.h` header by wrapping around its functionality. This is done with a C-interface as we make heavy use of callbacks and allows us to provide a predictable interface. Reviewed By: JonChesterfield, sivachandra Differential Revision: https://reviews.llvm.org/D147054	2023-06-15 11:02:23 -05:00
Joseph Huber	fd14f7adbe	[libc] Enable conversion functions on the GPU These functions were previously removed due to problems running the tests with `errno` in them. This was resolved previously by making the internal implementation of these functions use a global `errno` so that tests can still use `errno` functionality as long as they are run with a single thread. This allows us to re-enable these tests as a previous patch has also resolved the issue where the `stdlib` tests could not be hermetic due to the dependence on system rounding functions. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D153016	2023-06-15 09:38:12 -05:00
Joseph Huber	a09bec6459	[libc] Move the definitions of the standard IO streams to the platform This patch moves the definitions of the standard IO streams to the platform file definition. This is necessary because previously we had a level of indirection where the stream's `FILE *` was initialized based on the pointer to the internal `__llvm_libc` version. This cannot be resolved ahead of time by the linker because the address will not be known until runtime. This caused the previous implementation to emit a global constructor to initialize the pointer to the actual `FILE *`. By moving these definitions so that we can bind their address to the original file type we can avoid this global constructor. This file keeps the entrypoints, but makes them empty files only containing an external reference. This is so they still appear as entrypoints and get emitted as declarations in the generated headers. Reviewed By: lntue, sivachandra Differential Revision: https://reviews.llvm.org/D152983	2023-06-15 07:06:43 -05:00
Joseph Huber	505829eacf	[libc][obvious] Fix the FMA implementation on the GPU Summary: This doesn't include the type_traits to perform the indirection, nor does it return the value.	2023-06-14 13:33:25 -05:00
Joseph Huber	f205fbbb01	[libc] Add support for FMA in the GPU utilities This adds the generic FMA utilities for the GPU. We implement these through the builtins which map to the FMA instructions in the ISA. These may not have strict compliance with other assumptions in the `libc` such as rounding modes. I've included the relevant information on how the GPU vendors map the behaviour. This should help make it easier to implement some future generic versions. Depends on D152486 Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D152923	2023-06-14 12:59:18 -05:00
Joseph Huber	8060d96aed	[libc] Begin implementing a 'libmgpu.a' for math on the GPU This patch adds an outline to begin adding a `libmgpu.a` file for providing math on the GPU. Currently, this is most likely going to be wrapping around existing vendor libraries and placing them in a more usable format. Long term, we would like to provide our own implementations of math functions that can be used instead. This patch works by simply forwarding the calls to the standard C math library calls like `sin` to the appropriate vendor call like `__nv_sin`. Currently, we will use the vendor libraries directly and link them in via `-mlink-builtin-bitcode`. This is necessary because of bizarre interactions with the generic bitcode, `-mlink-builtin-bitcode` internalizes and only links in the used symbols, furthermore it propagates the target's default attributes and it's the only "truly" correct way to pull in these vendor bitcode libraries without error. If the vendor libraries are not available at build time, we will still create the `libmgpu.a`, but we will expect that the vendor library definitions will be provided by the user's compilation as is made possible by https://reviews.llvm.org/D152442. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152486	2023-06-14 12:59:15 -05:00
Tue Ly	53d4057622	[libc] Fix merging issue with test/src/math/exhaustive/expm1f_test	2023-06-14 11:00:13 -04:00
Tue Ly	055be3c30c	[libc] Enable hermetic floating point tests again. Fix an issue where LLVM libc's fenv.h defined rounding mode macros differently from the system libc, making get_round() return different values from fegetround(). Also let math tests skip rounding modes that cannot be set. This should allow math tests to be run on platforms in which fenv.h is not implemented yet. This allows us to re-enable hermetic floating point tests in https://reviews.llvm.org/D151123 and revert https://reviews.llvm.org/D152742. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D152873	2023-06-14 10:53:35 -04:00
Alex Brachet	10e7b451ad	[libc][NFC] Fix some issues with LIBC_INLINE We define LIBC_INLINE to include [[clang::internal_linkage]], and these must appear before other specifiers. Additionally, there was also a missing cast that was causing warnings. Differential Revision: https://reviews.llvm.org/D152865	2023-06-14 14:09:11 +00:00
Guillaume Chatelet	9902fc8dad	[libc] Enable custom logging in LibcTest This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152630	2023-06-14 13:37:50 +00:00
Guillaume Chatelet	bdb07c98c4	Revert D152630 "[libc] Enable custom logging in LibcTest" Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707 This reverts commit 9a7b4c934893d6bc571e1ce8efab2127ae5f4e45.	2023-06-14 10:31:49 +00:00
Guillaume Chatelet	9a7b4c9348	[libc] Enable custom logging in LibcTest This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152630	2023-06-14 10:26:18 +00:00
Guillaume Chatelet	2cfae7cdf4	[libc] Dispatch memmove to memcpy when buffers are disjoint Most of the time `memmove` is called on buffers that are disjoint, in that case we can use `memcpy` which is faster. The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip). On x86 this patch adds a latency of 2 to 3 cycles. Before ``` -------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... -------------------------------------------------------------------------------- BM_Memmove/0/0_median 5.00 ns 5.00 ns 10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A BM_Memmove/1/0_median 6.21 ns 6.21 ns 10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B BM_Memmove/2/0_median 8.09 ns 8.09 ns 10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D BM_Memmove/3/0_median 5.95 ns 5.95 ns 10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L BM_Memmove/4/0_median 5.63 ns 5.63 ns 10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M BM_Memmove/5/0_median 5.68 ns 5.68 ns 10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q BM_Memmove/6/0_median 7.46 ns 7.46 ns 10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S BM_Memmove/7/0_median 5.40 ns 5.40 ns 10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U BM_Memmove/8/0_median 5.62 ns 5.62 ns 10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W BM_Memmove/9/0_median 101 ns 101 ns 10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096 ``` After ``` BM_Memmove/0/0_median 3.57 ns 3.57 ns 10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A BM_Memmove/1/0_median 4.52 ns 4.52 ns 10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B BM_Memmove/2/0_median 5.70 ns 5.70 ns 10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D BM_Memmove/3/0_median 4.47 ns 4.47 ns 10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L BM_Memmove/4/0_median 4.53 ns 4.53 ns 10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M BM_Memmove/5/0_median 4.19 ns 4.19 ns 10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q BM_Memmove/6/0_median 5.02 ns 5.02 ns 10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S BM_Memmove/7/0_median 4.03 ns 4.03 ns 10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U BM_Memmove/8/0_median 4.70 ns 4.70 ns 10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W BM_Memmove/9/0_median 90.7 ns 90.7 ns 10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096 ``` Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D152811	2023-06-14 08:29:15 +00:00
Tue Ly	1557256ab0	[libc] Add Int<> type and fix (U)Int<128> compatibility issues. Add Int<> and Int128 types to replace the usage of __int128_t in math functions. Clean up to make sure that (U)Int128 and __(u)int128_t are interchangeable in the code base. Reviewed By: sivachandra, mikhail.ramalho Differential Revision: https://reviews.llvm.org/D152459	2023-06-13 09:40:48 -04:00
Joseph Huber	746e72910f	[libc] Fix floating point test failing to build on the GPU A patch enabled this test which uses `add_fp_unittest`. Unfortunately we do not support these on the GPU because it attempts to link in the floating point utils which are not built supporting hermetic tests. This was attempted to be fixed in D151123 but that had to be reverted. For now disable these so the tests pass. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D152742	2023-06-12 15:06:33 -05:00
Michael Jones	99686c5ed1	[libc][docs] Add undefined behavior doc to site This document is based on the RFC posted to discourse: https://discourse.llvm.org/t/rfc-defining-undefined-behavior-in-libc/ Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152588	2023-06-12 11:13:51 -07:00
Michael Jones	d3074f16a6	[libc] Add qsort_r This patch adds the reentrant qsort entrypoint, qsort_r. This is done by extending the qsort functionality and moving it to a shared utility header. For this reason the qsort_r tests focus mostly on the places where it differs from qsort, since they share the same sorting code. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152467	2023-06-12 11:12:17 -07:00
Alfred Persson Forsberg	08da9ceb64	[libc] Fix argument types for {f,}truncate specs The argument types are currently switched around for ftruncate and truncate. This currently passes tests because the internal definitions inside the __llvm_libc namespace are fine. Reviewed By: michaelrj, thesamesam, sivachandra Differential Revision: https://reviews.llvm.org/D152664	2023-06-12 18:00:35 +01:00
Guillaume Chatelet	5e32765c15	[libc] Improve memcmp latency and codegen This is based on ideas from @nafi to: - use a branchless version of 'cmp' for 'uint32_t', - completely resolve the lexicographic comparison through vector operations when wide types are available. We also get rid of byte reloads and serializing '__builtin_ctzll'. I did not include the suggestion to replace comparisons of 'uint16_t' with two 'uint8_t' as it did not seem to help the codegen. This can be revisited in subsequent patches. The code has been rewritten to reduce nested function calls, making the job of the inliner easier and preventing harmful code duplication. Reviewed By: nafi3000 Differential Revision: https://reviews.llvm.org/D148717	2023-06-12 13:47:16 +00:00
Tue Ly	a982431295	[libc] Add platform independent floating point rounding mode checks. Many math functions need to check for floating point rounding modes to return correct values. Currently most of them use the internal implementation of `fegetround`, which is platform-dependent and blocks math functions from being enabled on platforms with unimplemented `fegetround`. In this change, we add platform independent rounding mode checks and switch math functions to use them instead. https://github.com/llvm/llvm-project/issues/63016 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152280	2023-06-12 09:36:41 -04:00
Guillaume Chatelet	1ec995cc1c	Revert D148717 "[libc] Improve memcmp latency and codegen" This broke aarch64 debug buildbot https://lab.llvm.org/buildbot/#/builders/223/builds/21703 This reverts commit bd4f978754758d5ef29d1f10370f45362da3de37.	2023-06-12 08:32:00 +00:00

1949 Commits