Summary:
This patch simply adds the necessary config to enable qsort and bsearch
on the GPU. It is *highly* unlikely that anyone will use these, as they
are single threaded, but we may as well support all entrypoints that we
can.
We don't have to set or clear the Thumb bit in relocation fixup values.
It's not part of the branch range and the respective encoding functions
like encodeImmBT4BlT1BlxT2() shift out the least significant bit anyway.
This was a leftover from the initial patch before we switched to store
Thumb state in target-flags with D146641.
This reverts commit 2ca4d13612.
Also revert the followup, "[InlineAsm] fix botched merge conflict resolution"
This reverts commit 8b9bf3a9f7.
There were SystemZ and Mips build errors, too many to fix forward.
This partially reverts commit e30a148b09, which removed the base
template for std::char_traits. That base template had been marked as
deprecated since LLVM 16 and we were planning to remove it in LLVM 18.
However, as explained in the post-commit comments in
https://reviews.llvm.org/D157058, the deprecation mechanism didn't work
as expected. Basically, the deprecation warnings were never shown to
users since libc++ headers are system headers and Clang doesn't show
warnings in system headers.
As a result, this removal came with basically no lead time as far as
users are concerned, which is a poor experience. For this reason, I am
re-introducing the deprecated char_traits specialization until we have a
proper way of phasing it out in a way that is not a surprise for users.
A splat index means the operation is reading from (writing to) the same
memory location. Generally, zero is the cheapest value to splat. As
such, we'd prefer to add the splatted value to the base, and use a
constant zero as the index operand.
Currently, the VectorToLLVM patterns are built into a library along
with the corresponding pass, which also pulls in all the
platform-specific vector dialects (like AMXDialect) to apply all the
vector to LLVM conversions.
This causes dependency bloat when writing libraries - for example the
GPU to LLVM passes, which use the vector to LLVM patterns, don't need
the X86Vector dialect to be present at all.
This commit partitions the library into VectorToLLVM and
VectorToLLVMPass, where the latter pulls in all the other vector
transformations.
Reviewed By: nicolasvasilache, mehdi_amini
Differential Revision: https://reviews.llvm.org/D158287
Similar to
commit 2fad6e6985 ("[InlineAsm] wrap Kind in enum class NFC")
Fix the TODOs added in
commit 93bd428742 ("[InlineAsm] refactor InlineAsm class NFC
(#65649)")
This is part of a libc wide CMake cleanup which aims to eliminate
certain explicitly duplicated logic which is available in CMake-3.20.
This change in particular makes the entrypoint aliases real library
targets so that they can be treated as normal library targets by other
libc build rules.
- Added WritableArmRelocation and ArmRelocation Structs
- Encode/Decode funcs for B/BL A1 and BLX A2 encodings
- Add ARM helper functions, consistent with the existing Thumb helper functions
- Add Test for ELF::R_ARM_CALL
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D157533
Dictionary literal keys and strings in TypeScript type declarations can
not be broken.
The problem was pointed out by @alexfh and @e-kud here:
https://reviews.llvm.org/D154093#4644512
The new ACLE PR#225[1] now combines the slice parameters for some
builtins.
Slice specifies the ZA slice number directly and needs to be explicity
implemented by the "user" with the base register plus the immediate
offset
[1]https://github.com/ARM-software/acle/pull/225/files
This patch changes the size of time_t to be an int64_t. This still
follows the POSIX standard which only requires time_t to be an integer.
Making time_t a 64-bit integer also fixes two cases in 32 bits platforms
that use SYS_clock_nanosleep_time64 and SYS_clock_gettime64, as the name
of these calls implies, they require a 64-bit time_t. For instance, in rv32,
the 32-bit version of these syscalls is not available.
We also follow glibc here, where time_t is still a 32-bit integer in
arm32.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D159125
Summary:
Currently, there is an assertion that prevents us from emitting an
AMDGPU global with a non-target specific address space (i.e. numerical
attribute). I'm unsure what the original intentions of this assertion
were, but we should be able to use OpenCL address spaces when compiling
directly to AMDGPU from C++. This is permitted on NVPTX so I'm unsure
what this assertion is guarding. The patch simply removes the assertion
and adds a test to ensure that these emit the expected address spaces.
Fixes https://github.com/llvm/llvm-project/issues/65069
Summary:
We use the `llvm.amgcn.abi.version` varaible to control code generation.
This is emitted in every module now to indicate what should be used when
compiling. Previously, the logic caused us to emit an external reference
to this variable when creating the code for the `none` type. This would
then cause us not to emit the actual definition. This patch refines the
logic to create the external reference, and then update it if it is
found unset by the time we emit the global. I had to remove the
reference to `GetOrCreateLLVmGlobal` because it did not accept the
proper address space.
We were using a mix of unsigned and signed ints in the various PTX asm
printers. All calls from tablgen use a non-negative immediate, so either
will work, but when doing arithmetic on the return value from
`getNumOperands`, or calling `getOperand`, it makes sense to keep
everything unsigned.
VPIntrinsics with VP_PROPERTY_BINARYOP property should have the ability
to be queried with with VPBinOpIntrinsic::isVPBinOp, similiar to how
intrinsics with the VP_PROPERTY_REDUCTION property can be queried with
VPReductionIntrinsic::isVPReduction.
This will be used in #65706. In that PR the usage of this class is
tested.
This adds a helper method to get the ID of the functionally equivalent
intrinsic, similar to the existing getFunctionalOpcodeForVP and
getConstrainedIntrinsicIDForVP methods.
Not sure if it's notable or not, but I can't find any existing uses of
VP_PROPERTY_FUNCTIONAL_INTRINSIC?
It could potentially be used in #65706 to scalarize VP intrinsics.