We previously codegened directly to v_log_f32, which is broken for
denormals. The lowering isn't complicated: scale denormal inputs and
adjust the result. Note that log and log10 are still not accurate
enough and will be fixed separately.
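A minimal sketch of the scale-and-adjust idea in plain C++, not the actual backend lowering. The helper names and the 2^32 scale factor are assumptions made for illustration; the stand-in "hardware" function only models an instruction that treats denormal inputs as zero.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a hardware log2 that flushes denormal inputs
// to zero. It only models the failure mode described above.
float hwLog2FlushingDenormals(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    x = 0.0f; // The modeled instruction sees a denormal as zero.
  return std::log2(x);
}

// log2 that pre-scales positive denormal inputs into the normal range and
// then subtracts the exponent of the scale: log2(x * 2^32) - 32 == log2(x).
float log2WithDenormalScaling(float x) {
  const bool isDenormal =
      x > 0.0f && x < std::numeric_limits<float>::min();
  // Scaling by a power of two is exact, so no precision is lost here.
  const float scaled = isDenormal ? std::ldexp(x, 32) : x;
  const float result = hwLog2FlushingDenormals(scaled);
  return isDenormal ? result - 32.0f : result;
}
```

Normal inputs take the unscaled path, so the adjustment only costs a compare and two selects around the underlying instruction.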
This adds operations for binary additive operators to EmitC. The input
arguments to these ops can be EmitC pointers and thus the operations can
be used for pointer arithmetic.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D149963
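For reference, these are the pointer forms that the additive operators have in C and C++. This is a sketch of the target-language semantics only; it shows neither EmitC op syntax nor emitter output, and the function names are made up for illustration.

```cpp
#include <cstddef>

// Pointer + integer: advances by `n` elements, not bytes.
int *advance(int *base, std::ptrdiff_t n) { return base + n; }

// Pointer - integer: steps back by `n` elements.
int *retreat(int *p, std::ptrdiff_t n) { return p - n; }

// Pointer - pointer: number of elements between two pointers into the
// same array.
std::ptrdiff_t elementDistance(const int *first, const int *last) {
  return last - first;
}
```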
Now that MainFileMacros preserves enough information, we perform a just-in-time
conversion to interop with include-cleaner::Macro for include-cleaner features.
Differential Revision: https://reviews.llvm.org/D147034
Now that we have proper support for optional operands, the standard LLVM
machinery can take care of converting parsed instructions to MCInsts.
There are likely more cases where the conversion can be done
automatically, probably with some additional treatment. The plan is to
address them separately.
Part of <https://github.com/llvm/llvm-project/issues/62629>.
Reviewed By: arsenm, foad
Differential Revision: https://reviews.llvm.org/D153565
The DataLayout alloca address space is the address space that should
be used when creating new allocas. However, not all allocas are
required to be in this address space. The isKnownNonZero() check
should work on the actual address space of the alloca, not the
default alloca address space.
When implicitly defining a function in C, we would try to find an
appropriate declaration context for the function to be declared within.
However, we did not account for GNU statement expressions, which
masquerade as a compound statement and can be used in other contexts
such as within structure member declarations.
Fixes https://github.com/llvm/llvm-project/issues/48579
A common user mistake is specifying a target of aarch64-none-eabi or
arm-none-elf whereas the correct names are aarch64-none-elf &
arm-none-eabi. Currently if a target of aarch64-none-eabi is specified
then the Generic_ELF toolchain is used, unlike aarch64-none-elf which
will use the BareMetal toolchain. This is unlikely to be intended by the
user so issue a warning that the target is invalid.
The target parser is liberal in what input it accepts so invalid triples
may yield behaviour that's sufficiently close to what the user intended.
Therefore invalid triples were used in many tests. This change updates
those tests to use valid triples.
One test (gnu-mcount.c) relies on the Generic_ELF toolchain behaviour so
change it to explicitly specify aarch64-unknown-none-gnu as the target.
Reviewed By: peter.smith, DavidSpickett
Differential Revision: https://reviews.llvm.org/D153430
The current instruction's pointer operand may be different from the one
specified in the Operands argument. We should use the pointer operand
from here instead in case the user has transformed it.
This manifested itself somewhere down the line in
https://reviews.llvm.org/D149889, but I haven't been able to create a
test case on its own yet unfortunately.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D153574
The object size and alignment based restriction on the possible
allocation range also applies to allocas, not just globals, so
handle them as well.
We shouldn't really need any type restriction here at all, but
for now stay conservative.
GNU addr2line exits immediately if it cannot open the file specified as
executable/relocatable. In contrast, llvm-addr2line does not exit and, if
addresses are not specified on the command line, waits for input on stdin. This
causes the test compiler-rt/test/asan/TestCases/Posix/asan-symbolize-bad-path.cc to block
forever on Gentoo (see https://reviews.llvm.org/rG27c4777f41d2ab204c1cf84ff1cccd5ba41354da#1190273).
To fix this issue, llvm-addr2line now exits if the
executable/relocatable file cannot be found.
It fixes https://github.com/llvm/llvm-project/issues/42099 (llvm-addr2line
does not exit when passed a non-existent file).
Differential Revision: https://reviews.llvm.org/D147652
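The control flow described above can be sketched as follows. This is not the llvm-symbolizer source; the enum, function names, and the use of std::ifstream to probe the file are assumptions for illustration.

```cpp
#include <fstream>
#include <string>

// Returns true when `path` can be opened for reading.
bool canOpenInputFile(const std::string &path) {
  std::ifstream file(path, std::ios::binary);
  return file.is_open();
}

// Hypothetical driver decision made before any address is read.
enum class Action { ExitWithError, ReadAddressesFromStdin, SymbolizeArguments };

Action chooseAction(const std::string &objectPath, bool haveAddressArgs) {
  // Check the file first, so a bad path never leaves the tool blocked
  // waiting on stdin.
  if (!canOpenInputFile(objectPath))
    return Action::ExitWithError;
  return haveAddressArgs ? Action::SymbolizeArguments
                         : Action::ReadAddressesFromStdin;
}
```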
Adds the capability to recognize SelectInst that appear in the IR.
These instructions are generated during scalable vectorization for reduction
and when the code contains conditions inside the loop body or when
"-prefer-predicate-over-epilogue=predicate-dont-vectorize" is set.
Differential Revision: https://reviews.llvm.org/D152558
This reverts commit ab09654832dba5cef8baa6400fdfd3e4d1495624.
Reason: Reapplying after removing unnecessary default case in switch expression.
[ARM] generate armv6m eXecute Only (XO) code for immediates, globals
Previously eXecute Only (XO) support was implemented for targets that support
MOVW/MOVT (~armv7+). See: https://reviews.llvm.org/D27449
XO prevents the compiler from generating data accesses to code sections. This
patch implements XO codegen for armv6-M, which does not support MOVW/MOVT, and
must resort to the following general pattern to avoid loads:
movs r3, :upper8_15:foo
lsls r3, #8
adds r3, :upper0_7:foo
lsls r3, #8
adds r3, :lower8_15:foo
lsls r3, #8
adds r3, :lower0_7:foo
ldr r3, [r3]
This is equivalent to the code pattern generated by GCC.
The above relocations are new to LLVM and have been implemented in a parent
patch: https://reviews.llvm.org/D149443.
This patch limits itself to implementing codegen for this pattern and enabling
XO for armv6-M in the backend.
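The arithmetic behind the movs/lsls/adds sequence can be checked with a small C++ model. It assumes the four relocation specifiers select bytes 3 down to 0 of the symbol address, in that order; the function name is hypothetical and the model ignores the final load.

```cpp
#include <cstdint>

// Rebuilds a 32-bit address one byte at a time, mirroring the
// instruction sequence above.
std::uint32_t materializeAddress(std::uint32_t address) {
  // Assumed meaning of the relocation specifiers.
  const std::uint32_t upper8_15 = (address >> 24) & 0xFF;
  const std::uint32_t upper0_7 = (address >> 16) & 0xFF;
  const std::uint32_t lower8_15 = (address >> 8) & 0xFF;
  const std::uint32_t lower0_7 = address & 0xFF;

  std::uint32_t r3 = upper8_15; // movs r3, :upper8_15:foo
  r3 <<= 8;                     // lsls r3, #8
  r3 += upper0_7;               // adds r3, :upper0_7:foo
  r3 <<= 8;                     // lsls r3, #8
  r3 += lower8_15;              // adds r3, :lower8_15:foo
  r3 <<= 8;                     // lsls r3, #8
  r3 += lower0_7;               // adds r3, :lower0_7:foo
  return r3;
}
```

Each immediate fits in 8 bits, so the address is formed without reading a literal from the code section.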
Separate patches will follow for:
- switch tables
- replacing specific loads from constant islands which are spread out over the
ARM backend codebase. Amongst others: FastISel, call lowering, stack frames.
Reviewed By: john.brawn
Differential Revision: https://reviews.llvm.org/D152795
Accidentally copy-pasted them into the .cpp while refactoring the file in D151432
Those functions are currently only used in the .cpp so it didn't cause an issue, but it causes an undefined reference if another file attempts to use them.
This patch removes DAG combines that are no longer relevant
because equivalent IR combines have been added.
Differential Revision: https://reviews.llvm.org/D153445
The corresponding function definition was removed by:
commit 773d663e4729f55d23cb04f78a9d003643f2cb37
Author: Arthur Eubanks <aeubanks@google.com>
Date: Mon Feb 27 19:00:37 2023 -0800
If a value is already the last element of the worklist, then I think we don't have to add it again, since it does not need to be processed repeatedly.
For some long Triton-generated LLVM IR, this can cause a ~100x speedup.
Differential Revision: https://reviews.llvm.org/D153561
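A minimal sketch of the check, using std::vector as the worklist. The helper name is hypothetical and the real worklist type may differ.

```cpp
#include <vector>

// Pushes `value` unless it is already the last element of the worklist.
// Returns true when the value was actually appended.
template <typename T>
bool pushUnlessLast(std::vector<T> &worklist, const T &value) {
  if (!worklist.empty() && worklist.back() == value)
    return false; // Already queued as the next item to process.
  worklist.push_back(value);
  return true;
}
```

The check only looks at the back of the list, so it costs O(1) per push and does not deduplicate values that appear earlier in the worklist.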
Wrapping a warning into a silenceable failure will result in the warning
being interpreted as an error, which it is not.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D153546
When exiting the scope of a region attached to a transform op, clean up
the handle invalidation checks associated with handles defined in this
region. Otherwise, these checks may trigger on the next entry to the
region while there is no incorrect usage.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D153545
When the sign of either of the operands is known, it is possible to
determine what the saturating value will be without having to compute it
using the sign bits.
Differential Revision: https://reviews.llvm.org/D153575
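The idea can be illustrated for a 32-bit signed saturating add. This is a sketch with made-up function names, not the combine itself: when one operand is known to be non-negative, overflow can only happen in the positive direction, so the saturating value is a constant.

```cpp
#include <cstdint>
#include <limits>

// Generic form: the saturation bound depends on the direction of the
// overflow, which is only known at run time.
std::int32_t saddSat(std::int32_t a, std::int32_t b) {
  const std::int64_t wide = std::int64_t(a) + std::int64_t(b);
  if (wide > std::numeric_limits<std::int32_t>::max())
    return std::numeric_limits<std::int32_t>::max();
  if (wide < std::numeric_limits<std::int32_t>::min())
    return std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(wide);
}

// Precondition (assumed, not checked): b >= 0. The sum can then only
// overflow upwards, so the saturating value is always INT32_MAX.
std::int32_t saddSatKnownNonNegative(std::int32_t a, std::int32_t b) {
  const std::int32_t maxValue = std::numeric_limits<std::int32_t>::max();
  if (a > maxValue - b) // maxValue - b cannot overflow because b >= 0.
    return maxValue;
  return a + b;
}
```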
This makes the bytecode reader/writer work on big-endian platforms.
The only problem was related to encoding of multi-byte integers,
where both reader and writer code make implicit assumptions about
endianness of the host platform.
This fixes the current test failures on s390x, and in addition allows
removing the UNSUPPORTED markers from all other bytecode-related
test cases, which now also pass on s390x.
Also adding a GFAIL_SKIP to the MultiModuleWithResource unit test,
as this still fails due to an unrelated endian bug regarding
decoding of external resources.
Differential Revision: https://reviews.llvm.org/D153567
Reviewed By: mehdi_amini, jpienaar, rriddle
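One common way to make multi-byte integer encoding independent of the host is to assemble the bytes with shifts instead of copying memory. The sketch below assumes a little-endian on-disk byte order and uses hypothetical helper names; it is not the reader or writer code from this change.

```cpp
#include <cstddef>
#include <cstdint>

// Writes `value` as 8 bytes, least significant byte first, regardless of
// the host byte order.
void writeLittleEndian64(std::uint64_t value, unsigned char *out) {
  for (std::size_t i = 0; i < 8; ++i)
    out[i] = static_cast<unsigned char>(value >> (8 * i));
}

// Reads 8 bytes written by writeLittleEndian64 back into an integer.
std::uint64_t readLittleEndian64(const unsigned char *in) {
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(in[i]) << (8 * i);
  return value;
}
```

Because only shifts and masks are used, the same bytes are produced on little-endian and big-endian hosts.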
Use hlfir::loadTrivialScalars to dereference pointers and allocatables, and
to load numerical and logical scalars.
This has a small fallout on tests:
- load is done on the HLFIR entity (#0 of hlfir.declare) and not the FIR one (#1). This makes no difference at the FIR level (#1 and #0 only differ to account for assumed and explicit shape lower bounds).
- loadTrivialScalars gets rid of the allocatable fir.box for monomorphic scalars
(it is not needed). This exposed a bug in lowering of MERGE with
a polymorphic and a monomorphic argument: when the monomorphic is not
a fir.box, the polymorphic fir.class should not be reboxed but its
address should be read.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D153252
- The AST of the function we're currently analyzing
- The CFG
- The CFG element we're currently processing
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D153549
The declaration was added without a corresponding class definition by:
commit 13bb8f491a1cb429226768cfd4ca6bcea3b938dd
Author: Stella Laurenzo <laurenzo@google.com>
Date: Wed Apr 3 11:16:32 2019 -0700
The declaration was added without a corresponding class definition by:
commit a84064bcda1a737658d33e96ca58516d01af70a6
Author: Florian Hahn <flo@fhahn.com>
Date: Wed Dec 21 22:02:31 2022 +0000
It is most likely a misspelling of PredicatedScalarEvolution.
The declaration was added without a corresponding function by:
commit cc3bb85580189d4a004cfd9bd2d6286cd1c1169f
Author: James Nagurne <j-nagurne@ti.com>
Date: Fri Oct 22 17:08:16 2021 -0500
Instead of dumping all sources into RTXray object library with a weird
special case for x86, handle multiarch builds better. Build a separate
object library for each arch with its arch-specific sources, then link
in all those libraries.
This fixes the build on platforms that produce fat binaries, such as new
macOS which expects both x86_64 and aarch64 objects in the same library
since Apple Silicon is a thing.
This only enables building XRay support for Apple Silicon. It does not
actually work yet on macOS, neither on Intel nor on Apple Silicon CPUs.
Thus the tests are still disabled.
Reviewed By: MaskRay, phosek
Differential Revision: https://reviews.llvm.org/D153221
CMake plumbing cargo culted from other tests.
Minor changes to Process to allow statically allocating a buffer.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D153594
`TargetGlobalTLSAddress` is not considered and handled correctly when matching addressing mode, which leads to an incorrect result of instruction selection.
Fixes #63162.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D153103