We don't need to have built all the builtins before building the
runtimes for a particular target, only the builtins for that target.
While I'm here, rename the variable that stores the builtins dep to
something less generic than `deps`, to minimize the chances of
accidentally using a variable with the same name from an outer scope.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D139913
HandlePassthroughUser may sometimes create a new entry for the OffsetInfo of a
user in the OffsetInfoMap. This can invalidate outstanding references into the
map, including the one which needs to be copied into the new entry. This
produces invalid offset info that can trigger assertions.
Fixed this by not using references at this point. The bug was originally
introduced in commit ID 0dc0a441323d41b4860668f38d290579e0de130c.
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D140837
This patch makes SVE intrinsics more useable by gating them on the
target, not by ifdef preprocessor macros. See #56480. This alters the
SVEEmitter for arm_sve.h to remove the #ifdef guards and instead use
TARGET_BUILTIN with the correct features so that the existing "'func'
needs target feature sve" error will be generated when sve is not
present.
The ArchGuard containing defines in the SVEEmitter are changed to
TargetGuard containing target features. In the arm_neon.h emitter there
are both existing ArchGuard ifdefs mixed with new TargetGuard target
feature guards, so the name is change in the SVE too for consistency.
The few functions that are present in arm_sve.h (as opposed to builtin
aliases) have __attribute__((target("sve"))) added. Some of the tests
needed to be rejigged a little, as well as updating the error message,
as the error now happens at a later point.
Differential Revision: https://reviews.llvm.org/D131064
This new option is set to `false` by default. It should be set only in Canonicalizer tests to detect faulty canonicalization patterns. I.e., patterns that prevent the canonicalizer from converging. The canonicalizer should always convergence on such small unit tests that we have in `canonicalize.mlir`.
Two faulty canonicalization patterns were detected and fixed with this change.
Differential Revision: https://reviews.llvm.org/D140873
Currently `FunctionDecl::isReservedGlobalPlacementOperator` will crash
if the function is not an allocation/deallocation function, which is
surprising. Also, its semantics is not consistent with
isReplaceableGlobalAllocationFunction, which will return false if the
function is not an allocation/deallocation function.
This patch make FunctionDecl::isReservedGlobalPlacementOperator not
crash if the function is not an allocation/deallocation function, which
is consistent with isReplaceableGlobalAllocationFunction too.
This commit adds support for importing the magic globals "global_ctors"
and "global_dtors" from LLVM IR to the LLVM IR dialect. The import
fails when these globals have a non-null data pointer, as this can
currently not be represented in the corresponding MLIR operations.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D140877
Move code from SCF to Affine: Add a new helper function `simplifyConstrainedMinMaxOp` to Affine/Analysis/Utils.h. `canonicalizeMinMaxOp` was originally designed for loop peeling, but it is not SCF-specific and can be used to simplify any affine.min/max ops.
Various functions in SCF/Transforms are simplified by dropping unnecessary parameters.
Differential Revision: https://reviews.llvm.org/D140962
AbstractDenseDataFlowAnalysis::visitOperation controls how the dataflow
analysis proceeds around control flow. In particular, conservative
assumptions are made about call operations which can prevent some
analysis from succeeding.
The motivating case for this change is https://reviews.llvm.org/D140415,
for which it is correct and necessary for the lattice to be preserved
after call operations.
Some renaming was necessary to avoid confusion with
DenseDataFlowAnalysis::visitOperation.
AbstractDenseDataFlowAnalysis::visitRegionBranchOperation and
DenseDataFlowAnalysis::visitOperationImpl are also made protected
to allow implementation of AbstractDenseDataFlowAnalysis::visitOperation,
although I did not need these to be virtual.
Differential Revision: https://reviews.llvm.org/D140879
Using decls in header files are special, usually as part of the
public API, the check should not emit warnings on these.
The check already detects unused using-decls which are in the current main
file, but if the main file happens to be a header file, we still
emit warnings, this patch suppresses that.
Differential Revision: https://reviews.llvm.org/D140894
Since we're now requiring C++17, Let's get rid of makeXXX functions like
makeArrayRef, and use deduction guides instead.
This is a first step: Introduce the deduction guide. Following steps
will be a) use them and b) deprecate makeArrayRef.
Apart from codebase modernization, there isn't much benefit from that
move, but I can still mention that it would slightly (probably
negligibly) decrease the number of symbols / debug info, as deduction
guides don't generate new code.
Differential Revision: https://reviews.llvm.org/D140896
The patch also adds expandVPCTLZ and expandVPCTTZ to expand vp.ctlz/cttz nodes
and the cost model of vp.ctlz/cttz.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D140370
Instruction formats:
`movgr2fcsr fcsr, rj`
`movfcsr2gr rd, fcsr`
MOVGR2FCSR modifies the value of the software writable field
corresponding to the FCSR (floating-point control and status
register) `fcsr` according to the value of the lower 32 bits of
the GR (general purpose register) `rj`.
MOVFCSR2GR sign extends the 32-bit value of the FCSR `fcsr`
and writes it into the GR `rd`.
Add "i32 @llvm.loongarch.movfcsr2gr(i32)" intrinsic for MOVFCSR2GR
instruction. The argument is FCSR register number. The return value
is the value in the FCSR.
Add "void @llvm.loongarch.movgr2fcsr(i32, i32)" intrinsic for MOVGR2FCSR
instruction. The first argument is the FCSR number, the second argument
is the value in GR.
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D140685
Add stub Sphinx documentation, with configuration copy-pasted from lld and
index page converted from bolt/README.md.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D140156
Use llvm::reverse instead of `for (auto I = rbegin(), E = rend(); I != E; ++I)`
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D140516
Defaulting to Generic mode doesn't make much sense as the kernel needs
to be prepared for it. SPMD mode is the "native" execution, e.g., for
"bare" kernels. It also is the execution method for constructors and
destructors (as we might otherwise throw an extra warp onto them).
Differential Revision: https://reviews.llvm.org/D140718
This is the default behavior and cuts down on attribute spam.
Probably should also do something to consolidate the option spellings;
printing and parsing it is repeated in at least 3 different places.
In the OpenMP tests, I had to manually delete some metadata check
lines update_cc_test_checks was inserting that included the local
build revision.
1. Minor editorial corrections.
2. Allow different call frames to be associated with different target
architectures in a single thread.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D140646
This is a follow-on to https://reviews.llvm.org/D134073.
The errors in the R600 half were fixed previously in
https://reviews.llvm.org/D134078. Originally, I thought that the fixes
to the AMDGPU half would be tricky, but upon taking another look,
there were only a couple minor issues that needed fixing:
1. Previously, buffer load instructions (`BUFFER_LOAD_*_LDS_*`) were
populating the `vdata` field in the instruction from the `swz`
operand. This was incorrect, but harmless, as when the LDS option is
set, the instruction does not use the vdata field.
2. The `BUFFER_STORE_LDS_DWORD_gfx90a` instruction was populating
`acc` from the `swz` operand, because `acc` was set to `?`. (I believe
that the intent here was to leave the instruction bit as an "unknown
value", but you can't do that except by setting the bits on `Inst`
directly). Also harmless, for the same reason.
Differential Revision: https://reviews.llvm.org/D140918
The scalar move instructions (vmv.s.x, and fvmv.s.f) depend solely on whether the VL is 0 or non-zero. By tracking the fact we only demand the zeroness and not the whole VL value, we can allow changing VL over a scalar move. This helps to eliminate vsetvli toggles.
Differential Revision: https://reviews.llvm.org/D140157
Address several issues involving control flow graph generation and
structured code ops.
- Fix a problem with constructs nested inside unstructured selection
constructs. This is a general problem involving branches that are
implied rather than explicit. It is addressed in the generic genFIR
"wrapper" function that calls individual statement-specific genFIR calls.
- The previous fix requires some compensating changes in IF and DO
construct code lowering.
- Streamline the code to generate explicit DO loop variable updates.
- Fix a problem with the individual detailed genFIR calls made in the
genFIR(SelectTypeConstruct) call.
- Modify control flow graph generation to support the insertion of
deallocation and finalization code when lowering most END <construct>
statements.
No-LBR mode wasn't tested and slipped when mayHaveProfileData was added for
Lite mode. This enables processing of profiles collected without LBR and
converted with `perf2bolt -nl` option.
Test Plan:
bin/llvm-lit -a tools/bolt/test/X86/nolbr.s
https://github.com/rafaelauler/bolt-tests/pull/20
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D140256
When lowering tosa.resize it is possible there is an unary input dimension.
Lowering to a new tosa.resize and explicit broadcast simplifies the
tosa.resize operation to avoid recomputing the identical broadcasted values.
This change reworks the broadcast optimization reuse the tosa.resize generic
implementation.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D139963
This is mostly geared at consolidating logic into one form to reduce code duplication, but also has the effect of being a slight generalization. Since these operations aren't masked, we can ignore the mask policy bit when deciding on compatibility. The previous code was overly strict in checking that both policy bits matched.
Note: There's a slight difference from the reviewed version. The reviewed version was based on a local revision which included the isCompatible change to only check AVL if VL is used. I apparently never landed that change, and while functional, the functional change isn't visible without this one. I chose to role the extra change into this patch.
Differential Revision: https://reviews.llvm.org/D140147
Main thing I was unsure about was to whether try to delete the now
dead landing blocks, or leave that for the unreachable block reduction.
Personality function is not reduced, but that should be a separate
reduction on the function.
Fixes#58815
Currently, the checker only recognizes the nullopt constructor when it is called
without sugar, resulting in a crash in the (rare) case where it has been wrapped
in sugar. This relaxes the constraint by checking the constructor decl directly
(which always contains the same, desugared form) rather than the construct
expression (where the spelling depends on the context).
Differential Revision: https://reviews.llvm.org/D140921
Using the function's address space makes no sense. Copied from the
existing test, with more addrspace variation. Could just replace the
existing one with this version if it's redundant.