Handling LoadInst and StoreInst in tryToWiden seems a bit
counter-intuitive, as there is only an assertion for them and in no
case VPWidenRefipes are created for them.
I think it makes sense to move the assertion to handleReplication, where
the non-widened loads and store are handled.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D77972
Similarly to actual LLVM IR, and to `llvm.mlir.func`, allow the custom syntax
of `llvm.mlir.global` to omit the linkage keyword. If omitted, the linkage is
assumed to be external. This makes the modeling of globals in the LLVM dialect
more consistent, both within the dialect and with LLVM IR.
Differential Revision: https://reviews.llvm.org/D78096
Introduce mlir::applyOpPatternsAndFold which applies patterns as well as
any folding only on a specified op (in contrast to
applyPatternsAndFoldGreedily which applies patterns only on the regions
of an op isolated from above). The caller is made aware of the op being
folded away or erased.
Depends on D77485.
Differential Revision: https://reviews.llvm.org/D77487
As we disscussed in D77971, we haven't confirmed that if putting instructions
in a non-executable section is an undefined behaviour. To make things
easier to go on, we mark these sections executable in test file
align-branch-section-size.s.
Summary:
Changing all mnemonic to match assembly instructions to simplify mnemonic
naming rules. This time update all fixed-point arithmetic instructions.
This also corrects smax/smin code generations.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D77856
Summary:
The formatters code has a lot of 'reason' or 'why' values that we keep or-ing FormatterChoiceCriterion
enum values into. These values are only read by a single log statement and don't have any functional
purpose. It also seems the implementation is not finished (for example, display names and type
names don't have any dedicated enum values). Also everything is of course not tested or documented.
Let's just remove all of this.
Reviewers: labath, JDevlieghere, jingham, davide, vsk
Reviewed By: labath, vsk
Subscribers: JDevlieghere
Differential Revision: https://reviews.llvm.org/D77968
Fix an assert introduced in 41ed5d856c: a phi with a single predecessor and a
mask is a valid case which is already supported by the code.
Differential Revision: https://reviews.llvm.org/D78115
Summary:
This reduces memory usage by dynamic index from more than 400MB to 32MB
when all files in clang-tools-extra/clangd/*.cpp are active in clangd.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77732
This reverts commit 2ada8e2525.
Buildbots produced compilation errors which I was not able to quickly
reproduce locally. Need more time to investigate.
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198
This is a minor NFC change to make the code more clear. We have the NegatibleCost that
has cheaper, neutral, and expensive. Typically, the smaller one means the less cost.
It is inverse for current implementation, which makes following code not easy to read.
If (CostX > CostY) negate(X)
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D77993
This was hitting the default instruction constraint code which uses
the register classes in the instruction def, which REG_SEQUENCE does
not have.
Fixes not constraining the register class for AMDGPU fneg/fabs
patterns, which would fail when the use was another generic,
unconstrained instruction.
Another oddity I noticed is that the temporary registers are created
with an unnecessary, but incorrect 16-bit LLT but this shouldn't
matter.
I'm also still unclear why root and sub-instructions have to be
handled differently.
Summary:
This revision adds two utilities currently present in MLIR to LLVM StringExtras:
* convertToSnakeFromCamelCase
Convert a string from a camel case naming scheme, to a snake case scheme
* convertToCamelFromSnakeCase
Convert a string from a snake case naming scheme, to a camel case scheme
Differential Revision: https://reviews.llvm.org/D78167
In particular, this affects Clang's vectors. Users encounter this issue
when a struct contains an __m128 type.
Fixes PR45420
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D77754
Summary:
Currently, the internal options -vectorize-loops, -vectorize-slp, and
-interleave-loops do not have much practical effect. This is because
they are used to initialize the corresponding flags in the pass
managers, and those flags are then unconditionally overwritten when
compiling via clang or via LTO from the linkers. The only exception was
-vectorize-loops via opt because of some special hackery there.
While vectorization could still be disabled when compiling via clang,
using -fno-[slp-]vectorize, this meant that there was no way to disable
it when compiling in LTO mode via the linkers. This only affected
ThinLTO, since for regular LTO vectorization is done during the compile
step for scalability reasons. For ThinLTO it is invoked in the LTO
backends. See also the discussion on PR45434.
This patch makes it so the internal options can actually be used to
disable these optimizations. Ultimately, the best long term solution is
to mark the loops with metadata (similar to the approach used to fix
-fno-unroll-loops in D77058), but this enables a shorter term
workaround, and actually makes these internal options useful.
I constant propagated the initial values of these internal flags into
the pass manager flags (for some reasons vectorize-loops and
interleave-loops were initialized to true, while vectorize-slp was
initialized to false). As mentioned above, they are overwritten
unconditionally so this doesn't have any real impact, and these initial
values aren't particularly meaningful.
I then changed the passes to check the internl values and return without
performing the associated optimization when false (I changed the default
of -vectorize-slp to true so the options behave similarly). I was able
to remove the hackery in opt used to get -vectorize-loops=false to work,
as well as a special option there used to disable SLP vectorization.
Finally, I changed thinlto-slp-vectorize-pm.c to:
a) Only test SLP (moved the loop vectorization checking to a new test).
b) Use code that is slp vectorized when it is enabled, and check that
instead of whether the pass is enabled.
c) Test the new behavior of -vectorize-slp.
d) Test both pass managers.
The loop vectorization (and associated interleaving) testing I moved to
a new thinlto-loop-vectorize-pm.c test, with several changes:
a) Changed the flags on the interleaving testing so that it will
actually interleave, and check that.
b) Test the new behavior of -vectorize-loops and -interleave-loops.
c) Test both pass managers.
Reviewers: fhahn, wmi
Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77989
Summary:
The test in question uses a non-portable `grep -A` option in conjunction
with `wc -l`. `FileCheck` can be used to do the check without using
these extra utilities.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D78060
- Unify the sections on DWARF expression and location lists.
- Allow a location description to have one or more single location
descriptions.
- Define context of DWARF expression that includes an initial
stack. Allow initial stack to be used when evaluating location list
expression with overlapping PC ranges.
- Reorganize the DWARF proposal in AMDGPUUsage so suitable for
submission to the DWARF site.
- Replace CFI instruction DW_CFA_LLVM_def_cfa_aspace with
DW_CFA_def_aspace_cfa and DW_CFA_def_aspace_cfa_sf. This is to avoid
the problem that DW_CFA_def_cfa and DW_CFA_def_cfa_sf cannot use a
register that is not the size of an address in the CFA address
space.
- Clarify DWARF address class and DWARF address space. Define language
values for DWARF address classes and specify how they are used by
some common source languages.
- Define rules for accessing registers and derefencing memory when the
type size and register size or byte size operand do not match.
- Numerous cleanups for consistency.
Differential Revision: https://reviews.llvm.org/D70523
Fix a bug where UnwindAssemblyInstEmulation would confuse which
register is used to compute the Canonical Frame Address after it
had branched over a mid-function epilogue (where the CFA reg changes
from $fp to $sp in the process of epiloguing). Reinstate the
correct CFA register after we forward the unwind rule for branch
targets. The failure mode was that UnwindAssemblyInstEmulation
would think CFA was set in terms of $sp after one of these epilogues,
and if it sees modifications to $sp after the branch target, it would
change the CFA offset in the unwind rule -- even though the CFA is
defined in terms of $fp and the $sp changes are irrelevant to correct
calculation.
<rdar://problem/60300528>
Differential Revision: https://reviews.llvm.org/D78077
Integer type in Std dialect is signless so we should be checking
for signless integer type instead of signed integer type in EDSC.
Differential Revision: https://reviews.llvm.org/D78144
This is a no-op because it is set later on unconditionally again, but
it's far less confusing this way and consistent with how the setters
are initialized.
Summary:
The 'Clang 10' boxes should be green since Clang 10 has been released.
Reviewers: rsmith, aaron.ballman
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D78068
Previously, getWithOffset() would drop the offset if the base was null.
Because of this, MachineMemOperand would return the wrong result from
getAlign() in these cases. MachineMemOperand stores the alignment of
the pointer without the offset.
A bunch of MIR tests changed because we print the offset now.
Split off from D77687.
Differential Revision: https://reviews.llvm.org/D78049
Summary:
Continuing from D77285, the external interfaces implemented by
`WasmDump.cpp` are now declared in `WasmDump.h` and moved into the
`llvm::objdump` namespace.
Reviewers: jhenderson, MaskRay, DiggerLin, jasonliu, daltenty
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D77990
This symbol is defined in avr-gcc. Because AVR normally uses the ELF
format, define the symbol unconditionally.
This patch is needed to get Clang to compile compiler-rt.
Differential Revision: https://reviews.llvm.org/D78117
This class implements a switch-like dispatch statement for a value of 'T' using dyn_cast functionality. Each `Case<T>` takes a callable to be invoked if the root value isa<T>, the callable is invoked with the result of dyn_cast<T>() as a parameter.
Differential Revision: https://reviews.llvm.org/D78070
These have proved incredibly useful for interleaving values between a range w.r.t to streams. After this revision, the mlir/Support/STLExtras.h is empty. A followup revision will remove it from the tree.
Differential Revision: https://reviews.llvm.org/D78067
This revision moves the various range utilities present in MLIR to LLVM to enable greater reuse. This revision moves the following utilities:
* indexed_accessor_*
This is set of utility iterator/range base classes that allow for building a range class where the iterators are represented by an object+index pair.
* make_second_range
Given a range of pairs, returns a range iterating over the `second` elements.
* hasSingleElement
Returns if the given range has 1 element. size() == 1 checks end up being very common, but size() is not always O(1) (e.g., ilist). This method provides O(1) checks for those cases.
Differential Revision: https://reviews.llvm.org/D78064
This revision moves several type_trait utilities from MLIR into LLVM. Namely, this revision adds:
is_detected - This matches the experimental std::is_detected
is_invocable - This matches the c++17 std::is_invocable
function_traits - A utility traits class for getting the argument and result types of a callable type
Differential Revision: https://reviews.llvm.org/D78059
This revision adds a DenseMapInfo overload for std::tuples whose elements all have a DenseMapInfo. The implementation is similar to that of std::pair, and has been used within MLIR for over a year.
Differential Revision: https://reviews.llvm.org/D78057
Summary:
Also,
- add IndexTensor to OpBase.td
- fix typo in the op name. It was mistakenly `to_tensor` instead of
`to_extent_tensor`.
Differential Revision: https://reviews.llvm.org/D78149
This should make both static and dynamic NewPM plugins work with LTO.
And as a bonus, it makes static linking of OldPM plugins more reliable
for plugins with both an OldPM and NewPM interface.
I only implemented the command-line flag to specify NewPM plugins in
llvm-lto2, to show it works. Support can be added for other tools later.
Differential Revision: https://reviews.llvm.org/D76866