Many folds in InstCombine are limited to one-use instructions. For
that reason, if the use-count of an instruction drops to one, it
makes sense to revisit that one user. This is one of the most
common reasons why InstCombine fails to finish in a single iteration.
Doing this revisit actually slightly improves compile-time, because
we save an extra InstCombine iteration in enough cases to make a
visible difference.
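A minimal sketch of the revisit idea, using a toy IR rather than LLVM's classes. `ToyInst`, `ToyWorklist`, and `dropUse` are hypothetical names made up for illustration; the real InstCombine worklist is more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an instruction: a name plus the list of its users.
struct ToyInst {
  std::string name;
  std::vector<ToyInst *> users;
};

// Toy worklist that ignores duplicate pushes.
struct ToyWorklist {
  std::vector<ToyInst *> items;
  void push(ToyInst *inst) {
    if (std::find(items.begin(), items.end(), inst) == items.end())
      items.push_back(inst);
  }
};

// Remove one use of `def` by `user`. If exactly one user remains, queue it,
// because one-use folds on that user may have just become applicable.
void dropUse(ToyInst &def, ToyInst *user, ToyWorklist &worklist) {
  auto it = std::find(def.users.begin(), def.users.end(), user);
  if (it == def.users.end())
    return; // `user` was not a user of `def`; nothing to do.
  def.users.erase(it);
  if (def.users.size() == 1)
    worklist.push(def.users.front());
}
```

In this model, queuing the remaining user right away is what avoids waiting for a whole extra iteration to pick it up.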
This is conceptually NFC, but not NFC in practice, because differences
in worklist order can result in slightly different folding behavior.
The regressed tests in or-shifted-masks.ll now require a sequence of
instcombine,early-cse,instcombine to fold fully. D152876 would make
these fold in a single instcombine run again.
Differential Revision: https://reviews.llvm.org/D151807
This is the 11th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152078.
This patch also fixes the suffix for non-overloaded variants for
vset on tuple types.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152079
This is the 10th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152077.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152078
This is the 9th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152076.
This patch expands all variants of indexed strided segment store.
This patch also fixes the trailing suffix in the intrinsics' function
name that represents the return type, adding `x{NF}`.
For the same reason mentioned in [3/11], full test cases are added only
for vsuxseg2ei32 and vsoxseg2ei32 for now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152077
This is the 8th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152075.
This patch expands all variants of indexed strided segment load,
including the policy variants. This patch also fixes the trailing suffix
in the intrinsics' function name that represents the return type,
adding `x{NF}`.
For the same reason mentioned in [3/11], full test cases are added only
for vluxseg2ei32 and vloxseg2ei32 for now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152076
This is the 7th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152074.
This patch expands all variants of strided segment store. The store
intrinsics do not have any policy variants. This patch also fixes the
trailing suffix in the intrinsics' function name that represents the
return type, adding `x{NF}`.
For the same reason mentioned in [3/11], a full test case is added only
for vssseg2e32 for now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152075
This is the 6th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152073.
This patch expands all variants of strided segment load, including the
policy variants. This patch also fixes the trailing suffix in the
intrinsics' function name that represents the return type, adding
`x{NF}`.
For the same reason mentioned in [3/11], a full test case is added only
for vlsseg2e32.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152074
This is the 5th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152072.
This patch expands all variants of unit stride fault-first segment
load, including the policy variants. This patch also fixes the
trailing suffix in the intrinsics' function name that represents
the return type, adding `x{NF}`.
For the same reason mentioned in [3/11], a full test case is added only
for vlseg2e32ff.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152073
`RewriterBase::Listener::notifyOperationReplaced` notifies observers that an op is about to be replaced with a range of values. This notification is not very useful for ops without results, because it does not specify the replacement op (and it cannot be deduced from the replacement values). It provides no additional information over the `notifyOperationRemoved` notification.
This revision adds an additional notification when a rewriter replaces an op with another op. By default, this notification triggers the original "op replaced with values" notification, so there is no functional change for existing code.
This new API is useful for the transform dialect, which needs to track op replacements. (Updated in a subsequent revision.)
Also includes minor documentation improvements.
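The forwarding default described above can be sketched with toy types. `ToyOp`, `ToyListener`, and the method names below are hypothetical stand-ins, not the MLIR classes or signatures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins; these are not the MLIR classes.
struct ToyOp {
  std::string name;
  std::vector<int> results; // toy "values" produced by the op
};

struct ToyListener {
  virtual ~ToyListener() = default;

  // Original notification: the op is replaced with a range of values.
  virtual void notifyReplacedWithValues(ToyOp &, const std::vector<int> &) {}

  // New notification: the op is replaced with another op. By default it
  // forwards to the values-based hook, so existing listeners see no change.
  virtual void notifyReplacedWithOp(ToyOp &op, ToyOp &replacement) {
    notifyReplacedWithValues(op, replacement.results);
  }
};

// An existing listener that only knows about the values-based hook.
struct LegacyListener : ToyListener {
  int valueNotifications = 0;
  void notifyReplacedWithValues(ToyOp &, const std::vector<int> &) override {
    ++valueNotifications;
  }
};

// A listener that tracks op-to-op replacements, including zero-result ops.
struct TrackingListener : ToyListener {
  std::vector<std::string> log;
  void notifyReplacedWithOp(ToyOp &op, ToyOp &replacement) override {
    log.push_back(op.name + " -> " + replacement.name);
  }
};
```

In this sketch `LegacyListener` keeps receiving the values-based notification, while `TrackingListener` can record the replacement op even when the replaced op has no results.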
Differential Revision: https://reviews.llvm.org/D152814
This is the 4th patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152071.
This patch expands all variants of unit stride segment store. The
store intrinsics do not have any policy variants. This patch also
fixes the trailing suffix in the intrinsics' function name that
represents the return type, adding `x{NF}`.
For the same reason mentioned in [3/11], a full test case is added only
for vsseg2e32.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152072
This is the 3rd patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152070.
This patch expands all variants of unit stride segment load, including
the policy variants. This patch also fixes the trailing suffix in the
intrinsics' function name that represents the return type, adding
`x{NF}`.
Currently the tuple type intrinsics coexist with the non-tuple type ones.
Since the coexistence is temporary, this patch only adds test cases of
all variants for vlseg2e32 to demonstrate the new capability.
Test cases for other data types and NF values will be added in the
patch-set when the replacement happens.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152071
This is the 2nd patch of the patch-set. For the cover letter, please
check out D152069.
Depends on D152069.
This patch also removes redundant checks related to tuples and delegates
the check to `RVVType::verifyType`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152070
Lower gang dim from the parse tree to the new MLIR
representation.
Depends on D151972
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D151973
We're getting asserts for duplicate section registration during
linking, which trace back to these sections. From previous
discussions, it seems like these are metadata sections that can
be dropped. See the discussion in D116474 and
https://bugs.llvm.org/show_bug.cgi?id=45111.
Differential Revision: https://reviews.llvm.org/D152574
This enables layering baremetal multilibs on top of each other.
For example a multilib containing only a no-exceptions libc++ could be
layered on top of a multilib containing C libs. This avoids the need
to duplicate the C library for every libc++ variant.
Differential Revision: https://reviews.llvm.org/D143075
This will enable layering multilibs on top of each other.
For example a multilib containing only a no-exceptions libc++ could be
layered on top of a multilib containing C libs. This avoids the need
to duplicate the C library for every libc++ variant.
This change doesn't expose the functionality externally; it only opens
the functionality up to be potentially used by ToolChain classes.
Differential Revision: https://reviews.llvm.org/D143059
The default location for multilib.yaml is lib/clang-runtimes, without
any target-specific suffix. This will allow multilibs for different
architectures to share a common include directory.
To avoid breaking the arm-execute-only.c CHECK-NO-EXECUTE-ONLY-ASM
test, add a ForMultilib argument to getARMTargetFeatures.
Since the presence of multilib.yaml can change the exact location of a
library, relax the baremetal.cpp test.
Differential Revision: https://reviews.llvm.org/D142986
This option causes the flags used for selecting multilibs to be printed.
This is an experimental feature that is documented in detail in D143587.
Differential Revision: https://reviews.llvm.org/D142933
The format includes a ClangMinimumVersion entry to avoid a potential
source of subtle errors if an older version of Clang were to be used
with a multilib.yaml that requires a newer Clang to work correctly.
This feature is comparable to CMake's cmake_minimum_required.
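A minimal sketch of the kind of check a minimum-version entry enables. How Clang actually parses and compares the entry is not shown here; `parseVersion` and `satisfiesMinimum` are hypothetical helpers, and the "missing components count as 0" rule is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Parse "major[.minor[.patch]]" into three numbers; missing parts are 0.
// std::stoi throws on non-numeric fields, which this sketch does not handle.
std::array<int, 3> parseVersion(const std::string &text) {
  std::array<int, 3> parts{0, 0, 0};
  std::istringstream in(text);
  std::string field;
  for (std::size_t i = 0; i < parts.size() && std::getline(in, field, '.'); ++i)
    parts[i] = std::stoi(field);
  return parts;
}

// True if the running version is at least the required minimum.
// std::array compares lexicographically, which matches version ordering here.
bool satisfiesMinimum(const std::string &running, const std::string &minimum) {
  return parseVersion(running) >= parseVersion(minimum);
}
```

A driver built this way could reject a multilib.yaml up front instead of silently misinterpreting newer syntax.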
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D142932
Rust bare-metal targets have no environment component in the triple name. This patch suppresses warnings that look like:
```
warning: triple-implied ABI conflicts with provided target-abi 'lp64s', using target-abi
```
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D152778
Close https://github.com/llvm/llvm-project/issues/61940.
The root cause is that clang now generates the vtable as a strong symbol
even if the corresponding class is defined in another module unit. After
checking the wording in the Itanium ABI, I find this is inconsistent with it.
Itanium ABI 5.2.3
(https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-vtable) says:
> The virtual table for a class is emitted in the same object containing
> the definition of its key function, i.e. the first non-pure virtual
> function that is not inline at the point of class definition.
So the current behavior is incorrect. This patch tries to address this.
I also think we need a similar change for the MSVC ABI, but I could not
find the formal wording, so this patch does not address it.
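To make the quoted rule concrete, here is a small example of how the key function is picked. `Shape` and `Square` are made-up classes for illustration and are unrelated to the test cases in the patch.

```cpp
#include <cassert>
#include <string>

struct Shape {
  virtual ~Shape() = default;             // defaulted in-class, so inline: not the key function
  virtual double area() const = 0;        // pure virtual: not the key function
  virtual std::string name() const;       // first non-pure, non-inline virtual: the key function
  virtual int sides() const { return 0; } // defined in-class, so inline
};

// Under the Itanium rule quoted above, the object file containing this
// definition is the one that emits the vtable for Shape.
std::string Shape::name() const { return "shape"; }

// Square has no non-inline virtual function, hence no key function.
struct Square : Shape {
  double area() const override { return 4.0; }
  int sides() const override { return 4; }
};
```

Only the translation unit that defines `Shape::name` needs to emit the vtable for `Shape`; other units can reference it.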
Reviewed By: rjmccall, iains, dblaikie
Differential Revision: https://reviews.llvm.org/D150023
When hitting an lldbassert in a non-assert build, we emit a blurb
including the assertion, the triggering file and line and a pretty
backtrace leading up to the issue. Currently, this is all printed to
stderr. That's fine on the command line, but when used as a library, for
example from Xcode, this information doesn't make it to the user. This
patch uses the diagnostic infrastructure to report LLDB asserts as
diagnostic events.
The patch is slightly more complicated than I would've liked because of
layering. lldbassert is part of Utility while the debugger diagnostics
are implemented in Core.
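One common way around this kind of layering constraint is a callback that the higher layer installs into the lower one. The sketch below shows that pattern with hypothetical names (`setToyAssertCallback`, `TOY_ASSERT`, `reportAsDiagnostic`); it is not LLDB's actual API.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// --- "Utility" layer: knows nothing about the debugger. ---
using ToyAssertCallback = void (*)(const std::string &message);

// Default behavior: print the blurb to stderr.
static void defaultAssertCallback(const std::string &message) {
  std::fprintf(stderr, "%s\n", message.c_str());
}

static ToyAssertCallback g_assertCallback = defaultAssertCallback;

// Install a different sink; passing nullptr restores the stderr default.
void setToyAssertCallback(ToyAssertCallback callback) {
  g_assertCallback = callback ? callback : defaultAssertCallback;
}

// Called when a soft assertion fails; formats the blurb and hands it off.
void toyAssertFailed(const char *expr, const char *file, int line) {
  g_assertCallback(std::string("Assertion failed: (") + expr + "), file " +
                   file + ", line " + std::to_string(line));
}

#define TOY_ASSERT(cond) \
  ((cond) ? (void)0 : toyAssertFailed(#cond, __FILE__, __LINE__))

// --- "Core" layer: owns diagnostics and installs itself as the sink. ---
std::vector<std::string> g_diagnosticEvents;

void reportAsDiagnostic(const std::string &message) {
  g_diagnosticEvents.push_back(message);
}
```

The lower layer only sees a function pointer, so it never has to include or link against the layer that owns the diagnostics.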
Differential revision: https://reviews.llvm.org/D152866
genObjectList is not used anymore. Just remove it.
Depends on D151975
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D151976
Add parsing support for dim in gang clause
Depends on D151971
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D151972
Use the new firstprivate representation on the compute construct.
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D151975
If `Mask` and `Amt` are not constants and `binop1` and `binop2` are
the same, we can transform to:
`(binop (lshift (binop X, Y), Amt), Mask)`
If `binop` is `add`, `lshift` must be `shl`.
If `Mask` and `Amt` are the constants `C` and `C1` respectively,
we can transform to:
`(lshift1 (binop1 (binop2 X, (inv_lshift1 C, C1)), Y), C1)`
This saves an instruction IFF:
`lshift1` is the same opcode as `lshift2`, and
at least one of `binop1` and `binop2` is `and`.
Proofs(1/2): https://alive2.llvm.org/ce/z/BjN-m_
Proofs(2/2): https://alive2.llvm.org/ce/z/bZn5QB
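The Alive2 links are the authoritative proofs; as a quick sanity check, the identities can also be spot-checked on 8-bit values. The sketch assumes the input pattern is `(binop1 (binop2 (lshift X, Amt), Mask), (lshift Y, Amt))`, which is inferred from the transformed forms above, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

using u8 = std::uint8_t;

// Non-constant case with binop1 == binop2 == add and lshift == shl:
//   ((X << Amt) + Mask) + (Y << Amt)  ==  ((X + Y) << Amt) + Mask
bool addShlHolds(u8 x, u8 y, u8 mask, unsigned amt) {
  u8 before = static_cast<u8>(((x << amt) + mask) + (y << amt));
  u8 after = static_cast<u8>(((x + y) << amt) + mask);
  return before == after;
}

// Same shape with lshr; the sum is truncated to 8 bits before shifting to
// model i8 arithmetic. This one does not hold in general.
bool addLshrHolds(u8 x, u8 y, u8 mask, unsigned amt) {
  u8 before = static_cast<u8>(((x >> amt) + mask) + (y >> amt));
  u8 after = static_cast<u8>((static_cast<u8>(x + y) >> amt) + mask);
  return before == after;
}

// Constant case with binop1 == or, binop2 == and, lshift == shl:
//   ((X << C1) & C) | (Y << C1)  ==  ((X & (C >> C1)) | Y) << C1
bool orAndShlHolds(u8 x, u8 y, u8 c, unsigned c1) {
  u8 before = static_cast<u8>(((x << c1) & c) | (y << c1));
  u8 after = static_cast<u8>(((x & (c >> c1)) | y) << c1);
  return before == after;
}

// Check all X and Y, a sample of masks, and every shift amount below 8.
bool checkSample() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      for (unsigned m = 0; m < 256; m += 7)
        for (unsigned amt = 0; amt < 8; ++amt)
          if (!addShlHolds(static_cast<u8>(x), static_cast<u8>(y),
                           static_cast<u8>(m), amt) ||
              !orAndShlHolds(static_cast<u8>(x), static_cast<u8>(y),
                             static_cast<u8>(m), amt))
            return false;
  return true;
}
```

`addLshrHolds(1, 1, 0, 1)` returns false, which illustrates why `add` requires `shl`: the low bits dropped by each separate right shift can carry when the operands are added first.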
This is to help fix the regression caused in D151807
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D152568
For some reason we used to only handle address space aliasing through
chaining a target specific AA pass. We need never-fail simple queries
in order to lower memmove intrinsics based purely on the address
spaces.
I also think it would be better if BasicAA checked this, rather than
relying on the target AA passes. Currently we go through the more
expensive AA analyses before getting to the trivial address space
checks.
For now, only elementwise operations are supported. Operations that perform any
kind of data permutation require changes in the representation of scalable
dimensions in VectorType.
Differential Revision: https://reviews.llvm.org/D152599
ShuffleBuilder generates a zero mask here:
`[[TMP6:%.*]] = shufflevector <2 x float> [[TMP3]], <2 x float> poison, <4 x i32> zeroinitializer`
But the correct mask is `0,0,1,1`, or we should have reused `TMP4`.
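The difference between the two masks is easy to see with a toy model of a single-source shuffle. `shuffle` below is a hypothetical helper that only handles in-range indices into the first operand; it does not model poison or the second operand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy single-source shuffle: out[i] = src[mask[i]].
// Every mask index must be a valid index into src.
template <std::size_t N, std::size_t M>
std::array<float, M> shuffle(const std::array<float, N> &src,
                             const std::array<int, M> &mask) {
  std::array<float, M> out{};
  for (std::size_t i = 0; i < M; ++i)
    out[i] = src[static_cast<std::size_t>(mask[i])];
  return out;
}
```

For a source `{a, b}`, the zero mask yields `{a, a, a, a}` and drops `b` entirely, while `0,0,1,1` yields `{a, a, b, b}`.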
Differential Revision: https://reviews.llvm.org/D152868
For a lightweight pass we do not want to instantiate or use the
MustBeExecutedContextExplorer. This simply allows such a configuration.
While at it, the explorer is now allocated with the bump allocator.
This is the first patch in a series to change how we represent tail agnostic, tail undefined, and tail undisturbed operations. In current code, we tend to use an unsuffixed pseudo for undefined (despite calling it TA in most places in the code), and the _TU form for both agnostic and undisturbed (via the policy operand).
The key observation behind this patch is that we can represent tail undefined via a pseudo with a passthrough operand if that operand is IMPLICIT_DEF (aka undef). We already have a few instances of this in tree - see vmv.s.x and vslide* - but we can do this more universally. Once complete, we will be able to delete roughly 1/3 of our vector pseudo classes.
A bit more information on the overall goal can be found in this discourse post: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.
This patch doesn't actually remove the legacy unsuffixed pseudo as there's still some path from intrinsic lowering which uses it. (I have not yet located it.) This also means we don't have to modify any of the lookup tables which makes the migration simpler. We can defer deleting the tables and pseudos until one final change once all the instructions have been migrated.
There are a couple of regressions in the tests. At first, these concerned me, but it turns out that all of them are differences in expansion of a single source level instruction. I think we can safely ignore this for the moment. I did explore changing the handling of IMPLICIT_DEF in ScheduleDAG, but that causes an absolutely *massive* test diff with minimal profit. I really don't think it's worth doing.
Differential Revision: https://reviews.llvm.org/D152380
If we had an unknown access but already some prior knowledge (known), we
could have ended up ignoring the unknown access altogether. The
problem is that we track unknown not as all locations but separately.
This patch bridges the gap and expands the unknown bits to "all bits"
when we add an access.
Fixes: https://github.com/llvm/llvm-project/issues/63291