Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.
Test notes:
* Many testcases overwrite store addresses multiple times and needed
minor changes, mainly making stores volatile to prevent the
optimization from optimizing the test away.
* Many X86 test cases optimized out instructions associated with
associated with va_start.
* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
dependencies to check and can probably be removed and potentially
replaced with another test.
Reviewers: rnk, john.brawn
Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33206
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303198 91177308-0d34-0410-b5e6-96231b3b80d8
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.
Additionally actually using it requires changing the output register
class, which wasn't done anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302814 91177308-0d34-0410-b5e6-96231b3b80d8
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302813 91177308-0d34-0410-b5e6-96231b3b80d8
- MIParser: If the successor list is not specified successors will be
added based on basic block operands in the block and possible
fallthrough.
- MIRPrinter: Adds a new `simplify-mir` option, with that option set:
Skip printing of block successor lists in cases where the
parser is guaranteed to reconstruct it. This means we still print the
list if some successor cannot be determined (happens for example for
jump tables), if the successor order changes or branch probabilities
being unequal.
Differential Revision: https://reviews.llvm.org/D31262
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302289 91177308-0d34-0410-b5e6-96231b3b80d8
This would assert when there were multiple defs of
a physical register.
We just need to move all of the users of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301730 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r300732. This breaks a few tests.
I think the problem is related to adding more uses of
the condition that don't yet exist at this point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301242 91177308-0d34-0410-b5e6-96231b3b80d8
In call sequence setups, there may not be a frame index base
and the pointer is a constant offset from the frame
pointer / scratch wave offset register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301230 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.
This patch has no effect on targets other than amdgcn.
Differential Revision: https://reviews.llvm.org/D32186
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301215 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes traps in any block besides the entry block,
and fixes depending on a live-in physical register
by using a virtual register copy.
Also happens to stop emitting a nop in the case
debug trap is not supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301206 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Fix a compiler bug when the lane select happens to end up in a VGPR.
Clarify the semantic of the corresponding intrinsic to be that of
the corresponding GLSL: the lane select must be uniform across a
wave front, otherwise results are undefined.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D32343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301197 91177308-0d34-0410-b5e6-96231b3b80d8
SI_MASKED_UNREACHABLE does not have machine instruction encoding.
It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some
other pseudo instructions.
This patch fixes compilation failure of RadeonRays.
Differential Revision: https://reviews.llvm.org/D32364
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301025 91177308-0d34-0410-b5e6-96231b3b80d8
Recently alloca address space has been added to data layout. Due to this
change, pointer returned by alloca may have different size as pointer in
address space 0.
However, currently the value type of frame index is assumed to be of the
same size as pointer in address space 0.
This patch fixes that.
Most targets assume alloca returning pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.
AMDGCN target with amdgiz environment requires this change since it
assumes alloca returning pointer to addr space 5 and its size is 32,
which is different from the size of pointer in addr space 0 which is 64.
Differential Revision: https://reviews.llvm.org/D32021
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300864 91177308-0d34-0410-b5e6-96231b3b80d8
This is inserted directly in the text section. The relocation
for the function ends up resolving to the beginning of the
amd_kernel_code_t header rather than the actual function
entry point.
Also skip some of the comments for initialization
that only makes sense for kernels.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300736 91177308-0d34-0410-b5e6-96231b3b80d8
The most common case for a branch condition is
a single use compare. Directly invert the branch
predicate rather than adding a lot of xor i1 true
which the DAG will have to fold later.
This produces nicer to read structurizer output.
This produces some random changes in codegen
due to the DAG swapping branch conditions itself,
and then does a poor job of dealing with those
inverts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300732 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid looping through program to determine register counts.
This avoids needing to look at regmask operands.
Also fixes some counting errors with flat_scr when there
are no stack objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300482 91177308-0d34-0410-b5e6-96231b3b80d8