Add parsing and printing of image operands. Matches legacy sp3 assembler.
Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last.
Update SITargetLowering for new order.
Add basic MC test.
Update CodeGen tests.
Review: http://reviews.llvm.org/D17574
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261995 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of the convoluted if-statement we can just use getColor. This also fixes
a bug where we relied upon the parity of tablegen-generated register indexes
(instead of using the machine encoding).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261990 91177308-0d34-0410-b5e6-96231b3b80d8
Currently only the aligned versions are used, so remove the redundant patterns for the unaligned versions. Don't do this for the byte and word vector types, though, since they don't have aligned versions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261985 91177308-0d34-0410-b5e6-96231b3b80d8
This also simplifies the code by removing the overly conservative
NoInterveningSideEffect() function. This function checked:
- That the two copies belong to the same block: We only process one
block at a time and clear our maps in between, so it is impossible to
find a copy from a different block.
- There is no terminator between the two copy instructions: This is not
allowed anyway (the MachineVerifier would complain).
- Does not have instructions with hasUnmodeledSideEffects() or isCall()
set: Even for those instructions we must have all clobbers/defs of
registers explicit as an operand. If the register is explicitly
clobbered we would never come to the point of checking for
NoInterveningSideEffect() anyway.
(I also checked this with a temporary build of the test-suite with all
potentially failing conditions in NoInterveningSideEffect() turned into
asserts)
Differential Revision: http://reviews.llvm.org/D17474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261965 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loop IVs, which is clearly incorrect.
Reviewers: chandlerc, hfinkel
Subscribers: sanjoy, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17632
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261958 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is the first simple attempt to reduce the number of
coverage-instrumented blocks.
If a basic block dominates all its successors, then its coverage
information is useless to us. Ignore such blocks if the
sanitizer-coverage-prune-tree option is set.
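
The pruning rule can be illustrated on a toy CFG. This is only a sketch, assuming a plain iterative dominator computation; the function names `dominators` and `prunable_blocks` and the example graph are hypothetical and not taken from the patch. Blocks without successors are kept here, which is a choice made for this sketch: no later block could witness their execution.

```python
def dominators(cfg, entry):
    """Iteratively compute the dominator set of every block.

    `cfg` maps each block name to its list of successors; every block
    must appear as a key.
    """
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def prunable_blocks(cfg, entry):
    """Blocks that have successors and dominate all of them."""
    dom = dominators(cfg, entry)
    return sorted(b for b, succs in cfg.items()
                  if succs and all(b in dom[s] for s in succs))


# Diamond: only "entry" dominates all of its successors.
diamond = {"entry": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
pruned = prunable_blocks(diamond, "entry")
```

In the diamond, covering `a` or `b` already implies that `entry` ran, so instrumenting `entry` adds no information; `a` and `b` do not dominate `c` and stay instrumented.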
Differential Revision: http://reviews.llvm.org/D17626
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261949 91177308-0d34-0410-b5e6-96231b3b80d8
Inline-asm calls aren't annotated with funclet bundle operands because
they don't throw and cannot be inlined through. We shouldn't require
them to bear a funclet bundle operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261942 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Avoid special case for FP, LR CFI emission and just allow general
AArch64FrameLowering::emitCalleeSavedFrameMoves() to handle them. Also,
stop recalculating the stack offsets in emitCalleeSavedFrameMoves()
since we can just reuse the previously calculated offset stored in the
MachineFrameInfo.
Depends on D17000
Reviewers: t.p.northover, rengolin, mcrosier, jmolloy
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D17004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261885 91177308-0d34-0410-b5e6-96231b3b80d8
Support all instructions with VOP1 encoding with 32 or 64-bit operands for VI subtarget:
VGPR_32 and VReg_64 operand register classes
VS_32 and VS_64 operand register classes with inline and literal constants
Tests for VOP1 instructions.
Patch by: skolton
Reviewers: arsenm, tstellarAMD
Review: http://reviews.llvm.org/D17194
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261878 91177308-0d34-0410-b5e6-96231b3b80d8
Resubmit with index problem fixed. Verified with valgrind.
Prepare to support DPP encodings.
For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands.
However this means that when parsing an instruction that has no mnemonic prefix,
we cannot add both default values for VOP3 and for DPP optional operands
to OperandVector - neither instruction would match. So add default values
for optional operands to MCInst during conversion instead.
Mark more operands as IsOptional = 1 in .td files.
Do not add default values for optional operands to OperandVector in AMDGPUAsmParser.
Add default values for optional operands during conversion using new helper addOptionalImmOperand.
Change cvtVOP3_2_mod to check the instruction flag instead of the presence of modifiers. In the future, cvtVOP3* functions can be combined into one.
Separate cvtFlat and cvtFlatAtomic.
Fix CNDMASK_B32 definition to have no modifiers.
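
The idea of filling in defaults at conversion time rather than at parse time can be sketched as follows. This is a simplified model, not the AMDGPUAsmParser code: the table `OPTIONAL_OPERANDS`, the function `convert`, the VOP3 operand names, and all default values are placeholders assumed for illustration.

```python
# Hypothetical per-encoding tables: optional operand name -> placeholder default.
OPTIONAL_OPERANDS = {
    "VOP3": [("clamp", 0), ("omod", 0)],
    "DPP": [("row_mask", 0xF), ("bank_mask", 0xF), ("bound_ctrl", 0)],
}


def convert(encoding, required, parsed_optional):
    """Build the final operand list for one encoding.

    `required` holds the explicitly written operands; `parsed_optional`
    maps optional operand names to the values that were actually parsed.
    Missing optional operands receive their default only here, so the
    parsed operand list itself stays neutral between encodings.
    """
    operands = list(required)
    for name, default in OPTIONAL_OPERANDS[encoding]:
        operands.append(parsed_optional.get(name, default))
    return operands


# The same parsed operands can be converted for either encoding.
as_vop3 = convert("VOP3", ["v0", "v1"], {})
as_dpp = convert("DPP", ["v0", "v1"], {"bank_mask": 0x3})
```

Because nothing encoding-specific is appended while parsing, the matcher can still choose either form; the defaults are appended only once the encoding is known.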
Review: http://reviews.llvm.org/D17445
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261856 91177308-0d34-0410-b5e6-96231b3b80d8
This creates the new-style LoopPassManager and wires it up with dummy
and print passes.
This version doesn't support modifying the loop nest at all. It will
be far easier to discuss and evaluate the approaches to that with this
in place so that the boilerplate is out of the way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261831 91177308-0d34-0410-b5e6-96231b3b80d8
The constant folding for sdiv and udiv has a big discrepancy between the
comments and the code, which looks like a typo. Currently, we're folding
X / undef pretty inconsistently:
0 / undef -> undef
C / undef -> 0
undef / undef -> 0
Whereas the comments state we do X / undef -> undef. The logic that
returns zero is actually commented as doing undef / X -> 0, even though
the LHS isn't undef in many of the cases that hit it.
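
A minimal sketch of the consistent behavior, assuming the intended rules are exactly the two the comments describe (X / undef -> undef and undef / X -> 0). The `UNDEF` sentinel and `fold_div` are hypothetical names for this illustration, not the ConstantFold code.

```python
UNDEF = object()  # sentinel standing in for an undef value


def fold_div(lhs, rhs):
    """Fold lhs / rhs when at least one side is undef; return None otherwise."""
    if rhs is UNDEF:
        return UNDEF  # X / undef -> undef, regardless of X
    if lhs is UNDEF:
        return 0      # undef / X -> 0
    return None       # nothing to fold in this sketch


# All three cases listed above now agree with the comment.
results = [fold_div(0, UNDEF), fold_div(7, UNDEF), fold_div(UNDEF, UNDEF)]
```

Checking the RHS first is what makes the three X / undef cases fold the same way; the zero result is reached only when the LHS really is undef.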
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261813 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Both the hardware and LLVM have changed since 2012.
Now, the load-based heuristic doesn't show big differences any more on OoO cores.
There are no notable regressions or improvements on spec2000/2006 (Cortex-A57, Core i5).
Reviewers: spatel, zansari
Differential Revision: http://reviews.llvm.org/D16836
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261809 91177308-0d34-0410-b5e6-96231b3b80d8
(This is the second attempt to commit this patch, after fixing pr26652 & pr26653).
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as their reduction
stays unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.
To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:
1. Reduction with another vector.
2. Reduction on all elements.
in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.
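
The "reduction on all elements" pattern, and the freedom to reorder elements, can be modeled as below. This is a sketch under assumptions: the halving scheme stands in for the ShuffleVector steps, an integer add is used as the reduction, and `shuffle_reduce` is a hypothetical name; the exact shuffle masks the loop vectorizer emits are not shown in the commit message.

```python
def shuffle_reduce(vec):
    """Sum a power-of-two-length vector with a halving shuffle/add tree."""
    lanes = list(vec)
    width = len(lanes)
    assert width > 0 and width & (width - 1) == 0, "power-of-two length expected"
    while width > 1:
        width //= 2
        # "Shuffle" the upper half down and add it lane-wise to the lower half.
        for i in range(width):
            lanes[i] += lanes[i + width]
    return lanes[0]  # the final "extract element 0"


# Reordering the lanes of the input does not change the reduced value.
original = shuffle_reduce([1, 2, 3, 4])
permuted = shuffle_reduce([4, 1, 3, 2])
```

Since only the final extracted value is observed, any permutation of the intermediate vector is acceptable, which is the freedom the flag communicates to later combines.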
Differential revision: http://reviews.llvm.org/D15250
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261804 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes bugs in copy elimination code in llvm. It slightly changes the
semantics of clearRegisterKills(). This is appropriate because:
- Users in lib/CodeGen/MachineCopyPropagation.cpp and
lib/Target/AArch64RedundantCopyElimination.cpp and
lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it
(see included testcase).
- All other users in llvm are unaffected (they pass TRI==nullptr)
- (Kill flags are optional anyway so removing too many shouldn't hurt.)
Differential Revision: http://reviews.llvm.org/D17554
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261763 91177308-0d34-0410-b5e6-96231b3b80d8
This wasn't causing a correctness issue, but was causing extra duplicate
entries to be added to the SummaryMap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261757 91177308-0d34-0410-b5e6-96231b3b80d8
The cleanupret instruction has an invariant that its 'from' operand be
a cleanuppad. This invariant was violated when we removed a dead block
which removed a cleanuppad leaving behind a cleanupret with an undef
'from' operand.
This was solved in r261731 by staving off the removal of the dead block
to a later pass.
However, it occurred to me that we do not need to do this.
Instead, we can simply avoid processing the cleanupret if it has an
undef 'from' operand because we know that it will be removed soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261754 91177308-0d34-0410-b5e6-96231b3b80d8
Part 1 of 2
This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets).
(Part 2 will add support for combining multiple blends-with-zero).
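
The equivalence this relies on can be shown with plain lists standing in for vectors. This is an illustrative model only, with hypothetical helper names; it does not reflect the X86 lowering code or any specific blend instruction.

```python
def insert_zero(vec, lane):
    """Scalar-style insertion: overwrite one lane with zero."""
    out = list(vec)
    out[lane] = 0
    return out


def blend_with_zero(vec, zero_lanes):
    """Blend-style form: take each lane from `vec` or from an all-zero vector."""
    zeros = [0] * len(vec)
    return [zeros[i] if i in zero_lanes else vec[i] for i in range(len(vec))]


# Both forms produce the same vector for a single zeroed lane.
via_insert = insert_zero([5, 6, 7, 8], 2)
via_blend = blend_with_zero([5, 6, 7, 8], {2})
```

The blend form also expresses several zeroed lanes with one mask, for example `blend_with_zero([5, 6, 7, 8], {0, 3})`.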
Differential Revision: http://reviews.llvm.org/D17483
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261743 91177308-0d34-0410-b5e6-96231b3b80d8
Prepare to support DPP encodings.
For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing an instruction that has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instruction would match. So add default values for optional operands to MCInst during conversion instead.
Mark more operands as IsOptional = 1 in .td files.
Do not add default values for optional operands to OperandVector in AMDGPUAsmParser.
Add default values for optional operands during conversion using new helper addOptionalImmOperand.
Change cvtVOP3_2_mod to check the instruction flag instead of the presence of modifiers. In the future, cvtVOP3* functions can be combined into one.
Separate cvtFlat and cvtFlatAtomic.
Fix CNDMASK_B32 definition to have no modifiers.
Review: http://reviews.llvm.org/D17445
Reviewers: tstellarAMD
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261742 91177308-0d34-0410-b5e6-96231b3b80d8