We used to hit an unreachable in getRegBankFromRegClass when dealing with the
stack pointer. This commit adds support for the GPRsp reg class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297621 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is a follow-up on r297580. It fixes the FIXME added temporarily
by that commit to keep the removal of Unroller's specialized version of
scalarizeInstruction() an NFC. See https://reviews.llvm.org/D30715 for details.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297610 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r297596.
There were other issues that were making this not work that have been fixed now. Reverting this results in a more accurate table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297602 91177308-0d34-0410-b5e6-96231b3b80d8
This exposed that we have several intrinsic instructions that have identical TSFlags to other instructions. We should merge their patterns and kill off the duplicates. I'll fix that in a follow-up patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297596 91177308-0d34-0410-b5e6-96231b3b80d8
The immediate should be 1 or 2, not 0 or 1. This was found while adding bounds checking to clang. In fact the existing clang builtin test failed if we ran it all the way to assembly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297591 91177308-0d34-0410-b5e6-96231b3b80d8
I noticed unnecessary 'sbb' instructions in D30472 and while looking at 'ptest' codegen recently.
This happens because we were transforming any 'setb' - even when we only wanted a single-bit result.
This patch moves those transforms under visitAdd/visitSub, so we're only creating sbb/adc when it
is a win. I don't know why we need a SETCC_CARRY node type, but I'm not proposing to change that
existing behavior in this patch.
Also, I'm skeptical that sbb/adc are a win for all micro-arches, so I added comments to the test files
where this transform still fires.
The test changes here are all cases where we no longer produce sbb/adc. Avoiding partial register
stalls (generating an xor to clear a register) is not handled in some cases, but that's a separate
issue.
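For illustration, a minimal C++ sketch of the source-level shapes this distinction is about. The function names are hypothetical and not taken from the test files, and the instructions named in the comments are the expectation described above, not guaranteed compiler output.

```cpp
#include <cstdint>

// Hypothetical example: only the single-bit result of an unsigned compare is
// needed, so a plain 'setb' should suffice and no 'sbb' is required.
bool isBelow(uint32_t a, uint32_t b) { return a < b; }

// Hypothetical example: the compare result feeds an add, which is the kind of
// pattern where folding the carry into 'adc' may be a win.
uint32_t addCarry(uint32_t x, uint32_t a, uint32_t b) { return x + (a < b); }

// Hypothetical example: the compare result feeds a subtract, which is the
// corresponding candidate for 'sbb'.
uint32_t subBorrow(uint32_t x, uint32_t a, uint32_t b) { return x - (a < b); }
```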
Differential Revision: https://reviews.llvm.org/D30611
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297586 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
A53 scheduler causes an assertion failure on all CRC instructions:
include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand
&llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i <
getNumOperands() && "getOperand() out of range!"' failed.
The case statements corresponding to CRC instructions are incorrect and should
be removed.
Also add a test case while at it.
Reviewers: t.p.northover, javed.absar, apazos, rengolin
Reviewed By: rengolin
Subscribers: evandro, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D30274
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297582 91177308-0d34-0410-b5e6-96231b3b80d8
Unroller's specialized scalarizeInstruction() is mostly duplicating Vectorizer's
variant. OTOH Vectorizer's scalarizeInstruction() already supports the special
case of VF==1 except for avoiding mask-bit extraction in that case. This patch
removes Unroller's specialized version in favor of a unified method.
The only functional difference between the two variants seems to be setting
memcheck metadata for loads and stores only in Vectorizer's variant, which is a
bug in Unroller. To keep this patch an NFC the unified method doesn't set
memcheck metadata for VF==1.
Differential Revision: https://reviews.llvm.org/D30715
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297580 91177308-0d34-0410-b5e6-96231b3b80d8
I'm pretty sure there are more problems lurking here. But I think this fixes PR32241.
I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297574 91177308-0d34-0410-b5e6-96231b3b80d8
Without SSE41 (pextrb) we currently extract byte elements from a vector by spilling to stack and reloading the byte.
This patch is an initial attempt at using MOVD/PEXTRW to extract the relevant DWORD/WORD from the vector and then shift+truncate to collect the correct byte.
Extraction of multiple bytes this way would result in code bloat, but as explained in the patch we could probably afford to be more aggressive with the supported extractions before again falling back on spilling - possibly through counting the number of extracts and which DWORD/WORD they originate from?
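The shift+truncate arithmetic can be sketched in scalar C++ as follows. This is a simplified model under stated assumptions, not the lowering code itself: the V16 type and both helper names are hypothetical, and the vector is modelled as 16 bytes in little-endian lane order.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit vector as 16 byte lanes (little-endian).
using V16 = std::array<uint8_t, 16>;

// Models what PEXTRW returns: 16-bit word lane W (0..7), zero-extended.
uint32_t extractWord(const V16 &V, unsigned W) {
  return uint32_t(V[2 * W]) | (uint32_t(V[2 * W + 1]) << 8);
}

// Extracts byte lane I (0..15) without a byte-extract instruction: fetch the
// word that contains the byte, shift the byte down, then truncate to 8 bits.
uint8_t extractByte(const V16 &V, unsigned I) {
  uint32_t Word = extractWord(V, I / 2);
  unsigned Shift = (I % 2) * 8; // 0 for the low byte, 8 for the high byte
  return static_cast<uint8_t>(Word >> Shift);
}
```

Even-numbered bytes need no shift at all, which is one reason a single extraction stays cheap.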
Differential Revision: https://reviews.llvm.org/D29841
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297568 91177308-0d34-0410-b5e6-96231b3b80d8
Since v_max_f32_e64/v_max_f16_e64 can be folded if the target
instruction supports the clamp bit, we also need to maintain
modifiers when converting v_mac to v_mad.
This fixes a rendering issue with Dirt Rally because a v_mac
instruction with the clamp bit set was converted to a v_mad
but that bit was lost during the conversion.
Fixes: e184e01dd79 ("AMDGPU: Fold FP clamp as modifier bit")
Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297556 91177308-0d34-0410-b5e6-96231b3b80d8
When CMAKE_INSTALL_MANDIR isn't defined it ends up attempting to install
the man pages under "/man1" and we really don't want to accidentally install
stuff at the filesystem root.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297545 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The "cases" support was not quite finished, is unused, and is really just debug counters.
(Well, almost: debug counters are slightly more powerful, in that they can also skip things at the start.)
Note, opt-bisect itself could also be implemented as a wrapper around
debug counters, but not sure it's worth it ATM.
I'll shove it on a todo list if we think it is.
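To make the "skip things at the start" point concrete, here is a minimal sketch of a counter with both a skip and a limit. The class and its members are hypothetical and are not LLVM's actual debug counter interface.

```cpp
#include <cstdint>

// Hypothetical counter: ignores the first Skip occurrences, then allows up to
// Limit occurrences (a negative Limit means unlimited).
class SkipLimitCounter {
  int64_t Skip;
  int64_t Limit;
  int64_t Seen = 0; // number of occurrences observed so far

public:
  SkipLimitCounter(int64_t Skip, int64_t Limit) : Skip(Skip), Limit(Limit) {}

  // Returns true if the current occurrence should be acted on.
  bool shouldExecute() {
    int64_t Index = Seen++;
    if (Index < Skip)
      return false; // still inside the skipped prefix
    if (Limit >= 0 && Index >= Skip + Limit)
      return false; // past the allowed window
    return true;
  }
};
```

A plain "first N cases" limit corresponds to Skip == 0; the extra power is only in the non-zero skip.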
Reviewers: MatzeB, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30856
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297542 91177308-0d34-0410-b5e6-96231b3b80d8
r297310 began inserting red zones around allocations under ASan, which
perturbs the alignment of subsequent allocations. Deliberately specify
this in two places where it matters.
Fixes failures when these tests are run under ASan and UBSan together.
Reviewed by Duncan Exon Smith.
rdar://problem/30980047
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297540 91177308-0d34-0410-b5e6-96231b3b80d8