Commit Graph

31499 Commits

Author SHA1 Message Date
Eli Friedman
ffd5316983 [ARM] VFPv2 only supports 16 D registers.
r361845 changed the way we handle "D16" vs. "D32" targets; there used to
be a negative "d16" which removed instructions from the instruction set,
and now there's a "d32" feature which adds instructions to the
instruction set.  This is good, but there was an oversight in the
implementation: the behavior of VFPv2 was changed.  In particular, the
"vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only
supports 16 D registers.

In practice, this means if you specify -mfpu=vfpv2, the compiler will
generate illegal instructions.

This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and
"vfp2sp" so they don't imply "d32".

Differential Revision: https://reviews.llvm.org/D67375



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372186 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 21:42:38 +00:00
Jessica Paquette
91c51ebe6b [AArch64][GlobalISel] Support -tailcallopt
This adds support for `-tailcallopt` tail calls to CallLowering. This
piggy-backs off the changes from D67577, since doing it without a bit of
refactoring gets extremely ugly.

Support is basically ported from AArch64ISelLowering. The main difference here
is that tail calls in `-tailcallopt` change the ABI, so there's some extra
bookkeeping for the stack.

Show that we are correctly lowering these by updating tail-call.ll.

Also show that we don't do anything strange in general by updating
fastcc-reserved.ll, which passes `-tailcallopt`, but doesn't emit any tail
calls.

Differential Revision: https://reviews.llvm.org/D67580

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372177 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 20:24:23 +00:00
Craig Topper
9bb2585bad [X86] Simplify b2b KSHIFTL+KSHIFTR using demanded elts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372155 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 18:02:56 +00:00
Craig Topper
684413690b [X86] Call SimplifyDemandedVectorElts on KSHIFTL/KSHIFTR nodes during DAG combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372154 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 18:02:52 +00:00
Nemanja Ivanovic
1d6894b4f8 [PowerPC] Exploit single instruction load-and-splat for word and doubleword
We currently produce a load, followed by (possibly a move for integers and) a
splat as separate instructions. VSX has always had a splatting load for
doublewords, but as of Power9, we have it for words as well. This patch just
exploits these instructions.

Differential revision: https://reviews.llvm.org/D63624


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372139 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 16:45:20 +00:00
David Green
fa16ddf26f [ARM] Add a SelectTAddrModeImm7 for MVE narrow loads and stores
We were previously using the SelectT2AddrModeImm7 for both normal and narrowing
MVE loads/stores. As the narrowing instructions do not accept sp as a register,
it makes little sense to optimise a FrameIndex into the load, only to have to
recover that later on. This adds a SelectTAddrModeImm7 which does not do that
folding, and uses it for narrowing load/store patterns.

Differential Revision: https://reviews.llvm.org/D67489


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372134 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 15:32:28 +00:00
David Green
f6b2271403 [ARM] Fixup pipeline test. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372133 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 15:25:24 +00:00
David Green
c399770b5c [ARM] Reserve an emergency spill slot for fp16 addressing modes that need it
Similar to D67327, but this time for the FP16 VLDR and VSTR instructions that
use the AddrMode5FP16 addressing mode. We need to reserve an emergency spill
slot for instructions that will be out of range to use sp directly.
AddrMode5FP16 is 8 bits with a scale of 2.

Differential Revision: https://reviews.llvm.org/D67483


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372132 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 15:23:09 +00:00
Sam Parker
f4c9416b67 [ARM] Fix for buildbots
Remove setPreservesCFG from ARMConstantIslandPass and add a couple
of -verify-machine-dom-info instances into the existing codegen
tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372126 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 14:21:36 +00:00
Sam Parker
ec6d4897ce [ARM] Fix for buildbots
Add --verifymachineinstrs and update the remaining low overhead loop
tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372121 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 13:46:26 +00:00
David Green
59654524ce [ARM] Fix for MVE load/store stack accesses
MVE loads and stores have a 7 bit immediate range, scaled by the length of the type. This needs to be taught to the stack estimation code to ensure that an emergency spill slot is reserved in case we run out of registers when materialising stack indices.

Also the narrowing loads/stores can be created with frame indices even though they do not accept SP as a register. We need in those cases to make sure we have an emergency register to use as the frame base, as SP can never be used.

Differential Revision: https://reviews.llvm.org/D67327


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372114 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 12:58:51 +00:00
Sam Parker
81c57d04ce [ARM][LowOverheadLoops] Add LR def safety check
Converting the *LoopStart pseudo instructions into DLS/WLS results in
LR being defined. These instructions were inserted on the assumption
that LR would already contain the loop counter because a mov is
introduced during ISel as the the consumers in the loop can only use
LR. That assumption proved wrong!

So perform a safety check, finding an appropriate place to insert the
DLS/WLS instructions or revert if this isn't possible.

Differential Revision: https://reviews.llvm.org/D67539

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372111 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 12:19:32 +00:00
Luis Marques
85abf858b2 [RISCV] Switch to the Machine Scheduler
Most of the test changes are trivial instruction reorderings and differing
register allocations, without any obvious performance impact.

Differential Revision: https://reviews.llvm.org/D66973

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372106 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 11:15:35 +00:00
Luis Marques
3049bc0fef Revert Patch from Phabricator
This reverts r372092 (git commit e38695a0255c9e7b53639f349f8101bae1ce5c04)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372104 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 10:52:09 +00:00
Luis Marques
955b80686a Patch from Phabricator
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372092 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:43:08 +00:00
David Bolvansky
077e4d5d58 [SimplifyLibCalls] Mark known arguments with nonnull
Reviewers: efriedma, jdoerfert

Reviewed By: jdoerfert

Subscribers: ychen, rsmith, joerg, aaron.ballman, lebedev.ri, uenoku, jdoerfert, hfinkel, javed.absar, spatel, dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D53342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372091 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:32:52 +00:00
Alexander Timofeev
7ae04842e0 [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed
Defferential Revision: https://reviews.llvm.org/D67101

Reviewers: rampitec, vpykhtin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372086 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:08:58 +00:00
Sam Parker
1bff644c7f [ARM] LE support in ConstantIslands
The low-overhead branch extension provides a loop-end 'LE' instruction
that performs no decrement nor compare, it just jumps backwards. This
patch modifies the constant islands pass to try to insert LE
instructions in place of a Thumb2 conditional branch, instead of
shrinking it. This only happens if a cmp can be converted to a cbn/z
and used to exit the loop.

Differential Revision: https://reviews.llvm.org/D67404

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372085 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:08:05 +00:00
Craig Topper
e22e82f8d9 [X86] Split oversized vXi1 vector arguments and return values into scalars on avx512 targets.
Previously we tried to split them into narrower v64i1 or v16i1
pieces that each got promoted to vXi8 and then passed in a zmm
or xmm register. But this crashes when you need to pass more
pieces than available registers reserved for argument passing.

The scalarizing done here generates much longer and slower code,
but is consistent with the behavior of avx2 and earlier targets
for these types.

Fixes PR43323.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372069 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 04:41:14 +00:00
Craig Topper
61044921b4 [X86] Allow masked VBROADCAST instructions to be turned into BLENDM with a broadcast load to avoid a copy.
The BLENDM instructions allow an 2 sources and an independent
destination while masked VBROADCAST has the destination tied
to the source.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372068 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 04:41:10 +00:00
Craig Topper
878e3df175 [X86] Add support for commuting EVEX VCMP instructons with any immediate value.
Previously we limited to the EQ/NE/TRUE/FALSE/ORD/UNORD immediates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372067 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 04:41:05 +00:00
Craig Topper
76e8852a16 [X86] Add test case for missed opportunity to commute a VCMP instruction after unfolding one load in order to fold another load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372066 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 04:41:01 +00:00
Craig Topper
66c8578805 [X86] Enable commuting of EVEX VCMP for all immediate values during isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372065 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 04:40:58 +00:00
Amara Emerson
e5fe899c4b [GlobalISel] Partially revert r371901.
r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs, for now just undo the problematic parts of the change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372050 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 23:46:03 +00:00
Lei Huang
ab2855edc9 [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
This is a follow up patch from https://reviews.llvm.org/D57857 to handle
extract_subvector v4f32.  For cases where we fpext of v2f32 to v2f64 from
extract_subvector we currently generate on P9 the following:

  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)

This patch custom lower it to the following sequence:

  lxv 0, 0(3)       # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0   # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0   # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2     # FP-extend to <d0, d1>
  xvcvspdp 3, 3     # FP-extend to <d2, d3>
  stxv 2, 0(5)      # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)      # Store <d2, d3> (%vecinit4)

Differential Revision: https://reviews.llvm.org/D61961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372029 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 20:04:15 +00:00
Roman Lebedev
ae054dc434 [ARM][Codegen] Autogenerate arm-cgp-casts.ll test.
Apparently it got broken by r372009 while i thought it was r372012.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372019 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 18:28:22 +00:00
Simon Pilgrim
a8399fa299 [X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372013 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 17:30:33 +00:00
David Green
3be9fc5a0d [ARM] A predicate cast of a predicate cast is a predicate cast
The adds some very basic folding of PREDICATE_CASTS, removing cases when they
are chained together. These would already be removed eventually, as these are
lowered to copies. This just allows it to happen earlier, which can help other
simplifications.

Differential Revision: https://reviews.llvm.org/D67591


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372012 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 17:29:07 +00:00
Oliver Cruickshank
4e7f7eff1b [ARM] Add patterns for BSWAP intrinsic on MVE
BSWAP can use the VREV instruction on MVE to produce better results than
expanding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372002 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:20:10 +00:00
Oliver Cruickshank
c3d8b899f0 [ARM] Add patterns for bitreverse intrinsic on MVE
BITREVERSE can use the VBRSR which will reverse and right shift.
Shifting right by 0 will just reverse the bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372001 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:20:03 +00:00
Oliver Cruickshank
2498183766 [ARM] Lower CTTZ on MVE
Lower CTTZ on MVE using VBRSR and VCLS which will reverse the bits and
count the leading zeros, equivalent to a count trailing zeros (CTTZ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372000 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:19:56 +00:00
Oliver Cruickshank
e68996a611 [ARM] Add patterns for CTLZ on MVE
CTLZ intrinsic can use the VCLS instruction on MVE, which produces
better results than expanding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371999 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:19:49 +00:00
Matt Arsenault
fef53526a6 AMDGPU/GlobalISel: Fix some broken run lines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371992 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:14:40 +00:00
Matt Arsenault
8f86794aa2 AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371991 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:14:37 +00:00
Matt Arsenault
4fb941a716 AMDGPU/GlobalISel: Remove another illegal select test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371990 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:14:31 +00:00
David Green
437310b30f [ARM] Fold VCMP into VPT
MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.

There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.

Differential Revision: https://reviews.llvm.org/D66577


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371982 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 13:02:41 +00:00
Kerry McLaughlin
48d00babac [SVE][Inline-Asm] Add constraints for SVE predicate registers
Summary:
Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin

Reviewed By: rovka

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371967 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 09:45:27 +00:00
Sjoerd Meijer
71dc0b23c7 [AArch64] Some more FP16 FMA pattern matching
After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we
were still missing a few FP16 FMA patterns.

Differential Revision: https://reviews.llvm.org/D67576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371960 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 07:32:13 +00:00
Matt Arsenault
7511280318 AMDGPU/GlobalISel: Remove illegal select tests
These fail in a release build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371955 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 04:21:10 +00:00
Matt Arsenault
4cfc531c99 AMDGPU/GlobalISel: Select SMRD loads for more types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371954 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:54:07 +00:00
Matt Arsenault
796815a826 AMDGPU/GlobalISel: RegBankSelect for kill
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371953 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:48:37 +00:00
Matt Arsenault
54dea4c5ee AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371952 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:37:10 +00:00
Matt Arsenault
8917673a2f AMDGPU/GlobalISel: Set type on vgpr live in special arguments
Fixes assertion with workitem ID intrinsics used in non-kernel
functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371951 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:33:00 +00:00
Matt Arsenault
83c97ac441 AMDGPU/GlobalISel: Select S16->S32 fptoint
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371950 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:32:56 +00:00
Matt Arsenault
cadee98a6c AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371949 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:29:12 +00:00
Matt Arsenault
832d1e1170 AMDGPU/GlobalISel: Fix VALU s16 fneg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371948 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:20:54 +00:00
Jinsong Ji
137a26a97a [PowerPC][NFC] Add a testcase for fdiv expansion.
Pre-commit for following patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371938 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 20:02:25 +00:00
David Green
f343dcd487 [ARM] Masked loads and stores
Masked loads and store fit naturally with MVE, the instructions being easily
predicated. This adds lowering for the simple cases of masked loads and stores.
It does not yet deal with widening/narrowing or pre/post inc, and so is
currently behind an option.

The llvm masked load intrinsic will accept a "passthru" value, dictating the
values used for the zero masked lanes. In MVE the instructions write 0 to the
zero predicated lanes, so we need to match a passthru that isn't 0 (or undef)
with a select instruction to pull in the correct data after the load.

Differential Revision: https://reviews.llvm.org/D67186


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371932 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 14:14:47 +00:00
David Green
2d52f8b93f [ARM] Simplify and update vmla test. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371930 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 11:53:05 +00:00
Simon Pilgrim
88ae6d603d [TargetLowering] SimplifyDemandedBits - add EXTRACT_SUBVECTOR support.
Call SimplifyDemandedBits on the source vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371923 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-14 16:38:26 +00:00