Commit Graph

3778 Commits

Author SHA1 Message Date
Sanjay Patel
e5a55cebc1 [DAGCombiner] narrow shuffle of concatenated vectors
// shuffle (concat X, undef), (concat Y, undef), Mask -->
// concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1)

The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements.

The x86 changes look neutral or better. There's one test with an
extra instruction, but that could be reversed for a subtarget with
the right attributes. But by default, we want to avoid the 256-bit
op when possible (in my motivating benchmark, a handful of ymm ops
sprinkled into a sequence of xmm ops are triggering frequency
throttling on Haswell resulting in significantly worse perf).

Differential Revision: https://reviews.llvm.org/D60545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358291 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 16:31:56 +00:00
David Green
45a375eb6b Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg
Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not
seeing through to the constant in other blocks. Revert this patch while we come
up with a better way to handle that.

I will try to follow this up with some better tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358113 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 18:00:41 +00:00
Diogo N. Sampaio
ba3f83af6f [ARM] [FIX] Add missing f16 vector operations lowering
Summary:
Add missing <8xhalf> shufflevectors pattern, when using concat_vector dag node.
As well, allows <8xhalf> and <4xhalf> vldup1 operations.

These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics.

Reviewers: olista01, pbarrio, LukeGeeson, efriedma

Reviewed By: efriedma

Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60319



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358081 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 13:28:06 +00:00
Diana Picus
8c8976533c [ARM GlobalISel] Select G_FCONSTANT for VFP3
Make it possible to TableGen code for FCONSTS and FCONSTD.

We need to make two changes to the TableGen descriptions of vfp_f32imm
and vfp_f64imm respectively:
* add GISelPredicateCode to check that the immediate fits in 8 bits;
* extract the SDNodeXForms into separate definitions and create a
GISDNodeXFormEquiv and a custom renderer function for each of them.

There's a lot of boilerplate to get the actual value of the immediate,
but it basically just boils down to calling ARM_AM::getFP32Imm or
ARM_AM::getFP64Imm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358063 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 09:14:32 +00:00
Diana Picus
d8ccf753e3 [ARM GlobalISel] Select G_FCONSTANT into pools
Put all floating point constants into constant pools and load their
values from there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358062 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 09:14:24 +00:00
Diana Picus
735eff80dd [ARM GlobalISel] Map G_FCONSTANT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358061 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 09:14:16 +00:00
Amara Emerson
e5bb56cb3c [GlobalISel][AArch64] Allow CallLowering to handle types which are normally
required to be passed as different register types. E.g. <2 x i16> may need to
be passed as a larger <2 x i32> type, so formal arg lowering needs to be able
truncate it back. Likewise, when dealing with returns of these types, they need
to be widened in the appropriate way back.

Differential Revision: https://reviews.llvm.org/D60425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358032 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-09 21:22:33 +00:00
Simon Pilgrim
4cd136da59 [SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC
Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).

This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef........

Differential Revision: https://reviews.llvm.org/D60006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357765 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 14:56:21 +00:00
Piotr Sobczak
959b42493f [SelectionDAG] Compute known bits of CopyFromReg
Summary:
Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if
the virtual reg used has one def only.

This can be particularly useful when calling isBaseWithConstantOffset()
with the ISD::CopyFromReg argument, as more optimizations may get enabled
in the result.

Also add a missing truncation on X86, found by testing of this patch.

Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa

Reviewers: bogner, craig.topper, RKSimon

Reviewed By: RKSimon

Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59535

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357745 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 07:44:09 +00:00
Diana Picus
80733cc0e9 [ARM GlobalISel] Support DBG_VALUE
Make sure we can map and select DBG_VALUE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357681 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-04 10:24:51 +00:00
David L. Jones
a8e2ae4a63 Revert r357452 - 'SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)'
This revision causes tests to fail under ASAN. Since the cause of the failures
is not clear (could be ASAN, could be a Clang bug, could be a bug in this
revision), the safest course of action seems to be to revert while investigating.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357667 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-04 02:27:57 +00:00
Sanjay Patel
e0e9078d54 [DAGCombiner] loosen restrictions for moving shuffles after vector binop
There are 3 changes to make this correspond to the same transform in instcombine:
1. Remove the legality check - we can't create anything less legal than we started with.
2. Ease the use restriction, so we only bail out if both operands have >1 use.
3. Ease the use restriction for binops with a repeated operand (eg, mul x, x).

As discussed in D60150, there's a scalarization opportunity that will be made
easier by allowing this transform more generally.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357580 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-03 13:42:06 +00:00
Hans Wennborg
8abcd76492 SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)
The code was previously checking that candidates for sinking had exactly
one use or were a store instruction (which can't have uses). This meant
we could sink call instructions only if they had a use.

That limitation seemed a bit arbitrary, so this patch changes it to
"instruction has zero or one use" which seems more natural and removes
the need to special-case stores.

Differential revision: https://reviews.llvm.org/D59936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357452 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-02 08:01:38 +00:00
Eli Friedman
95e6c7dcce [ARM] Optimize expressions like "return x != 0;" for Thumb1.
There's an existing optimization for x != C, but somehow it was missing
a special case for 0.

While I'm here, also cleaned up the code/comments a bit: the second
value produced by the MERGE_VALUES was actually dead, since a CMOV only
produces one result.

Differential Revision: https://reviews.llvm.org/D59616



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357437 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-02 00:01:23 +00:00
Eli Friedman
271788f596 [ARM] Don't try to create "push {r12, lr}" in Thumb1 at -Oz.
It's a little tricky to make this issue show up because
prologue/epilogue emission normally likes to push at least two
registers... but it doesn't when lr is force-spilled due to function
length.  Not sure if that really makes sense, but I decided not to touch
it for now.

Differential Revision: https://reviews.llvm.org/D59385



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357436 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-01 23:55:57 +00:00
Simon Pilgrim
7de2c6d90d [ARM] Regenerate execute-only float comparison tests
Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357293 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 18:21:19 +00:00
Nirav Dave
d3c5ebd041 [DAGCombine] Prune unnused nodes.
Summary:
Nodes that have no uses are eventually pruned when they are selected
from the worklist. Record nodes newly added to the worklist or DAG and
perform pruning after every combine attempt.

Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight

Reviewed By: jyknight

Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357283 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 17:35:56 +00:00
Simon Pilgrim
2a7633ae4d [ARM] Regenerate vector comparison tests
Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357281 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 17:35:11 +00:00
Diana Picus
5dc7f24f6a [ARM GlobalISel] Run regbankselect test for Thumb. NFCI
This should just work, since ARM mode and Thumb2 mode are at the same
level of support now and should map the same to GPR and FPR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357159 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-28 10:57:29 +00:00
Diana Picus
b94dc88e01 [ARM GlobalISel] Fix G_STORE with s1
G_STORE for 1-bit values uses a STRBi12, which stores the whole byte.
Zero out the undefined bits before writing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357154 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-28 09:09:36 +00:00
Diana Picus
51a43ae6d5 [ARM GlobalISel] Fix selection of G_SELECT
G_SELECT uses a 1-bit scalar for the condition, and is currently
implemented with a plain CMPri against 0. This means that values such as
0x1110 are interpreted as true, when instead the higher bits should be
treated as undefined and therefore ignored. Replace the CMPri with a
TSTri against 0x1, which performs an implicit AND, yielding the expected
result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357153 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-28 09:09:27 +00:00
Nirav Dave
b4adfc21eb Revert r356996 "[DAG] Avoid smart constructor-based dangling nodes."
This patch appears to trigger very large compile time increases in
halide builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357116 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 19:54:41 +00:00
Eli Friedman
953adb2fe4 [ARM] Don't confuse the scheduler for very large VLDMDIA etc.
ARMBaseInstrInfo::getNumLDMAddresses is making bad assumptions about the
memory operands of load and store-multiple operations.  This doesn't
really fix the problem properly, but it's enough to prevent crashing,
at least.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41231 .

Differential Revision: https://reviews.llvm.org/D59834



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357109 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 18:33:30 +00:00
Nirav Dave
de6ac6d211 [DAG] Avoid smart constructor-based dangling nodes.
Various SelectionDAG non-combine operations (e.g. the getNode smart
constructor and legalization) may leave dangling nodes by applying
optimizations or not fully pruning unused result values. This can
result in nodes that are never added to the worklist and therefore can
not be pruned.

Add a node inserter as the current node deleter to make sure such
nodes have the chance of being pruned.

Many minor changes, mostly positive.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356996 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-26 15:08:14 +00:00
Diana Picus
fa60fe0a8f [ARM GlobalISel] 64-bit memops should be aligned
We currently use only VLDR/VSTR for all 64-bit loads/stores, so the
memory operands must be word-aligned. Mark aligned operations as legal
and narrow non-aligned ones to 32 bits.

While we're here, also mark non-power-of-2 loads/stores as unsupported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356872 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 08:54:29 +00:00
Eli Friedman
925f59138d [ARM] Don't form "ands" when it isn't scheduled correctly.
In r322972/r323136, the iteration here was changed to catch cases at the
beginning of a basic block... but we accidentally deleted an important
safety check.  Restore that check to the way it was.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41116

Differential Revision: https://reviews.llvm.org/D59680



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356809 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-22 20:49:15 +00:00
Evandro Menezes
f00c21b122 [AArch64, ARM] Add support for Exynos M5
Add Exynos M5 support and test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356793 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-22 18:42:14 +00:00
Eli Friedman
283e227ba6 [ARM] Eliminate redundant "mov rN, sp" instructions in Thumb1.
This takes sequences like "mov r4, sp; str r0, [r4]", and optimizes them
to something like "str r0, [sp]".

For regular stack variables, this optimization was already implemented:
we lower loads and stores using frame indexes, which are expanded later.
However, when constructing a call frame for a call with more than four
arguments, the existing optimization doesn't apply.  We need to use
stores which are actually relative to the current value of sp, and don't
have an associated frame index.

This patch adds a special case to handle that construct.  At the DAG
level, this is an ISD::STORE where the address is a CopyFromReg from SP
(plus a small constant offset).

This applies only to Thumb1: in Thumb2 or ARM mode, a regular store
instruction can access SP directly, so the COPY gets eliminated by
existing code.

The change to ARMDAGToDAGISel::SelectThumbAddrModeSP is a related
cleanup: we shouldn't pretend that it can select anything other than
frame indexes.

Differential Revision: https://reviews.llvm.org/D59568



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356601 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 19:40:45 +00:00
Matt Arsenault
51c2ad77cd RegAllocFast: Remove early selection loop, the spill calculation will report cost 0 anyway for free regs
The 2nd loop calculates spill costs but reports free registers as cost
0 anyway, so there is little benefit from having a separate early
loop.

Surprisingly this is not NFC, as many register are marked regDisabled
so the first loop often picks up later registers unnecessarily instead
of the first one available in the allocation order...

Patch by Matthias Braun

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356499 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 19:01:34 +00:00
Eli Friedman
70a59574dd [ARM] Add MachineVerifier logic for some Thumb1 instructions.
tMOVr and tPUSH/tPOP/tPOP_RET have register constraints which can't be
expressed in TableGen, so check them explicitly. I've unfortunately run
into issues with both of these recently; hopefully this saves some time
for someone else in the future.

Differential Revision: https://reviews.llvm.org/D59383



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356303 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-15 21:44:49 +00:00
Sam Parker
7eaac1c18c [ARM] Remove EarlyCSE from backend
There is an issue with early CSE hitting an assert, so temporarily
remove the pass from the Arm backend.
    
Bug: https://bugs.llvm.org/show_bug.cgi?id=41081

Differential Revision: https://reviews.llvm.org/D59410


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356259 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-15 13:36:37 +00:00
Simon Pilgrim
13f8e3c482 [ARM] Remove icmp undef from reduced tests
Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC)

Approved by @efriedma (Eli Friedman)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356252 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-15 11:14:59 +00:00
Sam Parker
42dcf56122 [ARM][ParallelDSP] Disable for big-endian
Bail early when we don't have a preheader and also if the target is
big endian because it's written with only little endian in mind!

Differential Revision: https://reviews.llvm.org/D59368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356243 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-15 10:19:32 +00:00
Sam Parker
acbed856d2 [NFC][ARM] Update test
Change some regex to handle commutable instructions. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356159 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-14 15:36:54 +00:00
Matt Arsenault
496c1dd07c ARM: Add ImmArg to intrinsics
I found these by asserting in clang for any GCCBuiltin that doesn't
require mangling and requires a constant for the builtin. This means
that intrinsics are missing which don't use GCCBuiltin, don't have
builtins defined in clang, or were missing the constant annotation in
the builtin definition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356144 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-14 13:46:14 +00:00
Sam Parker
571398105e [ARM][ParallelDSP] Enable multiple uses of loads
When choosing whether a pair of loads can be combined into a single
wide load, we check that the load only has a sext user and that sext
also only has one user. But this can prevent the transformation in
the cases when parallel macs use the same loaded data multiple times.
    
To enable this, we need to fix up any other uses after creating the
wide load: generating a trunc and a shift + trunc pair to recreate
the narrow values. We also need to keep a record of which loads have
already been widened.

Differential Revision: https://reviews.llvm.org/D59215


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356132 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-14 11:14:13 +00:00
Sam Parker
01f20a4ee2 [ARM] Run ARMParallelDSP in the IRPasses phase
Run EarlyCSE before ParallelDSP and do this in the backend IR opt
phase.

Differential Revision: https://reviews.llvm.org/D59257


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356130 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-14 10:57:40 +00:00
Nirav Dave
154874adc5 [DAGCombiner] If a TokenFactor would be merged into its user, consider the user later.
Summary:
A number of optimizations are inhibited by single-use TokenFactors not
being merged into the TokenFactor using it. This makes we consider if
we can do the merge immediately.

Most tests changes here are due to the change in visitation causing
minor reorderings and associated reassociation of paired memory
operations.

CodeGen tests with non-reordering changes:

  X86/aligned-variadic.ll -- memory-based add folded into stored leaq
  value.

  X86/constant-combiners.ll -- Optimizes out overlap between stores.

  X86/pr40631_deadstore_elision -- folds constant byte store into
  preceding quad word constant store.

Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet

Reviewed By: courbet

Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59260

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356068 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-13 17:07:09 +00:00
Nikita Popov
f7a652c414 [SDAG] Expand pow2 mulo using shifts
Expand MULO with constant power of two operand into a shift. The
overflow is checked with (x << shift) >> shift == x, where the right
shift will be logical for umulo and arithmetic for smulo (with
exception for multiplications by signed_min).

Differential Revision: https://reviews.llvm.org/D59041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355937 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 16:57:25 +00:00
Sam Parker
88f0c562f5 [ARM][NFC] Delete original smlad tests
Because I don't understand svn.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355908 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 11:06:15 +00:00
Sam Parker
a7682f4fed [ARM][NFC] Move smlad tests
Created a test/CodeGen/ARM/ParallelDSP folder.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355907 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 11:01:11 +00:00
Nikita Popov
86f7633c4d [ARM] Use non-constant operand in umulo-32.ll; NFC
Currently the store+load is folded and both operands of the umulo
end up being constants. To avoid this getting folded away entirely,
make sure at least one operand is non-constant.

Also remove some allocas which don't seem relevant to the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355776 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 13:43:21 +00:00
Nikita Popov
79fd81910e [ARM] Generate test checks for umulo-32.ll; NFC
The second test case is going to be changed by D59041, so generate
full baseline checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355775 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 13:21:15 +00:00
David Green
8babc52a2b [LSR] Attempt to increase the accuracy of LSR's setup cost
In some loops, we end up generating loop induction variables that look like:
  {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
As opposed to the simpler:
  {(zext i16 (%i0 * %i1) to i32),+,-1}
i.e we count up from -limit to 0, not the simpler counting down from limit to
0. This is because the scores, as LSR calculates them, are the same and the
second is filtered in place of the first. We end up with a redundant SUB from 0
in the code.

This patch tries to make the calculation of the setup cost a little more
thoroughly, recursing into the scev members to better approximate the setup
required. The cost function for comparing LSR costs is:

return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
       std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                C2.ScaleCost, C2.ImmCost, C2.SetupCost);
So this will only alter results if none of the other variables turn out to be
different.

Differential Revision: https://reviews.llvm.org/D58770


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355597 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 13:44:40 +00:00
Oliver Stannard
3a22eb400f [ARM] Fix select_cc lowering for fp16
When lowering a select_cc node where the true and false values are of type f16,
we can't use a general conditional move because the FP16 instructions do not
support conditional execution. Instead, we must ensure that the condition code
is one of the four supported by the VSEL instruction.

Differential revision: https://reviews.llvm.org/D58813



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355385 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-05 10:42:34 +00:00
Oliver Stannard
b649874e8d [ARM] Fix selection of VLDR.16 instruction with imm offset
The isScaledConstantInRange function takes upper and lower bounds which are
checked after dividing by the scale, so the bounds checks for half, single and
double precision should all be the same. Previously, we had wrong bounds checks
for half precision, so selected an immediate the instructions can't actually
represent.

Differential revision: https://reviews.llvm.org/D58822



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355305 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-04 09:17:38 +00:00
Oliver Stannard
ed849dca81 [ARM] Fix FP16 stack loads/stores for Thumb2 with frame pointer
The new addressing mode added for the v8.2A FP16 instructions uses bit 8 of the
immediate to encode the sign of the offset, like the other FP loads/stores, so
need to be treated the same way.

Differential revision: https://reviews.llvm.org/D58816



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355201 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-01 14:20:28 +00:00
Oliver Stannard
c064077099 [ARM] Consider undefined-on-NaN conditions in checkVSELConstraints
This function was not checking for the condition code variants which are
undefined if either input is NaN, so we were missing selection of the VSEL
instruction in some cases when using -fno-honor-nans or -ffast-math.

Differential revision: https://reviews.llvm.org/D58812



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355199 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-01 13:58:25 +00:00
Diana Picus
de23a6b25f [ARM GlobalISel] Support G_CTLZ for Thumb2
Same as ARM mode but with different opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355191 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-01 10:12:28 +00:00
Diana Picus
6535445674 [ARM GlobalISel] Check target flags in test. NFCI
There was a time when we couldn't dump target-specific flags such as
arm-sbrel etc, so the tests didn't check for them. We can now be more
specific in our tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355189 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-01 10:01:22 +00:00