161 Commits

Author SHA1 Message Date
Sanjay Patel
7b05e5c94e [x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433)
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30433

There are a couple of open questions about the codegen:
1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
2. Should we have a pass to combine constants such as the inverted pair that we have here?

Differential Revision: https://reviews.llvm.org/D25165
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283119 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 16:38:27 +00:00
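Note on 7b05e5c94e: the "inverted pair" of constants raised in the second open question can be illustrated with a small sketch. It assumes the usual bitwise formulation of copysign on 32-bit floats (a sign-bit mask plus its complement). All helper names are hypothetical, and this is only a model of the idea, not the code the backend emits.

```python
import struct

# Assumption: IEEE-754 binary32 lanes. The two constants are bitwise inverses,
# which is the "inverted pair" the commit message refers to.
SIGN_MASK = 0x80000000
MAG_MASK = ~SIGN_MASK & 0xFFFFFFFF


def f32_bits(x: float) -> int:
    """Reinterpret a float as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def copysign_lane(mag: float, sgn: float) -> float:
    """Keep the magnitude bits of `mag` and take the sign bit from `sgn`."""
    return bits_f32((f32_bits(mag) & MAG_MASK) | (f32_bits(sgn) & SIGN_MASK))


def copysign_vec(mags, sgns):
    """Apply the same two-mask operation to every lane."""
    return [copysign_lane(m, s) for m, s in zip(mags, sgns)]
```

For example, `copysign_vec([1.0, -2.0], [-1.0, 1.0])` applies both masks to each lane. A vector lowering would load each mask as a splatted constant, which is what the first open question is about.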
Simon Pilgrim
065e5924cd [CostModel][X86] Added tests for current fptosi/fptoui costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283047 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 19:09:59 +00:00
Simon Pilgrim
1df457ad5e [CostModel][X86] Added fcopysign costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 16:41:52 +00:00
Simon Pilgrim
089e03e79a [CostModel][X86] Added fabs costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 16:30:13 +00:00
Simon Pilgrim
08a3649b8b [CostModel][X86] Added scalar float op costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281864 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-18 21:01:20 +00:00
Simon Pilgrim
fc26436d17 [CostModel][X86] Removed shift tests
There are more thorough tests found in vshift-*-cost.ll 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279406 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-21 19:56:02 +00:00
Simon Pilgrim
26eaf4f816 [CostModel][X86] Added costs for vXi16 and vXi8 vectors for add/sub/mul/and/or/xor tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279405 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-21 19:44:44 +00:00
Simon Pilgrim
799c5ebb1b [CostModel][X86] Replaced SSSE3 with SSE2 costs to create a better baseline
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279404 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-21 19:14:48 +00:00
Simon Pilgrim
4630834e7c [CostModel][X86] Added fsqrt and fma costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279403 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-21 19:06:25 +00:00
Simon Pilgrim
4597dd96a3 [CostModel][X86] Split off float arithmetic cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279402 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-21 18:34:47 +00:00
Simon Pilgrim
18333ab15e [CostModel][X86] Added sub, or, and, fadd and fsub costs and missing 512-bit mul costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279301 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-19 19:07:10 +00:00
Simon Pilgrim
07010a7197 [CostModel][X86] Added some AVX512 and 512-bit vector cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279291 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-19 18:24:10 +00:00
Simon Pilgrim
50fbf501e3 [CostModel][X86] Add fdiv + frem cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279283 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-19 17:39:00 +00:00
Michael Kuperstein
e788186982 [LV, X86] Be more optimistic about vectorizing shifts.
Shifts with a uniform but non-constant count were considered very expensive to
vectorize, because the splat of the uniform count and the shift would tend to
appear in different blocks. That made the splat invisible to ISel, and we'd
scalarize the shift at codegen time.

Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
are able to select the appropriate vector shifts. This updates the cost model to
take this into account by making shifts by a uniform count cheap again.

Differential Revision: https://reviews.llvm.org/D23049


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-04 22:48:03 +00:00
Simon Pilgrim
2bdfdf95e6 [X86] Dropped XOP ctbits checks - they match the AVX checks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277718 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-04 11:04:13 +00:00
Simon Pilgrim
d81c2d5aa5 [X86][SSE] Add initial costs for vector CTTZ/CTLZ
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277716 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-04 10:51:41 +00:00
Simon Pilgrim
002e4b5635 [X86][SSE] Add cost model values for CTPOP of vectors
This patch adds costs for the vectorized implementations of CTPOP. The default values seriously underestimated the cost of these and encouraged vectorization on targets where serialized use of POPCNT would be much better.

Differential Revision: https://reviews.llvm.org/D22456

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276104 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-20 10:41:28 +00:00
Simon Pilgrim
08df0eb04c [X86] Add CTPOP/CTLZ/CTTZ scalar cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275725 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-17 18:29:19 +00:00
Michael Kuperstein
b1fce5cc4c [X86] Make some cast costs more precise
Make some AVX and AVX512 cast costs more precise.
Based on part of a patch by Elena Demikhovsky (D15604).

Differential Revision: http://reviews.llvm.org/D22064


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275106 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 21:39:44 +00:00
Sanjay Patel
368e7e3ad1 [x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434)
This is "cvtdq2ps" which does not appear to be particularly slow on any CPU
according to Agner's tables. Choosing "5" as a cost here as suggested in:
https://llvm.org/bugs/show_bug.cgi?id=21356
...but it seems very conservative given that the instruction is fully pipelined,
and I think these costs are supposed to model throughput.

Note that related costs are also most likely too high, but this fixes PR21356
and partly fixes PR28434.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274658 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-06 19:15:54 +00:00
Michael Kuperstein
c7432f9ad3 [TTI] The cost model should not assume vector casts get completely scalarized
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast gets split, the resulting casts
end up legal and cheap.

Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.

Differential Revision: http://reviews.llvm.org/D21251


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274642 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-06 17:30:56 +00:00
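Note on c7432f9ad3: the costing rule described in that message (cost of the split halves plus a fudge factor for the split itself) can be sketched roughly as follows. The function names, the power-of-two lane assumption, and the "scalarized" comparison model are illustrative assumptions, not the TTI implementation.

```python
def cast_cost(num_elts, legal_elts, legal_cost, split_overhead=1):
    """Cost of casting a vector with `num_elts` lanes when at most
    `legal_elts` lanes are legal. Assumes power-of-two lane counts."""
    if num_elts <= legal_elts:
        # Already legal: use the cost the concrete target reports.
        return legal_cost
    # Split in two, cost each half recursively, and add the fudge factor.
    half = num_elts // 2
    return split_overhead + 2 * cast_cost(half, legal_elts, legal_cost, split_overhead)


def scalarized_cost(num_elts, scalar_cost=1, insert_extract_cost=1):
    """Hypothetical stand-in for the pessimistic assumption: every lane is
    extracted, cast as a scalar, and inserted back."""
    return num_elts * (scalar_cost + 2 * insert_extract_cost)
```

With these placeholder inputs, a 16-lane cast on a target whose legal width is 4 lanes costs far less under the split model than under full scalarization. Passing `split_overhead=0` models a target where inserts and extracts are treated as free, as the message says of AMDGPU.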
Nemanja Ivanovic
ba988eb430 [PowerPC] - Legalize vector types by widening instead of integer promotion
This patch corresponds to review:
http://reviews.llvm.org/D20443

It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 4, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch yields minor to major performance improvements on most
benchmarks. Very few benchmarks regress. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274535 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-05 09:22:29 +00:00
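Note on ba988eb430: the difference between the two legalization strategies can be shown on type shapes alone. This sketch assumes a 128-bit vector register and describes a type as a `(lane count, lane width in bits)` pair. The function names are hypothetical and the model ignores everything except the resulting type.

```python
def promote_elements(num_elts, elt_bits, vec_bits=128):
    """Integer promotion: keep the lane count and grow each lane until
    the vector fills the register."""
    return num_elts, vec_bits // num_elts


def widen_vector(num_elts, elt_bits, vec_bits=128):
    """Widening: keep the lane width and pad with extra lanes until
    the vector fills the register."""
    return vec_bits // elt_bits, elt_bits


def is_widening_candidate(elt_bits):
    """The commit restricts widening to lane widths that are a whole
    number of bytes."""
    return elt_bits % 8 == 0
```

Under these assumptions a `v2i8` value becomes `v2i64` when promoted but `v16i8` when widened. Widening leaves the original lanes in place, whereas promotion needs the break-apart, extend, and rebuild sequence the message calls expensive.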
Artur Pilipenko
48917c9e44 Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the r263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 18:27:25 +00:00
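Note on 48917c9e44: making the pointer type a second overloaded type means the address space becomes part of the intrinsic's mangled name. The sketch below assumes the typed-pointer suffix scheme `p<addrspace><pointee>`. The helper names are hypothetical and the code only builds name strings; it does not call any LLVM API.

```python
def mangle_vector(num_elts, elt):
    """Suffix for a vector type, e.g. 4 x i32 -> 'v4i32'."""
    return f"v{num_elts}{elt}"


def mangle_pointer(addrspace, pointee_suffix):
    """Suffix for a pointer type (assumed scheme): 'p' + address space + pointee."""
    return f"p{addrspace}{pointee_suffix}"


def masked_name_old(op, num_elts, elt):
    """Before the change: only the data type is overloaded, so the pointer
    argument is implicitly in the default address space."""
    return f"llvm.masked.{op}.{mangle_vector(num_elts, elt)}"


def masked_name_new(op, num_elts, elt, addrspace=0):
    """After the change: the pointer type is overloaded as well."""
    vec = mangle_vector(num_elts, elt)
    return f"llvm.masked.{op}.{vec}.{mangle_pointer(addrspace, vec)}"
```

With the pointer in the name, a load from address space 1 gets its own declaration instead of being forced through the default-addrspace signature. The changed names are also why tests expecting the old signatures had to be updated.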
Artur Pilipenko
be0da39a48 Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273895 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:54:33 +00:00
Artur Pilipenko
9227558e8e Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the r263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273892 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:29:26 +00:00
Michael Kuperstein
4e93d1c1e6 [X86] Make arithmetic operations cost model test saner. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273316 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-21 20:41:40 +00:00
Simon Pilgrim
06026c4ca3 [X86][SSE] Add cost model for BSWAP of vectors
The BSWAP of vector types is implemented quite efficiently using vector shuffles on SSE/AVX targets; we should reflect the typical cost of this to encourage vectorization.

Differential Revision: http://reviews.llvm.org/D21521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273217 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 23:08:21 +00:00
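Note on 06026c4ca3: the reason a vector BSWAP is cheap is that reversing the bytes of every element is a single fixed byte permutation. The following sketch models that with a plain byte shuffle over a 16-byte vector. The function names are hypothetical and no particular instruction is implied.

```python
def bswap_shuffle_mask(elt_bytes, vec_bytes=16):
    """Byte indices that reverse the bytes inside each element."""
    return [
        base + (elt_bytes - 1 - i)
        for base in range(0, vec_bytes, elt_bytes)
        for i in range(elt_bytes)
    ]


def byte_shuffle(src, mask):
    """Generic byte permutation: result[i] = src[mask[i]]."""
    return bytes(src[i] for i in mask)


def bswap_vector(src, elt_bytes):
    """Byte-swap every element of `src` with one shuffle."""
    return byte_shuffle(src, bswap_shuffle_mask(elt_bytes, len(src)))
```

The mask depends only on the element size, so it is a compile-time constant and the whole operation stays at one shuffle no matter how many lanes the vector holds.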
Simon Pilgrim
07ef6bb26b [CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets
To account for the fast PSHUFB implementation now available

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272484 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 19:23:02 +00:00
Michael Kuperstein
e47deb88ee [X86] Add costs for SSE zext/sext to v4i64 to TTI
The costs are somewhat hand-wavy, but should be much closer to the truth
than what we get from BasicTTI.

Differential Revision: http://reviews.llvm.org/D21156


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272406 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 17:01:05 +00:00
Simon Pilgrim
611c8e7533 [CostModel][X86][XOP] Added XOP costmodel for BITREVERSE
Now that we have a nice fast VPPERM solution. Added framework for future intrinsic costs as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270537 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 08:17:50 +00:00
Simon Pilgrim
7537c45fbd [CostModel][X86] Tidied up checks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269770 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-17 14:43:41 +00:00
Simon Pilgrim
78ba7287ef [CostModel][X86] Added scalar bitreverse tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269594 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-15 17:40:48 +00:00
Simon Pilgrim
15a59473b3 [X86][SSE] Improve cost model for i64 vector comparisons on pre-SSE42 targets
As discussed on PR24888, until SSE42 we don't have access to PCMPGTQ for v2i64 comparisons, but the cost models don't reflect this, resulting in over-optimistic vectorization.

This patch adds SSE2 'base level' costs that match what a typical target is capable of and only reduces the v2i64 costs at SSE42.

Technically SSE41 provides a PCMPEQQ v2i64 equality test, but getCmpSelInstrCost doesn't give us a way to discriminate between comparison types, so we can't easily make use of this. Otherwise we could split the cost of integer equality and greater-than tests to give better costings of each.

Differential Revision: http://reviews.llvm.org/D20057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268972 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 21:14:38 +00:00
Simon Pilgrim
ba60f16656 [CostModel][X86] Extended comparison instruction cost model tests to include SSE2/SSE3/SSSE3/SSE41/SSE42 targets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268877 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-08 15:24:53 +00:00
Simon Pilgrim
a3ea594334 [CostModel][X86] Split BSWAP/BITREVERSE cost tests from CTPOP/CTLZ/CTTZ 'bit count' cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268859 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-07 16:34:16 +00:00
Simon Pilgrim
c67a48457d [CostModel][X86] Tweak 'SSE2-only' test CPU as it was only disabling SSE41 not SSE3/SSSE3 etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268763 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 17:50:07 +00:00
Simon Pilgrim
556c69a0f5 [CostModel][X86] Added ctlz/cttz undef-zero costmodel tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268761 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 17:48:35 +00:00
Simon Pilgrim
179257b158 [CostModel][X86] Added costmodel tests for vector ctpop/ctlz/cttz/bitreverse/bswap
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268738 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 14:38:14 +00:00
Ashutosh Nema
91d7ac06b6 [X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode.
Summary:
rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
operations during DAG combine. This generates better code for truncate, so the
cost of truncate needs to change, but it appears to have been changed only in
the SSE2 table. The change also applies to SSE4.1, so the cost of truncate needs
to be changed there as well. The costs of “TRUNCATE v16i32 to v16i8” and
“TRUNCATE v16i16 to v16i8” should be the same in the SSE4.1 and SSE2 tables.
This removes their costs from the SSE4.1 table, so they fall back to SSE2.

Reviewers: Simon Pilgrim


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267123 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 08:34:05 +00:00
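Note on 91d7ac06b6: the "fall back to SSE2" behaviour relies on the cost tables being consulted from the newest supported feature level down to the oldest. A minimal sketch of that lookup order follows. The table layout, the names, and every cost value are placeholders chosen for illustration, not the real X86 numbers.

```python
# Feature levels ordered oldest first (only the two discussed in the commit).
FEATURE_ORDER = ["sse2", "sse41"]


def lookup_cost(tables, target_level, key):
    """Return the first matching entry, searching from the target's own
    level down to older levels. None means no table has an entry."""
    newest = FEATURE_ORDER.index(target_level)
    for level in reversed(FEATURE_ORDER[: newest + 1]):
        cost = tables.get(level, {}).get(key)
        if cost is not None:
            return cost
    return None


KEY = ("trunc", "v16i32", "v16i8")

# Placeholder costs: a stale high entry in the SSE4.1 table shadows the
# updated SSE2 entry until it is removed.
TABLES_BEFORE = {"sse2": {KEY: 10}, "sse41": {KEY: 30}}
TABLES_AFTER = {"sse2": {KEY: 10}, "sse41": {}}
```

With `TABLES_BEFORE`, an SSE4.1 target still sees the stale entry. With `TABLES_AFTER`, the same query falls through to the SSE2 table, so deleting the entry is enough and no new number has to be duplicated.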
Adam Nemet
cf0a711bff Revert "Support arbitrary addrspace pointers in masked load/store intrinsics"
This reverts commit r266086.

It breaks the LTO build of gcc in SPEC2000.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266282 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 08:47:17 +00:00
Artur Pilipenko
80ce67004b Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the r263158 change.

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266086 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 15:58:04 +00:00
Benjamin Kramer
638cd0356d [TTI] Let the cost model estimate ctpop costs based on legality
PPC has a vector popcount, this lets the vectorizer use the correct cost
for it. Tweak X86 test to use an intrinsic that's actually scalarized (we
have a somewhat efficient lowering for vector popcount using SSE, the
cost model finds that now).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265005 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-31 10:42:40 +00:00
Matt Arsenault
bb366db643 AMDGPU: Cost model for basic integer operations
This resolves bug 21148 by preventing promotion to
i64 induction variables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264376 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 01:16:40 +00:00
Matt Arsenault
359a7d918e AMDGPU: Partially implement getArithmeticInstrCost for FP ops
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264374 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 01:00:32 +00:00
Matt Arsenault
e4e369ab90 TTI: Report 0 cost for free addrspacecasts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264369 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 00:26:29 +00:00
Matt Arsenault
93e0b28a0e TTI: Use 0 for cost of fabs if free
Ideally this would also happen for fneg, but that
isn't a distinct operation in the IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264368 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 00:26:22 +00:00
Matt Arsenault
42792ff39d AMDGPU: TTI: Make insertelement free.
We don't want to have a cost to scalarizing operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264364 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 00:14:11 +00:00
Matthias Braun
a31e891389 Revert "Support arbitrary addrspace pointers in masked load/store intrinsics"
This commit broke LTO builds. Reverting it to unbreak the bots while the
issue is investigated. See also:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160321/341002.html

This reverts r263158

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264088 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-22 20:24:34 +00:00
Artur Pilipenko
980df33d17 Support arbitrary addrspace pointers in masked load/store intrinsics
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263158 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 20:39:22 +00:00
Matthew Simpson
449a562ef6 [AArch64] Reduce vector insert/extract cost for Kryo
Differential Revision: http://reviews.llvm.org/D17379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261237 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-18 18:35:45 +00:00