Commit Graph

268 Commits

Author SHA1 Message Date
Guozhi Wei
19969b8b8f [TargetTransformInfo] Add a new public interface getInstructionCost
Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface:

  enum TargetCostKind {
    TCK_RecipThroughput, ///< Reciprocal throughput.
    TCK_Latency,         ///< The latency of instruction.
    TCK_CodeSize         ///< Instruction code size.
  };

  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;

All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model.

This patch also provides a simple default implementation of getInstructionLatency.

The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways:

   Add more detail into this function.
   Add getXXXLatency function and call it from here.
   Implement target specific getInstructionLatency function.

Differential Revision: https://reviews.llvm.org/D37170



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 22:29:17 +00:00
Zvi Rackover
8a2fcfe5be X86: Improve AVX512 fptoui lowering
Summary:
Add patterns for
  fptoui <16 x float> to <16 x i8>
  fptoui <16 x float> to <16 x i16>

Reviewers: igorb, delena, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312704 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 07:40:34 +00:00
Matt Arsenault
c3f95e0648 AMDGPU: Don't assert in TTI with fp32 denorms enabled
Also refine for f16 and rcp cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312213 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 05:47:00 +00:00
Simon Pilgrim
b148872e50 [CostModel][X86][XOP] Improve costs for XOP shuffles
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311005 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 13:50:20 +00:00
Simon Pilgrim
6f0ee4dc78 [CostModel][X86] Add SSE2 two-src shuffle costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310654 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 19:32:35 +00:00
Simon Pilgrim
9a17eb199c [CostModel][X86] Add avx1 two-src shuffle costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310650 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 19:02:51 +00:00
Simon Pilgrim
572897038a [CostModel][X86] Add avx2 two-src shuffle costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310645 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 18:29:34 +00:00
Simon Pilgrim
93a8a3f5b3 [CostModel][X86] Extend two src shuffle cost tests
Cover most 128/256/512/1024-bit cases for vXf64/vXi64, vXf32/vXi32, vXi16 + vXi8


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310641 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 18:02:45 +00:00
Simon Pilgrim
93a9cc077c [CostModel][X86] Add avx512vbmi broadcast/reverse/single-src shuffle cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 17:33:25 +00:00
Simon Pilgrim
b0b94e1117 [CostModel][X86] Improve single src shuffle costs
Add missing SK_PermuteSingleSrc costs for AVX2 targets and earlier, also added some of the simpler SK_PermuteTwoSrc costs to support splitting of SK_PermuteSingleSrc shuffles

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310632 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 17:27:20 +00:00
Simon Pilgrim
5f9237dac6 [CostModel][X86] Added v2f64/v2i64 single src shuffle model tests
Fixed label checks for all prefixes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310606 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 15:25:08 +00:00
Ulrich Weigand
b5bf1de320 [SystemZ] Add support for IBM z14 processor (2/3)
This adds support for the new 32-bit vector float instructions of z14.
This includes:
- Enabling the instructions for the assembler/disassembler.
- CodeGen for the instructions, including new LLVM intrinsics.
- Scheduler description support for the instructions.
- Update to the vector cost function calculations.

In general, CodeGen support for the new v4f32 instructions closely
matches support for the existing v2f64 instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308195 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-17 17:42:48 +00:00
Mohammed Agabaria
ffb8f09571 [X86][CM] update add\sub costs of vectors of 64 in X86\SLM arch
this patch updates the cost of addq\subq (add\subtract of vectors of 64bits)
based on the performance numbers of SLM arch.

Differential Revision: https://reviews.llvm.org/D33983



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306974 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-02 12:16:15 +00:00
Dorit Nuzman
65b3f67e1c [AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2
The cost of an interleaved access was only implemented for AVX512. For other
X86 targets an overly conservative Base cost was returned, resulting in
avoiding vectorization where it is actually profitable to vectorize.
This patch starts to add costs for AVX2 for most prominent cases of
interleaved accesses (stride 3,4 chars, for now).

Note1: Improvements of up to ~4x were observed in some of EEMBC's rgb
workloads; There is also a known issue of 15-30% degradations on some of these
workloads, associated with an interleaved access followed by type
promotion/widening; the resulting shuffle sequence is currently inefficient and
will be improved by a series of patches that extend the X86InterleavedAccess pass
(such as D34601 and more to follow).

Note 2: The costs in this patch do not reflect port pressure penalties which can
be very dominant in the case of interleaved accesses since most of the shuffle
operations are restricted to a single port. Further tuning, that may incorporate
these considerations, will be done on top of the upcoming improved shuffle
sequences (that is, along with the abovementioned work to extend
X86InterleavedAccess pass).


Differential Revision: https://reviews.llvm.org/D34023



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306238 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 08:26:25 +00:00
Simon Pilgrim
1313d75cd2 [CostModel][X86] Add scalar arithmetic cost tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305810 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 17:10:27 +00:00
Simon Pilgrim
7359f171d7 [CostModel][X86] Declare costs variables based on type
The alphabetical progression isn't that useful

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305808 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 17:04:46 +00:00
Matthew Simpson
59a0e24a58 Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor
The default vector insert/extract cost is more profitable on Falkor than the
reduced cost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303771 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-24 16:48:39 +00:00
Simon Pilgrim
46742151c4 Fix line-endings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303448 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 19:47:29 +00:00
Simon Pilgrim
bf60d089f2 [X86][AVX512] Add 512-bit vector ctpop costs + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303342 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-18 10:42:34 +00:00
Simon Pilgrim
13c0638d33 [X86][AVX512] Add 512-bit vector ctlz costs + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303300 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 21:02:18 +00:00
Simon Pilgrim
3b4043264e [X86][AVX512] Add 512-bit vector cttz costs + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303293 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 20:22:54 +00:00
Simon Pilgrim
ace0d94c13 [X86] Split ctpop/ctlz/cttz cost tests
This will make things a lot easier to test all the permutations of avx512 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303290 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 19:57:20 +00:00
Simon Pilgrim
221d5dcd28 [X86][AVX512] Add 512-bit vector bitreverse costs + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303283 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 19:20:20 +00:00
Jonas Paulsson
2460452d77 [SystemZ] Modelling of costs of divisions with a constant power of 2.
Such divisions will eventually be implemented with shifts which should
be reflected in the cost function.

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303254 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 12:46:26 +00:00
Simon Pilgrim
18a1f6e5cf [X86][AVX1] Account for cost of extract/insert of 256-bit shifts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303023 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 20:52:11 +00:00
Simon Pilgrim
0086ec9443 [X86][AVX2] Fix costs for v4i64 ashr by splat
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303022 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 20:25:42 +00:00
Simon Pilgrim
1ce51ae933 [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303021 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 20:02:34 +00:00
Simon Pilgrim
3f74b0ee2c [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303017 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 18:52:15 +00:00
Simon Pilgrim
40a0941bb9 [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask.
Tweak cost model to match what lowering actually does.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303013 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 17:59:46 +00:00
Simon Pilgrim
7adcab9058 [X86][SSE] Account for cost of extract/insert of v32i8 vector shifts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303012 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 17:36:07 +00:00
Simon Pilgrim
90e60b5e99 [X86][XOP] Account for cost of extract/insert of 256-bit vector shifts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303010 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-14 13:38:53 +00:00
Matt Arsenault
c11234753f AMDGPU: Make some packed shuffles free
VOP3P instructions can encode access to either
half of the register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302730 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 21:29:33 +00:00
Matthew Simpson
a4fa9d3a63 [AArch64] Consider widening instructions in cost calculations
The AArch64 instruction set has a few "widening" instructions (e.g., uaddl,
saddl, uaddw, etc.) that take one or more doubleword operands and produce
quadword results. The operands are automatically sign- or zero-extended as
appropriate. However, in LLVM IR, these extends are explicit. This patch
updates TTI to consider these widening instructions as single operations whose
cost is attached to the arithmetic instruction. It marks extends that are part
of a widening operation "free" and applies a sub-target specified overhead
(zero by default) to the arithmetic instructions.

Differential Revision: https://reviews.llvm.org/D32706

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302582 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-09 20:18:12 +00:00
Simon Pilgrim
8d84aa2cce [X86][AVX1] Improve 256-bit vector costs for integer unary intrinsics.
Account for subvector extraction/insertion, helps prevent the vectorizers from selecting 256-bit vectors that will have to be split anyhow on AVX1 targets. 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302378 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-07 20:58:55 +00:00
Elad Cohen
ea59a24717 Support arbitrary address space pointers in masked gather/scatter intrinsics.
Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a
non-default address space pointer we fail with a "Calling a function with a
bad singature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type which will determine the address
space.

Differential revision: https://reviews.llvm.org/D31490



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302018 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 12:28:54 +00:00
Renato Golin
3183fbc849 [SystemZ] Fix target specific tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300078 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 17:14:46 +00:00
Jonas Paulsson
2778098e17 [SystemZ] Updated test fp-cast.ll
This did not get included in the previous commit for SystemZ cost functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300053 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 12:11:41 +00:00
Jonas Paulsson
c33bdfa7b1 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300052 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 11:49:08 +00:00
Matt Arsenault
d706d030af AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298444 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 21:39:51 +00:00
Jonas Paulsson
078fc4ca72 [BasicTTIImpl] Bugfix in getIntrinsicInstrCost()
Don't call getScalarizationOverhead(RetTy, true, false) if RetTy is void type.

Review: Hal Finkel
https://reviews.llvm.org/D31024

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297954 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-16 14:05:34 +00:00
Simon Pilgrim
1a576c57ed [X86] Add missing BITREVERSE costs for SSE2 vectors and i8/i16/i32/i64 scalars
Prep work for PR31810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297876 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 19:34:55 +00:00
Jonas Paulsson
85dd82a95b [TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.

Tests updates:

Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll

The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.

Review: Hal Finkel
https://reviews.llvm.org/D29540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 06:35:36 +00:00
Guozhi Wei
4d5bc87951 [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295506 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-17 22:29:39 +00:00
Michael Kuperstein
a7092d68da [X86] Add costs for non-AVX512 single-source permutation integer shuffles
Differential Revision: https://reviews.llvm.org/D29416


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293932 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-02 20:27:13 +00:00
Michael Kuperstein
3ad5f1a3f2 [X86] Extend single-source shuffle cost test to test more arches. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293793 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-01 18:09:47 +00:00
Daniel Jasper
fe08370a7f Revert "[PPC] Give unaligned memory access lower cost on processor that supports it"
This reverts commit r292680. It is causing significantly worse
performance and test timeouts in our internal builds. I have already
routed reproduction instructions your way.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293092 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-25 21:21:08 +00:00
Guozhi Wei
31027cdf52 [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292680 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 23:35:27 +00:00
Simon Pilgrim
0016b62b09 [CostModel][X86] Fix AVX512BW vector shift costs for vXi16 types
We already have patterns in place to support 128/256-bit shifts without AVX512VL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292077 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-15 20:44:00 +00:00
Simon Pilgrim
3a60120921 [CostModel][X86] Drop separate AVX512VL checks - they match existing AVX512 costs
Keep the tests though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292076 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-15 20:19:28 +00:00
Simon Pilgrim
43b72e4d01 [CostModel][X86] Update vector shift tests to correctly check by non-constant uniform values.
Use shuffle( scslar_to_vector, zeroinitializer) pattern instead of shuffle( vec, zeroinitializer)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292075 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-15 20:10:28 +00:00