RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-23 13:56:06 +00:00

Author	SHA1	Message	Date
Keno Fischer	15bfd0f9a0	[X86 TTI] Implement LSV hook Summary: LSV wants to know the maximum size that can be loaded to a vector register. On X86, this always matches the maximum register width. Implement this accordingly and add a test to make sure that LSV can vectorize up to the maximum permissible width on X86. Reviewers: delena, arsenm Reviewed By: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D31504 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299589 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-05 20:51:38 +00:00
Simon Pilgrim	1a576c57ed	[X86] Add missing BITREVERSE costs for SSE2 vectors and i8/i16/i32/i64 scalars Prep work for PR31810 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297876 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-15 19:34:55 +00:00
Simon Pilgrim	39fad26ce4	Align cost model columns. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297824 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-15 11:57:42 +00:00
Jonas Paulsson	85dd82a95b	[TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved getIntrinsicInstrCost() used to only compute scalarization cost based on types. This patch improves this so that the actual arguments are checked when they are available, in order to handle only unique non-constant operands. Tests updates: Analysis/CostModel/X86/arith-fp.ll Transforms/LoopVectorize/AArch64/interleaved_cost.ll Transforms/LoopVectorize/ARM/interleaved_cost.ll The improvement in getOperandsScalarizationOverhead() to differentiate on constants made it necessary to update the interleaved_cost.ll tests even though they do not relate to intrinsics. Review: Hal Finkel https://reviews.llvm.org/D29540 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-14 06:35:36 +00:00
Michael Kuperstein	a7092d68da	[X86] Add costs for non-AVX512 single-source permutation integer shuffles Differential Revision: https://reviews.llvm.org/D29416 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293932 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-02 20:27:13 +00:00
Jonas Paulsson	7cb6abb7cb	[TargetTransformInfo] Refactor and improve getScalarizationOverhead() Refactoring to remove duplications of this method. New method getOperandsScalarizationOverhead() that looks at the present unique operands and add extract costs for them. Old behaviour was to just add extract costs for one operand of the type always, which still happens in getArithmeticInstrCost() if no operands are provided by the caller. This is a good start of improving on this, but there are more places that can be improved by using getOperandsScalarizationOverhead(). Review: Hal Finkel https://reviews.llvm.org/D29017 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293155 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 07:03:25 +00:00
Mohammed Agabaria	e0bafdf059	[X86] enable memory interleaving for X86\SLM arch. Differential Revision: https://reviews.llvm.org/D28547 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293040 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-25 09:14:48 +00:00
Simon Pilgrim	888319b41c	Remove trailing whitespace. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292613 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 15:15:59 +00:00
Simon Pilgrim	28ad0fea24	[CostModel][X86] Removed unused cost. NFCI. SHL v8i32 is already handled in the SSE41 cost table git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292612 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 15:14:38 +00:00
Simon Pilgrim	0016b62b09	[CostModel][X86] Fix AVX512BW vector shift costs for vXi16 types We already have patterns in place to support 128/256-bit shifts without AVX512VL git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292077 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-15 20:44:00 +00:00
Simon Pilgrim	75f614f4c2	[CostModel][X86] Updated vXi64 ASHR costs on AVX512 targets now that D28604 has landed git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292023 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-14 19:24:23 +00:00
Simon Pilgrim	b237097a65	[X86][AVX512BW] Vectorize v64i8 vector shifts Differential Revision: https://reviews.llvm.org/D28447 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291665 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 10:36:51 +00:00
Mohammed Agabaria	9c6b24cc3a	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch. updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291657 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 08:23:37 +00:00
Simon Pilgrim	35f85cb068	[CostModel][X86] Fixed vXi8 uniform shift costs. The 'fast' costs should only work for shifts by uniform constants (uniform non-constant are lowered using the slow default implementation). Logical shifts were not taking into account that we must mask the psrlw result, so the costs needed to be doubled. Added missing AVX2/AVX512BW costs as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291391 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-08 14:14:36 +00:00
Simon Pilgrim	93f6323c31	[CostModel][X86] Moved legal uniform shift costs earlier. XOP was prematurely matching, doubling the cost of ashr/lshr uniform shifts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291390 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-08 13:12:03 +00:00
Simon Pilgrim	4f2c5010fd	[CostModel][X86] Update SSE41/AVX1 vXi32 SHL costs SSE41 provides pmulld which allows the simpler pslld/paddd/cvttps2dq/pmulld pattern than SSE2's use of pmuludq. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291372 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 22:27:43 +00:00
Simon Pilgrim	ee6faf574a	[CostModel][X86] Fix AVX2 v16i16 shift 'splat' costs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291366 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 22:08:09 +00:00
Simon Pilgrim	371d289738	[CostModel][X86] Match 256-bit vector shift 'splat' costs for AVX2 and above We were matching against general vector shift costs before the uniform splat costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291365 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 21:47:10 +00:00
Simon Pilgrim	e1ddc8e7d2	[CostModel][X86] Generalized cost calculation of SHL by constant -> MUL conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291364 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 21:33:00 +00:00
Simon Pilgrim	129141ec2f	[CostModel][X86] Merge separate AVX1 cost LUTs. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291355 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 18:19:25 +00:00
Simon Pilgrim	21886bd4a8	[CostModel][AVX512BW] Add v32i16 vector shift costs for avx512bw targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291354 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 17:54:10 +00:00
Simon Pilgrim	5988cea66b	[CostModel][X86] Added missing AVX2 arithmetic costs. Allows us to correctly fall through to the lower AVX1 costs if look up failed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291353 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 17:27:39 +00:00
Simon Pilgrim	9724d35716	[CostModel][X86] Reordered AVX1 arithmetic cost LUT into descending target order. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291352 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 17:03:51 +00:00
Simon Pilgrim	f9fdf76b96	[X86][AVX512] Use lowerShuffleAsRepeatedMaskAndLanePermute for non-VBMI v64i8 shuffles (PR31470) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291347 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-07 15:37:50 +00:00
Simon Pilgrim	e5088f5e84	[CostModel][X86] Fix 512-bit SDIV/UDIV 'big' costs. Set the costs on the lowest target that supports the type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291229 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-06 11:12:53 +00:00
Simon Pilgrim	39e9d60ebd	[CostModel][X86] Tidyup arithmetic costs code. NFCI. Remove unnecessary braces, remove one use variables and keep LUTs to similar naming convention. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291187 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 22:48:02 +00:00
Simon Pilgrim	38716ac7b5	[CostModel][X86] Move vXi32 MUL costs into existing tables. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291165 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 19:42:43 +00:00
Simon Pilgrim	7c5fe202d7	Remove trailing whitespace. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291163 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 19:24:25 +00:00
Simon Pilgrim	a04ecdd76e	[CostModel][X86] Reordered SSE42 arithmetic cost LUT into descending order. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291162 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 19:19:39 +00:00
Simon Pilgrim	688489139f	[CostModel][X86] Move vXi64 MUL costs into existing tables. NFCI. Removes need for yet another LUT. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291158 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 19:01:50 +00:00
Simon Pilgrim	944aa53645	[CostModel][X86] Strip unused 256-bit vector shift costs. NFCI. Remove SSE2 256-bit entries - AVX targets will have used the SSE42 costs instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291152 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 18:36:48 +00:00
Simon Pilgrim	6c924280fe	[CostModel][X86] Include the cost of 256-bit upper subvector extract/insertion in AVX1 v4i64 MUL Matches other MUL/ADD/SUB 256-bit case on AVX1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291149 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 18:20:25 +00:00
Simon Pilgrim	46a2104916	[CostModel][X86] Merged SK_PermuteSingleSrc/SK_PermuteTwoSrc into common shuffle cost LUTs. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291146 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 17:56:19 +00:00
Simon Pilgrim	2cf83fd91e	[CostModel][X86] Add support for broadcast shuffle costs Currently only for broadcasts with input and output of the same width. Differential Revision: https://reviews.llvm.org/D27811 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291122 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 15:56:08 +00:00
Simon Pilgrim	8e5a39580d	[CostModel][X86] Pulled out common type legalization code git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291109 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 14:33:32 +00:00
Mohammed Agabaria	6bf7471dbc	Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and need some extra cost for address computation handling. This code seems to be target dependent which may not be the same for all targets. Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'. Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general. Differential Revision: https://reviews.llvm.org/D27518 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291106 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 14:03:41 +00:00
Mohammed Agabaria	030c24dcda	[Test Commit] fixing some format issue in X86TTI to match clang-format output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291095 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-05 09:51:02 +00:00
Simon Pilgrim	19aab9f9fa	[CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs Actual codegen is much better than the extract+insert patterns that was assumed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290962 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-04 14:01:33 +00:00
Simon Pilgrim	7b65390293	[X86] Merged Reverse/Alternate shuffle cost tables. NFCI. As discussed on D27811, merged the shuffle cost LUTs and use the shuffle kind to perform the lookup instead of the ISD opcode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290956 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-04 12:08:41 +00:00
Elena Demikhovsky	6c207672d9	Fixed shuffle-reverse cost on AVX-512. (This changed was approved in https://reviews.llvm.org/D28118, but Simon asked to submit it separately). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290812 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-02 11:44:10 +00:00
Elena Demikhovsky	c2b6a16ee9	AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns. X86 target does not provide any target specific cost calculation for interleave patterns.It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost. In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426). * Shiffle-broadcast cost will be changed in Simon's upcoming patch. Differential Revision: https://reviews.llvm.org/D28118 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290810 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-02 10:37:52 +00:00
Simon Pilgrim	373eadc326	[X86][SSE] Improve lowering of vXi64 multiplies As mentioned on PR30845, we were performing our vXi64 multiplication as: AloBlo = pmuludq(a, b); AloBhi = pmuludq(a, psrlqi(b, 32)); AhiBlo = pmuludq(psrlqi(a, 32), b); return AloBlo + psllqi(AloBhi, 32)+ psllqi(AhiBlo, 32); when we could avoid one of the upper shifts with: AloBlo = pmuludq(a, b); AloBhi = pmuludq(a, psrlqi(b, 32)); AhiBlo = pmuludq(psrlqi(a, 32), b); return AloBlo + psllqi(AloBhi + AhiBlo, 32); This matches the lowering on gcc/icc. Differential Revision: https://reviews.llvm.org/D27756 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290267 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-21 20:00:10 +00:00
Simon Pilgrim	255071b56f	[CostModel][X86] Updated reverse shuffle costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289819 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-15 14:24:07 +00:00
Simon Pilgrim	5d31f856ab	[X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287882 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-24 14:46:55 +00:00
Simon Pilgrim	e547b01b86	[X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287762 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-23 14:01:18 +00:00
Simon Pilgrim	df7181d34e	[CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287760 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-23 13:42:09 +00:00
Simon Pilgrim	ad27fdae89	[CostModel][X86] Added mul costs for vXi8 vectors More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286838 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-14 15:54:24 +00:00
Simon Pilgrim	2e6f35ab88	[X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargets Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason. This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286832 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-14 14:45:16 +00:00
Simon Pilgrim	169b408a54	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful. Differential Revision: https://reviews.llvm.org/D25910 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286233 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-08 14:10:28 +00:00
Alexey Bataev	7af02fcbc9	Improved cost model for FDIV and FSQRT, by Andrew Tischenko There is a bug describing poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second one in series of patches dealing with cost model. Differential Revision: https://reviews.llvm.org/D25722 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285564 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-31 12:10:53 +00:00

1 2 3 4

195 Commits