RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-22 21:35:57 +00:00

Author	SHA1	Message	Date
Diana Picus	1d02724c71	Revert "Turn some C-style vararg into variadic templates" This reverts commit r299925 because it broke the buildbots. See e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299928 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 10:07:12 +00:00
Serge Guelton	ec124b3a6f	Turn some C-style vararg into variadic templates Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299925 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 08:36:52 +00:00
Dehao Chen	d8cebd13cb	Use PMADDWD to expand reduction in a loop Summary: PMADDWD can help improve 8/16 bit integer multiply-add operation performance for cases like: for (int i = 0; i < count; i++) a += x[i] * y[i]; Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31679 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299776 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-07 15:41:52 +00:00
Michael Kuperstein	bf82f16ca4	[X86] Revert r299387 due to AVX legalization infinite loop. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299720 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 22:33:25 +00:00
Mehdi Amini	8701bbc75d	Revert "Turn some C-style vararg into variadic templates" This reverts commit r299699, the examples need to be updated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299702 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 20:23:57 +00:00
Mehdi Amini	753bd2a772	Turn some C-style vararg into variadic templates Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D31070 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299699 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 20:09:31 +00:00
Simon Pilgrim	a71e6d97d5	[X86][SSE] Renamed combine to make it clear that it only handles the vector shift by immediate opcodes. NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299532 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-05 10:44:42 +00:00
Ahmed Bougacha	e339e10540	[X86] Relax assert in broadcast-of-subvector lowering. Before r294774, there was a problem when lowering broadcasts to use 128-bit subvectors. When we looked through a bitcast to find the broadcast input, we'd keep using the original type, so you'd end up with things like: (v8f32 (broadcast (v4f32 (extract_subvector (v8i32 V), ...)) )) r294774 fixed it to always emit subvectors with the scalar type of the original source. It also introduced some asserts, to check that we use scalars with the same size, and vectors with the same number of elements. The scalar size equality is checked earlier when looking through bitcasts, and is a useful assert. However, the number of elements don't have to be identical: we're always going to extract a 128-bit subvector, and we can have different size inputs if we looked through a concat_vector to find a 256-bit source. Relax the overzealous assert. Replace it with a check of the original source vector being 256 or 512 bits. If it's 128 bits, we can't extract_subvector from it. Fixes PR32371. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299490 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-05 00:14:39 +00:00
Sanjay Patel	1f3f346415	[x86] remove dead select-of-constants transform; NFCI https://reviews.llvm.org/D30537 / https://reviews.llvm.org/rL296977 added these transforms and other related transforms to the generic DAGCombiner (with a hook that x86 sets to true), so these patterns should not exist by the time we reach the target-specific combiner hook. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299448 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 16:54:58 +00:00
Simon Pilgrim	21de338c73	Strip trailing whitespace git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299438 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 14:40:53 +00:00
Oren Ben Simhon	78ed6ce91d	[X86] Add 64 bit pattern matching for PSADBW PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greater than) comparison. The patch extends the pattern to also catch the 64 bit IR pattern and includes all other comparison types (not only GLT). Differential Revision: https://reviews.llvm.org/D31577 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299425 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 10:23:18 +00:00
Simon Pilgrim	dc4f56dd52	[X86][SSE] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging. This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs an unary shuffle to duplicate the values. There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch. Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this. Differential Revision: https://reviews.llvm.org/D31373 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299387 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-03 21:06:51 +00:00
Amjad Aboud	019f65e751	x86 interrupt calling convention: re-align stack pointer on 64-bit if an error code was pushed The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated. This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue. Fixes Bug 26413 Patch by Philipp Oppermann. Differential Revision: https://reviews.llvm.org/D30049 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299383 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-03 20:28:45 +00:00
Craig Topper	68149f546e	[APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temporary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299362 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-03 16:34:59 +00:00
Simon Pilgrim	07ccae240a	[X86][MMX] Improve support for folding fptosi from XMM to MMX git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299338 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-02 17:45:41 +00:00
Simon Pilgrim	888b181566	[X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299336 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-02 16:20:34 +00:00
Simon Pilgrim	4475a6621a	[X86][MMX] Added support for subvector extraction to MMX register git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299335 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-02 15:52:28 +00:00
Craig Topper	af26b71085	[AVX-512] Update lowering for gather/scatter prefetch intrinsics to match the immediate encodings the frontend uses based on the _MM_HINT_T0/T1 constant values in clang's headers. Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too. Fixes PR32411. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299234 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 17:24:29 +00:00
Simon Pilgrim	9fc191fd45	[DAGCombiner] Add vector demanded elements support to ComputeNumSignBits Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course. Followup to D25691. Differential Revision: https://reviews.llvm.org/D31311 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299219 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 13:54:09 +00:00
Simon Pilgrim	07898901df	[DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299201 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 11:24:16 +00:00
Simon Pilgrim	cac5a6fb06	Spelling mistakes in comments. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299069 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-30 12:30:15 +00:00
Davide Italiano	ad6daf4031	[X86IselLowering] Remove extraneous semicolon. NFCI. Unbreaks the build with GCC -Werror. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299030 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-29 21:34:58 +00:00
Simon Pilgrim	d7f209a331	[X86] Tidied up comment - we don't custom lower add/sub i64 on i686 anymore. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299004 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-29 15:41:58 +00:00
Simon Pilgrim	2c2eb599d3	Spelling mistakes in comments. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299000 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-29 15:27:24 +00:00
Simon Pilgrim	d0ea014431	[X86][AVX2] Prevent unary interleaving patterns from calling lowerVectorShuffleAsSplitOrBlend (PR32453) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298993 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-29 13:00:00 +00:00
Simon Pilgrim	c9e0a0dbb1	[X86][MMX] Match MMX fp_to_sint conversions from XMM registers We currently perform the various fp_to_sint XMM conversion and then transfer to the MMX register (on 32-bit via the stack). This patch improves support for MOVDQ2Q XMM to MMX transfers and adds the XMM->MMX fp_to_sint direct conversion patterns. The SSE2 specifications are the same as for XMM->XMM and XMM->MMX rounding/exceptions/etc. Differential Revision: https://reviews.llvm.org/D30868 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298943 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 21:32:11 +00:00
Sanjay Patel	e26bd3af58	[x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality Follow-up to: https://reviews.llvm.org/rL298775 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298933 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 17:23:49 +00:00
Simon Pilgrim	a4ee850374	[X86][AVX2] Add support for combining v16i16 shuffles to VPBLENDW git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298929 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 16:40:38 +00:00
Simon Pilgrim	1a36c64204	[X86][SSE] Refactored shuffle BLEND combining to make future 16i16 support easier. NFCI. Call the matchVectorShuffleAsBlend test as early as possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298925 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 15:50:23 +00:00
Simon Pilgrim	c47b59c064	Fix signed/unsigned comparison warning git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298917 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 13:40:09 +00:00
Simon Pilgrim	3d39cb0a48	[X86][SSE] Begin merging vector shuffle to BLEND for lowering and combining. Split off matchVectorShuffleAsBlend from lowerVectorShuffleAsBlend for reuse in combining. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298914 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 13:05:48 +00:00
Simon Pilgrim	f67785e440	[X86][SSE] Set second operand to undef instead of first operand in unary shuffle combines. Copy isn't necessary after the matchVectorShuffleWithUNPCK refactor and undef value will make some future undef/zero handling easier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298910 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 12:16:42 +00:00
Simon Pilgrim	eb81f2b1a2	Strip trailing whitespace git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298909 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 11:15:17 +00:00
Gadi Haber	cd2a3f9e73	[X86][AVX2] bugzilla bug 21281 Performance regression in vector interleave in AVX2 This is a patch for an on-going bugzilla bug 21281 on the generated X86 code for a matrix transpose8x8 subroutine which requires vector interleaving. The generated code in AVX2 is currently non-optimal and requires 60 instructions as opposed to only 40 instructions generated for AVX1. The patch includes a fix for the AVX2 case where vector unpack instructions use less operations than the vector blend operations available in AVX2. In this case using vector unpack instructions is more efficient. Reviewers: zvi delena igorb craig.topper guyblank eladcohen m_zuckerman aymanmus RKSimon git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298840 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-27 12:13:37 +00:00
Simon Pilgrim	1dc69d0cab	[X86][SSE] Add computeKnownBitsForTargetNode support for (V)PSLL/(V)PSRL instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298806 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-26 13:17:55 +00:00
Simon Pilgrim	d7e1b091d0	[X86] Pull out repeated ScalarValueSizeInBits code. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298783 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-25 21:22:12 +00:00
Simon Pilgrim	f47742a184	[X86][SSE] Combine (VSRLI (VSRAI X, Y), (NumSignBits-1)) -> (VSRLI X, (NumSignBits-1)) Part 3 of 3. Differential Revision: https://reviews.llvm.org/D31347 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298782 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-25 20:43:01 +00:00
Simon Pilgrim	284e861f70	[X86][SSE] Added ComputeNumSignBitsForTargetNode support for (V)PSRAI Part 2 of 3. Differential Revision: https://reviews.llvm.org/D31347 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298780 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-25 19:58:36 +00:00
Simon Pilgrim	4671240f7d	[X86][SSE] Generalised CMP+AND1 combine to ZERO/ALLBITS+MASK Patch to generalize combinePCMPAnd1 (for handling SETCC + ZEXT cases) to work for any input that has zero/all bits set masked with an 'all low bits' mask. Replaced the implicit assumption of shift availability with a call to SupportedVectorShiftWithImm. Part 1 of 3. Differential Revision: https://reviews.llvm.org/D31347 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298779 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-25 19:50:14 +00:00
Sanjay Patel	a39e64c0aa	[x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster. Differential Revision: https://reviews.llvm.org/D31290 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298775 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-25 16:05:33 +00:00
Simon Pilgrim	ce91f53688	[X86][SSE] Generalised lowerTruncate by PACKSS to work with any 'zero/all bits' result, not just comparisons. Added vector compare opcodes to X86TargetLowering::ComputeNumSignBitsForTargetNode Covered by existing tests added for D22814. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298704 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 16:12:31 +00:00
Eric Christopher	7322394a8d	Remove the subtarget argument from LowerFP_TO_INT since there's one stored on X86TargetLowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298628 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 17:35:08 +00:00
Eric Christopher	415d5ca555	Remove unused X86Subtarget argument from getOnesVector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298627 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 17:35:06 +00:00
Simon Pilgrim	de4fa98807	[X86][SSE] Extract elements from narrower shuffle masks. Add support for widening narrow shuffle masks so we can directly extract from the relevant input vector of the shuffle. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298616 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 16:09:34 +00:00
Simon Pilgrim	bec15f22ea	[X86][SSE] Tidyup canWidenShuffleElements. NFCI. Pull out mask elements at the start, allowing us to make the widening pattern matching more readable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298594 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 13:33:03 +00:00
Michael Zuckerman	414ca0751d	[X86][TD][vpmovm2] New TD pattern for the vpmovm2 instruction Up until now, vpmovm2 instruction described its destination operand size by the source operand size. This patch adds new pattern for the vpmovm2 instruction. The node describes new expansion of the destination (from {128|256} to 512). Differential Revision: https://reviews.llvm.org/D30654 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298586 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 09:57:01 +00:00
Eric Christopher	9e695dc5e8	Clean up some Subtarget uses and casts in the X86 backend, removing unnecessary work or calls. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298555 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-22 22:44:52 +00:00
Reid Kleckner	6707770d48	Rename AttributeSet to AttributeList Summary: This class is a list of AttributeSetNodes corresponding to the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298393 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 16:57:19 +00:00
Sanjay Patel	abbd326847	[x86] use PMOVMSK for vector-sized equality comparisons We could do better by splitting any oversized type into whatever vector size the target supports, but I left that for future work if it ever comes up. The motivating case is memcmp() calls on 16-byte structs, so I think we can wire that up with a TLI hook that feeds into this. Differential Revision: https://reviews.llvm.org/D31156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298376 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 13:50:33 +00:00
Evgeniy Stepanov	3fd77c94a3	[Fuchsia] Use %gs for ABI slots under -mcmodel=kernel Make x86_64-fuchsia targets under -mcmodel=kernel use %gs rather than %fs to access ABI slots for stack-protector and safe-stack Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D30870 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298302 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-20 20:35:37 +00:00
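
The [APInt] commit listed above (68149f546e) says that isMask and isShiftedMask use the MathExtras.h functions for the single-word case. As a rough illustration of the kind of bit trick that avoids temporaries for a single 64-bit word, here is a minimal sketch. The function names isLowMask64 and isShiftedMask64 are hypothetical stand-ins chosen for this example, not LLVM's API, and the multiword path described in the commit is not covered.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM's API): true if `v` is a non-empty run of
// ones starting at bit 0, e.g. 0b0111. Adding 1 to such a value carries
// through every set bit, so (v + 1) shares no bits with v.
constexpr bool isLowMask64(std::uint64_t v) {
  return v != 0 && ((v + 1) & v) == 0;
}

// Hypothetical helper (not LLVM's API): true if `v` is one contiguous run of
// ones anywhere in the word, e.g. 0b0111000. (v - 1) | v fills the trailing
// zeros with ones, which turns a shifted mask into a low mask.
constexpr bool isShiftedMask64(std::uint64_t v) {
  return v != 0 && isLowMask64((v - 1) | v);
}

// Compile-time checks of the cases discussed in the comments.
static_assert(isLowMask64(0x7), "0b0111 is a low mask");
static_assert(!isLowMask64(0x6), "0b0110 does not start at bit 0");
static_assert(isShiftedMask64(0x70), "0b01110000 is a shifted mask");
static_assert(!isShiftedMask64(0x50), "0b01010000 has a gap");
```

Both helpers are constant-time and allocation-free, which is the property the commit message is after for the single-word case. Note that the all-ones word is accepted by isLowMask64 because unsigned addition wraps to zero.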

4564 Commits