4564 Commits

Author SHA1 Message Date
Diana Picus
1d02724c71 Revert "Turn some C-style vararg into variadic templates"
This reverts commit r299925 because it broke the buildbots. See e.g.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299928 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 10:07:12 +00:00
Serge Guelton
ec124b3a6f Turn some C-style vararg into variadic templates
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user perspective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's no type checking on the
arguments. The variadic template is an obvious solution to both
issues.
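
The two issues named above can be illustrated with a minimal sketch. It does not use the LLVM API: Type, countTypesVararg and countTypesVariadic are hypothetical stand-ins for a function such as Module::getOrInsertFunction that receives a list of type pointers.

```cpp
#include <cstdarg>
#include <cstddef>

// Hypothetical stand-in for an LLVM type; not the real llvm::Type.
struct Type { int id; };

// C-style vararg: the callee cannot know how many arguments follow, so the
// caller must end the list with a null pointer, and the compiler does not
// check that each argument really is a Type*.
inline std::size_t countTypesVararg(const char *name, ...) {
  (void)name;
  va_list args;
  va_start(args, name);
  std::size_t n = 0;
  while (va_arg(args, Type *) != nullptr)
    ++n;
  va_end(args);
  return n;
}

// Variadic template: the argument count is known at compile time and every
// argument must convert to Type*, so a wrong type is a compile error.
template <typename... ArgsTy>
std::size_t countTypesVariadic(const char *name, ArgsTy... args) {
  (void)name;
  Type *list[] = {nullptr, args...}; // leading nullptr keeps the array non-empty
  return sizeof(list) / sizeof(list[0]) - 1;
}
```

With the vararg form the call has to be written as countTypesVararg("f", &a, &b, static_cast<Type *>(nullptr)), while the variadic form is simply countTypesVariadic("f", &a, &b).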



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299925 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 08:36:52 +00:00
Dehao Chen
d8cebd13cb Use PMADDWD to expand reduction in a loop
Summary:
PMADDWD can help improve 8/16 bit integer multiply-add operation performance for cases like:

for (int i = 0; i < count; i++)
  a += x[i] * y[i];
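
As a rough scalar model of why this loop maps onto PMADDWD: each 32-bit lane of the instruction multiplies two pairs of signed 16-bit values and adds the two products. The helper names below are hypothetical, the inputs are assumed to be 16-bit, and the sums are assumed not to overflow 32 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar reference for the reduction loop in the commit message.
inline int32_t dotScalar(const int16_t *x, const int16_t *y, std::size_t count) {
  int32_t a = 0;
  for (std::size_t i = 0; i < count; ++i)
    a += static_cast<int32_t>(x[i]) * y[i];
  return a;
}

// One 32-bit lane of PMADDWD: two signed 16x16->32 products added together.
inline int32_t pmaddwdLane(int16_t x0, int16_t y0, int16_t x1, int16_t y1) {
  return static_cast<int32_t>(x0) * y0 + static_cast<int32_t>(x1) * y1;
}

// The same reduction, consuming two elements per step as a PMADDWD lane would.
inline int32_t dotPairwise(const int16_t *x, const int16_t *y, std::size_t count) {
  int32_t a = 0;
  std::size_t i = 0;
  for (; i + 1 < count; i += 2)
    a += pmaddwdLane(x[i], y[i], x[i + 1], y[i + 1]);
  if (i < count) // odd tail element
    a += static_cast<int32_t>(x[i]) * y[i];
  return a;
}
```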

Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper

Reviewed By: mkuper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31679

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299776 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-07 15:41:52 +00:00
Michael Kuperstein
bf82f16ca4 [X86] Revert r299387 due to AVX legalization infinite loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299720 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 22:33:25 +00:00
Mehdi Amini
8701bbc75d Revert "Turn some C-style vararg into variadic templates"
This reverts commit r299699, the examples need to be updated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299702 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 20:23:57 +00:00
Mehdi Amini
753bd2a772 Turn some C-style vararg into variadic templates
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user perspective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's no type checking on the
arguments. The variadic template is an obvious solution to both
issues.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D31070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299699 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 20:09:31 +00:00
Simon Pilgrim
a71e6d97d5 [X86][SSE] Renamed combine to make it clear that it only handles the vector shift by immediate opcodes. NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299532 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 10:44:42 +00:00
Ahmed Bougacha
e339e10540 [X86] Relax assert in broadcast-of-subvector lowering.
Before r294774, there was a problem when lowering broadcasts to use
128-bit subvectors.

When we looked through a bitcast to find the broadcast input, we'd keep
using the original type, so you'd end up with things like:
  (v8f32 (broadcast
    (v4f32 (extract_subvector
      (v8i32 V),
      ...))
    ))

r294774 fixed it to always emit subvectors with the scalar type of the
original source.

It also introduced some asserts, to check that we use scalars with
the same size, and vectors with the same number of elements.

The scalar size equality is checked earlier when looking through bitcasts,
and is a useful assert.

However, the number of elements doesn't have to be identical: we're always
going to extract a 128-bit subvector, and we can have different size
inputs if we looked through a concat_vector to find a 256-bit source.

Relax the overzealous assert.

Replace it with a check of the original source vector being 256 or 512
bits.  If it's 128 bits, we can't extract_subvector from it.

Fixes PR32371.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299490 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 00:14:39 +00:00
Sanjay Patel
1f3f346415 [x86] remove dead select-of-constants transform; NFCI
https://reviews.llvm.org/D30537 / https://reviews.llvm.org/rL296977 added these transforms
and other related transforms to the generic DAGCombiner (with a hook that x86 sets to true),
so these patterns should not exist by the time we reach the target-specific combiner hook.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299448 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-04 16:54:58 +00:00
Simon Pilgrim
21de338c73 Strip trailing whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299438 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-04 14:40:53 +00:00
Oren Ben Simhon
78ed6ce91d [X86] Add 64 bit pattern matching for PSADBW
The PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greater than) comparison.
The patch extends the pattern to also catch the 64 bit IR pattern and to include all other comparison types (not only GLT).

Differential Revision: https://reviews.llvm.org/D31577



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299425 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-04 10:23:18 +00:00
Simon Pilgrim
dc4f56dd52 [X86][SSE] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE
It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging.

This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs a unary shuffle to duplicate the values.
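
A simplified model of this split, not the actual LowerBUILD_VECTOR code: operands are represented by non-negative integer ids, and SplitBuildVector, Undef and splitRepeated are hypothetical names. Each distinct operand is kept only at its first lane, and the shuffle mask points every lane back at that first occurrence.

```cpp
#include <cstddef>
#include <vector>

constexpr int Undef = -1; // marks a lane that needs no insertion

struct SplitBuildVector {
  std::vector<int> Elts; // same length as the input; repeats become Undef
  std::vector<int> Mask; // Mask[i] = lane of Elts that output lane i reads
};

// Assumes operand ids are non-negative so they never collide with Undef.
inline SplitBuildVector splitRepeated(const std::vector<int> &ops) {
  SplitBuildVector r;
  r.Elts.assign(ops.size(), Undef);
  r.Mask.assign(ops.size(), Undef);
  for (std::size_t i = 0; i < ops.size(); ++i) {
    std::size_t first = 0;
    while (ops[first] != ops[i]) // stops at first == i at the latest
      ++first;
    if (first == i)
      r.Elts[i] = ops[i]; // single insertion of this operand
    r.Mask[i] = static_cast<int>(first);
  }
  return r;
}
```

For the operand list {7, 7, 9, 7} this gives Elts = {7, Undef, 9, Undef} and the unary shuffle mask {0, 0, 2, 0}, so only two values have to be moved into the vector register.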

There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.

Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.

Differential Revision: https://reviews.llvm.org/D31373

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299387 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-03 21:06:51 +00:00
Amjad Aboud
019f65e751 x86 interrupt calling convention: re-align stack pointer on 64-bit if an error code was pushed
The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated.

This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue.
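
The alignment arithmetic can be sketched as follows. The function names are hypothetical, and two details are background assumptions about x86-64 interrupt delivery that the commit message does not state: the CPU aligns RSP to 16 bytes before pushing the frame, and the frame without an error code is five 8-byte values.

```cpp
#include <cstdint>

// RSP at handler entry, given the 16-byte-aligned RSP before the frame push.
// Assumed frame: SS, RSP, RFLAGS, CS, RIP, plus an optional 8-byte error code.
inline uint64_t rspAtHandlerEntry(uint64_t alignedRsp, bool hasErrorCode) {
  uint64_t rsp = alignedRsp - 5 * 8;
  if (hasErrorCode)
    rsp -= 8;
  return rsp;
}

// Extra bytes the prologue subtracts (and the epilogue pops) per the commit.
inline uint64_t prologueAdjust(bool hasErrorCode) {
  return hasErrorCode ? 8 : 0;
}

// The ABI expects RSP % 16 == 8 at function entry, as after a normal call
// that pushed an 8-byte return address onto a 16-byte-aligned stack.
inline bool entryLooksLikeCall(uint64_t rsp) { return rsp % 16 == 8; }
```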

Fixes Bug 26413

Patch by Philipp Oppermann.

Differential Revision: https://reviews.llvm.org/D30049

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299383 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-03 20:28:45 +00:00
Craig Topper
68149f546e [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temporary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.
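
A self-contained sketch of the two strategies, not the APInt code itself: the single-word checks use the usual bit tricks, and the multiword checks only count bits, so no temporary values are allocated. Words, popCount and countRun are hypothetical names, and the bit width is assumed to be a multiple of 64.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-word checks using add/and/or bit tricks.
inline bool isMask64(uint64_t v) { return v != 0 && ((v + 1) & v) == 0; }
inline bool isShiftedMask64(uint64_t v) {
  return v != 0 && isMask64((v - 1) | v);
}

// Hypothetical multiword value: element 0 holds the least significant 64 bits.
using Words = std::vector<uint64_t>;

inline unsigned popCount(const Words &w) {
  unsigned n = 0;
  for (uint64_t x : w)
    for (; x != 0; x &= x - 1) // clear the lowest set bit
      ++n;
  return n;
}

// Counts consecutive bits equal to `bit`, starting at the least significant
// end (fromLow == true) or the most significant end (fromLow == false).
inline unsigned countRun(const Words &w, bool bit, bool fromLow) {
  unsigned n = 0;
  const std::size_t total = w.size() * 64;
  for (std::size_t i = 0; i < total; ++i) {
    std::size_t pos = fromLow ? i : total - 1 - i;
    bool b = (w[pos / 64] >> (pos % 64)) & 1;
    if (b != bit)
      break;
    ++n;
  }
  return n;
}

// A mask is a non-empty run of ones starting at bit 0 and nothing else.
inline bool isMask(const Words &w) {
  unsigned ones = countRun(w, true, true);
  return ones > 0 && ones == popCount(w);
}

// A shifted mask is one contiguous run of ones anywhere in the value.
inline bool isShiftedMask(const Words &w) {
  unsigned ones = popCount(w);
  unsigned lead = countRun(w, false, false);
  unsigned trail = countRun(w, false, true);
  return ones > 0 && ones + lead + trail == w.size() * 64;
}
```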

Differential Revision: https://reviews.llvm.org/D31565




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299362 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-03 16:34:59 +00:00
Simon Pilgrim
07ccae240a [X86][MMX] Improve support for folding fptosi from XMM to MMX
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299338 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-02 17:45:41 +00:00
Simon Pilgrim
888b181566 [X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299336 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-02 16:20:34 +00:00
Simon Pilgrim
4475a6621a [X86][MMX] Added support for subvector extraction to MMX register
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299335 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-02 15:52:28 +00:00
Craig Topper
af26b71085 [AVX-512] Update lowering for gather/scatter prefetch intrinsics to match the immediate encodings the frontend uses based on the _MM_HINT_T0/T1 constant values in clang's headers.
Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too.

Fixes PR32411.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299234 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-31 17:24:29 +00:00
Simon Pilgrim
9fc191fd45 [DAGCombiner] Add vector demanded elements support to ComputeNumSignBits
Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements.

This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.

I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course.
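
A minimal model of the idea for a constant vector of 32-bit elements, assuming at most 64 lanes. This is not the SelectionDAG implementation: numSignBits32 and numSignBitsBuildVector are hypothetical names, and DemandedElts is modelled as a plain bit mask with bit i set when lane i is demanded.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of leading bits equal to the sign bit, counting the sign bit itself.
inline unsigned numSignBits32(int32_t v) {
  uint32_t u = static_cast<uint32_t>(v);
  if (v < 0)
    u = ~u; // turn leading ones into leading zeros
  unsigned n = 0;
  for (int bit = 31; bit >= 0 && ((u >> bit) & 1) == 0; --bit)
    ++n;
  return n;
}

// Minimum sign-bit count over the demanded lanes only. With no lane demanded
// this sketch returns the full width of 32.
inline unsigned numSignBitsBuildVector(const std::vector<int32_t> &elts,
                                       uint64_t demandedElts) {
  unsigned result = 32;
  for (std::size_t i = 0; i < elts.size() && i < 64; ++i) {
    if (((demandedElts >> i) & 1) == 0)
      continue; // lane not demanded, ignore its value
    result = std::min(result, numSignBits32(elts[i]));
  }
  return result;
}
```

For the lanes {1, 0x12345678, -1, 255}, demanding all elements gives 3, while demanding only lanes 0 and 2 gives 31, which is the extra precision the DemandedElts argument is meant to expose.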

Followup to D25691.

Differential Revision: https://reviews.llvm.org/D31311

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299219 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-31 13:54:09 +00:00
Simon Pilgrim
07898901df [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.

Differential Revision: https://reviews.llvm.org/D31249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299201 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-31 11:24:16 +00:00
Simon Pilgrim
cac5a6fb06 Spelling mistakes in comments. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299069 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-30 12:30:15 +00:00
Davide Italiano
ad6daf4031 [X86IselLowering] Remove extraneous semicolon. NFCI.
Unbreaks the build with GCC -Werror.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299030 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-29 21:34:58 +00:00
Simon Pilgrim
d7f209a331 [X86] Tidied up comment - we don't custom lower add/sub i64 on i686 anymore. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299004 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-29 15:41:58 +00:00
Simon Pilgrim
2c2eb599d3 Spelling mistakes in comments. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299000 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-29 15:27:24 +00:00
Simon Pilgrim
d0ea014431 [X86][AVX2] Prevent unary interleaving patterns from calling lowerVectorShuffleAsSplitOrBlend (PR32453)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298993 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-29 13:00:00 +00:00
Simon Pilgrim
c9e0a0dbb1 [X86][MMX] Match MMX fp_to_sint conversions from XMM registers
We currently perform the various fp_to_sint XMM conversion and then transfer to the MMX register (on 32-bit via the stack).

This patch improves support for MOVDQ2Q XMM to MMX transfers and adds the XMM->MMX fp_to_sint direct conversion patterns. The SSE2 specifications are the same as for XMM->XMM and XMM->MMX rounding/exceptions/etc.

Differential Revision: https://reviews.llvm.org/D30868

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 21:32:11 +00:00
Sanjay Patel
e26bd3af58 [x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality
Follow-up to:
https://reviews.llvm.org/rL298775


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298933 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 17:23:49 +00:00
Simon Pilgrim
a4ee850374 [X86][AVX2] Add support for combining v16i16 shuffles to VPBLENDW
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298929 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 16:40:38 +00:00
Simon Pilgrim
1a36c64204 [X86][SSE] Refactored shuffle BLEND combining to make future 16i16 support easier. NFCI.
Call the matchVectorShuffleAsBlend test as early as possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298925 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 15:50:23 +00:00
Simon Pilgrim
c47b59c064 Fix signed/unsigned comparison warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298917 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 13:40:09 +00:00
Simon Pilgrim
3d39cb0a48 [X86][SSE] Begin merging vector shuffle to BLEND for lowering and combining.
Split off matchVectorShuffleAsBlend from lowerVectorShuffleAsBlend for reuse in combining.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298914 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 13:05:48 +00:00
Simon Pilgrim
f67785e440 [X86][SSE] Set second operand to undef instead of first operand in unary shuffle combines.
Copy isn't necessary after the matchVectorShuffleWithUNPCK refactor and undef value will make some future undef/zero handling easier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298910 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 12:16:42 +00:00
Simon Pilgrim
eb81f2b1a2 Strip trailing whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298909 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 11:15:17 +00:00
Gadi Haber
cd2a3f9e73 [X86][AVX2] bugzilla bug 21281 Performance regression in vector interleave in AVX2
This is a patch for ongoing Bugzilla bug 21281, concerning the X86 code generated for a matrix transpose8x8 subroutine that requires vector interleaving. The code generated for AVX2 is currently non-optimal and requires 60 instructions, as opposed to only 40 instructions generated for AVX1.
The patch includes a fix for the AVX2 case where vector unpack instructions use fewer operations than the vector blend operations available in AVX2.
In this case using vector unpack instructions is more efficient.

Reviewers: zvi, delena, igorb, craig.topper, guyblank, eladcohen, m_zuckerman, aymanmus, RKSimon



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298840 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 12:13:37 +00:00
Simon Pilgrim
1dc69d0cab [X86][SSE] Add computeKnownBitsForTargetNode support for (V)PSLL/(V)PSRL instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-26 13:17:55 +00:00
Simon Pilgrim
d7e1b091d0 [X86] Pull out repeated ScalarValueSizeInBits code. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-25 21:22:12 +00:00
Simon Pilgrim
f47742a184 [X86][SSE] Combine (VSRLI (VSRAI X, Y), (NumSignBits-1)) -> (VSRLI X, (NumSignBits-1))
Part 3 of 3.

Differential Revision: https://reviews.llvm.org/D31347


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298782 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-25 20:43:01 +00:00
Simon Pilgrim
284e861f70 [X86][SSE] Added ComputeNumSignBitsForTargetNode support for (V)PSRAI
Part 2 of 3.

Differential Revision: https://reviews.llvm.org/D31347


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298780 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-25 19:58:36 +00:00
Simon Pilgrim
4671240f7d [X86][SSE] Generalised CMP+AND1 combine to ZERO/ALLBITS+MASK
Patch to generalize combinePCMPAnd1 (for handling SETCC + ZEXT cases) to work for any input that has zero/all bits set masked with an 'all low bits' mask.

Replaced the implicit assumption of shift availability with a call to SupportedVectorShiftWithImm.
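
The equivalence behind this combine can be checked on a single 32-bit lane. This is a scalar sketch with hypothetical helper names, assuming the lane is either all zeros or all ones and the mask covers between 1 and 32 low bits.

```cpp
#include <cstdint>

// lane & (mask with the low `lowBits` bits set); requires 1 <= lowBits <= 32.
inline uint32_t maskWithAnd(uint32_t lane, unsigned lowBits) {
  uint32_t mask = lowBits == 32 ? 0xFFFFFFFFu : ((1u << lowBits) - 1u);
  return lane & mask;
}

// The same result from a logical right shift, valid only when the lane is
// 0 or 0xFFFFFFFF; requires 1 <= lowBits <= 32.
inline uint32_t maskWithShift(uint32_t lane, unsigned lowBits) {
  return lane >> (32 - lowBits);
}
```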

Part 1 of 3.

Differential Revision: https://reviews.llvm.org/D31347


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298779 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-25 19:50:14 +00:00
Sanjay Patel
a39e64c0aa [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality
This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, 
we can replace memcmp calls with inline code that is both smaller and faster.
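
Written by hand with SSE2 intrinsics, the kind of inline sequence described here looks roughly like the sketch below. equal16 is a hypothetical helper, not code from the patch, and it falls back to memcmp when SSE2 is not available at compile time.

```cpp
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Returns true when the 16 bytes at a and b are equal.
inline bool equal16(const void *a, const void *b) {
#if defined(__SSE2__)
  __m128i va = _mm_loadu_si128(static_cast<const __m128i *>(a));
  __m128i vb = _mm_loadu_si128(static_cast<const __m128i *>(b));
  __m128i eq = _mm_cmpeq_epi8(va, vb);    // 0xFF in every byte that matches
  return _mm_movemask_epi8(eq) == 0xFFFF; // PMOVMSKB: all 16 bytes matched
#else
  return std::memcmp(a, b, 16) == 0;
#endif
}
```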

Differential Revision: https://reviews.llvm.org/D31290


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298775 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-25 16:05:33 +00:00
Simon Pilgrim
ce91f53688 [X86][SSE] Generalised lowerTruncate by PACKSS to work with any 'zero/all bits' result, not just comparisons.
Added vector compare opcodes to X86TargetLowering::ComputeNumSignBitsForTargetNode

Covered by existing tests added for D22814.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298704 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-24 16:12:31 +00:00
Eric Christopher
7322394a8d Remove the subtarget argument from LowerFP_TO_INT since there's one
stored on X86TargetLowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298628 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 17:35:08 +00:00
Eric Christopher
415d5ca555 Remove unused X86Subtarget argument from getOnesVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298627 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 17:35:06 +00:00
Simon Pilgrim
de4fa98807 [X86][SSE] Extract elements from narrower shuffle masks.
Add support for widening narrow shuffle masks so we can directly extract from the relevant input vector of the shuffle.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 16:09:34 +00:00
Simon Pilgrim
bec15f22ea [X86][SSE] Tidyup canWidenShuffleElements. NFCI.
Pull out mask elements at the start, allowing us to make the widening pattern matching more readable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298594 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 13:33:03 +00:00
Michael Zuckerman
414ca0751d [X86][TD][vpmovm2] New TD pattern for the vpmovm2 instruction
Up until now, the vpmovm2 instruction described its destination operand size
by the source operand size. This patch adds a new pattern for the vpmovm2
instruction. The node describes a new expansion of the destination (from
{128|256} to 512).

Differential Revision: https://reviews.llvm.org/D30654


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298586 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 09:57:01 +00:00
Eric Christopher
9e695dc5e8 Clean up some Subtarget uses and casts in the X86 backend, removing unnecessary work or calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298555 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-22 22:44:52 +00:00
Reid Kleckner
6707770d48 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding to the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298393 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 16:57:19 +00:00
Sanjay Patel
abbd326847 [x86] use PMOVMSK for vector-sized equality comparisons
We could do better by splitting any oversized type into whatever vector size the target supports, 
but I left that for future work if it ever comes up. The motivating case is memcmp() calls on 16-byte
structs, so I think we can wire that up with a TLI hook that feeds into this.

Differential Revision: https://reviews.llvm.org/D31156


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298376 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 13:50:33 +00:00
Evgeniy Stepanov
3fd77c94a3 [Fuchsia] Use %gs for ABI slots under -mcmodel=kernel
Make x86_64-fuchsia targets under -mcmodel=kernel use %gs rather
than %fs to access ABI slots for stack-protector and safe-stack

Patch by Roland McGrath.

Differential Revision: https://reviews.llvm.org/D30870

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-20 20:35:37 +00:00