58 Commits

Elena Demikhovsky
01736613c2 [X86 Codegen Test] Divided masked_memop into several files. NFC.
The masked_memop.ll file became huge, so I extracted the AVX-512-specific tests into separate files.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281892 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-19 08:58:43 +00:00
Craig Topper
bda8c85789 [AVX-512] Simplify X86InstrInfo::copyPhysReg for 128/256-bit vectors with AVX512, but not VLX. We should use the VEX opcodes and trust the register allocator not to use the extended XMM/YMM register space.
Previously we were extending to copying the whole ZMM register. The register allocator shouldn't use XMM16-31 or YMM16-31 in this configuration as the instructions to spill them aren't available.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280648 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-05 06:43:06 +00:00
Igor Breger
02eff5c342 revert r279960.
https://llvm.org/bugs/show_bug.cgi?id=30249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280625 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-04 14:03:52 +00:00
Igor Breger
84cb7f4d14 [AVX512] In some cases the KORTEST instruction may be used instead of a ZEXT + TEST sequence.
Differential Revision: http://reviews.llvm.org/D23490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279960 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 08:52:52 +00:00
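A minimal, era-appropriate LLVM IR sketch of the pattern the change above targets (function and value names are hypothetical):

    define i1 @any_lane_equal(<16 x i32> %a, <16 x i32> %b) {
      ; On AVX-512 the compare result lives in a k-register.
      %cmp = icmp eq <16 x i32> %a, %b
      %bits = bitcast <16 x i1> %cmp to i16
      ; Previously this lowered to kmov + zext + test; with this change the
      ; compare against zero can become a single KORTESTW on the mask register.
      %any = icmp ne i16 %bits, 0
      ret i1 %any
    }
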
Michael Kuperstein
5858ccdfc7 Revert r274613 because it breaks the test suite with AVX512
This reverts most of r274613 (AKA r274626) and its follow-ups (r276347, r277289),
due to miscompiles in the test suite. The FastISel change was left in, because
it apparently fixes an unrelated issue.

(Recommit of r279782 which was broken due to a bad merge.)

This fixes 4 out of the 5 test failures in PR29112.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279788 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-25 22:48:11 +00:00
Michael Kuperstein
efd12f4af5 Revert r279782 due to debug buildbot breakage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279785 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-25 22:14:45 +00:00
Michael Kuperstein
0eddaa0d42 Revert r274613 because it breaks the test suite with AVX512
This reverts most of r274613 and its follow-ups (r276347, r277289), due to
miscompiles in the test suite. The FastISel change was left in, because it
apparently fixes an unrelated issue.

This fixes 4 out of the 5 test failures in PR29112.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-25 21:55:41 +00:00
Igor Breger
33ca8b83f6 [AVX512] Fix extractelement i1 lowering.
The previous (non-custom) implementation doesn't enforce zeroing of the upper bits. The assumption is that an i1 PRODUCER (truncate and extractelement) must zero all upper bits, so i1 CONSUMER instructions (test, zext, save, etc.) can be done without additional zeroing.
Make extractelement i1 lowering custom for all vector-of-i1 types.

Differential Revision: http://reviews.llvm.org/D23246

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278328 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 12:13:46 +00:00
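A sketch of the producer/consumer pattern described above (names hypothetical); with custom lowering, the extract itself must guarantee that the upper bits of the result register are zero:

    define i32 @extract_mask_bit(<8 x i64> %a, <8 x i64> %b) {
      %mask = icmp sgt <8 x i64> %a, %b             ; i1 PRODUCER
      %bit = extractelement <8 x i1> %mask, i32 2
      %res = zext i1 %bit to i32                    ; i1 CONSUMER relies on zeroed upper bits
      ret i32 %res
    }
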
Craig Topper
f015e11376 [AVX512] Add VLX packed move instructions to the execution dependency fix pass and update tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277304 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-31 20:20:01 +00:00
Craig Topper
22cd3ed2a2 [AVX512] Add ExeDomain to vector extend and truncate instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276394 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 05:46:44 +00:00
Craig Topper
f876acdcd8 [AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276393 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 05:00:52 +00:00
Simon Pilgrim
54eb8ad654 [X86][SSE] Allow folding of store/zext with PEXTRW of 0'th element
Under normal circumstances we prefer the higher performance MOVD to extract the 0'th element of a v8i16 vector instead of PEXTRW.

But as detailed on PR27265, this prevents the SSE41 implementation of PEXTRW from folding the store of the 0'th element. Additionally it prevents us from making use of the fact that the (SSE2) reg-reg version of PEXTRW implicitly zero-extends the i16 element to the i32/i64 destination register.

This patch only lowers to MOVD preferentially when we will not be zero-extending the extracted i16 and when it would not prevent a store from being folded (on SSE41).

Fix for PR27265.

Differential Revision: https://reviews.llvm.org/D22509

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276289 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-21 14:54:17 +00:00
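Two small IR functions illustrating the cases above (function names hypothetical): a store of element 0 that the SSE4.1 PEXTRW can fold, and a zero-extend that the reg-reg PEXTRW performs implicitly:

    define void @store_elt0(<8 x i16> %v, i16* %p) {
      %e = extractelement <8 x i16> %v, i32 0
      store i16 %e, i16* %p          ; foldable as: pextrw $0, %xmm0, (%rdi)
      ret void
    }

    define i32 @zext_elt0(<8 x i16> %v) {
      %e = extractelement <8 x i16> %v, i32 0
      %z = zext i16 %e to i32        ; reg-reg pextrw already zero-extends to the GPR
      ret i32 %z
    }
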
Craig Topper
e70f2b66e1 [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275764 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:39 +00:00
Craig Topper
0c4677f3cc [AVX512] Use VMOVAPSZ128rr/VMOVAPSZ256rr for VR128X/VR256X physreg moves when VLX is supported.
Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275763 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:34 +00:00
Craig Topper
b6d6904481 [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all-ones vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275045 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 05:36:48 +00:00
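A sketch of the constant this lowering applies to; the instruction in the comment is an assumption based on the commit title:

    define <16 x i32> @all_ones() {
      ; With this change, materialized as roughly:
      ;   vpternlogd $0xff, %zmm0, %zmm0, %zmm0
      ; rather than loading a 64-byte constant from memory.
      ret <16 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1,
                      i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1>
    }
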
Matthias Braun
79519fecc3 VirtRegMap: Replace some identity copies with KILL instructions.
An identity COPY like this:
   %AL = COPY %AL, %EAX<imp-def>
has no semantic effect, but encodes liveness information: Further users
of %EAX only depend on this instruction even though it does not define
the full register.

Replace the COPY with a KILL instruction in those cases to maintain this
liveness information. (This reverts a small part of r238588 but this
time adds a comment explaining why a KILL instruction is useful).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274952 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 00:19:07 +00:00
Artur Pilipenko
48917c9e44 Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the r263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).

This patch fixes a problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.

The fix is to add the pointer type as another overloaded type to the @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 18:27:25 +00:00
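A sketch of what the overload enables, in the typed-pointer IR of the period (the exact mangled suffix is an assumption based on the description above):

    declare <8 x float> @llvm.masked.load.v8f32.p1v8f32(
        <8 x float> addrspace(1)*, i32, <8 x i1>, <8 x float>)

    define <8 x float> @load_as1(<8 x float> addrspace(1)* %p, <8 x i1> %m) {
      ; Before this change the pointer argument had to be addrspace(0),
      ; so this call would hit the bad-signature assertion.
      %v = call <8 x float> @llvm.masked.load.v8f32.p1v8f32(
               <8 x float> addrspace(1)* %p, i32 4, <8 x i1> %m, <8 x float> undef)
      ret <8 x float> %v
    }
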
Artur Pilipenko
be0da39a48 Revert r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273895 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:54:33 +00:00
Artur Pilipenko
9227558e8e Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the r263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).

This patch fixes a problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.

The fix is to add the pointer type as another overloaded type to the @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273892 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:29:26 +00:00
Craig Topper
022094446e [AVX512] Add patterns for extracting subvectors and storing to memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270334 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-21 22:50:14 +00:00
Craig Topper
945c4ac1dc [AVX512] Add patterns for VEXTRACT v16i16->v8i16 and v32i8->v16i8. Disable AVX2 versions of vector extract when AVX512VL is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270318 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-21 07:08:56 +00:00
Craig Topper
dba67a4fdb [AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268884 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-08 21:33:53 +00:00
Adam Nemet
cf0a711bff Revert "Support arbitrary addrspace pointers in masked load/store intrinsics"
This reverts commit r266086.

It breaks the LTO build of gcc in SPEC2000.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266282 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 08:47:17 +00:00
Artur Pilipenko
80ce67004b Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the r263158 change.

This patch fixes a problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.

The fix is to add the pointer type as another overloaded type to the @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266086 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 15:58:04 +00:00
Matthias Braun
a31e891389 Revert "Support arbitrary addrspace pointers in masked load/store intrinsics"
This commit broke LTO builds. Reverting it to unbreak the bots while the
issue is investigated. See also:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160321/341002.html

This reverts r263158

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264088 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-22 20:24:34 +00:00
Sanjay Patel
d26914da05 [x86, AVX] replace masked load with full vector load when possible
Converting masked vector loads to regular vector loads for x86 AVX should always be a win.
I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any
objections.

1. x86 already does this kind of optimization for multiple scalar loads -> vector load.
2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner.

Differential Revision: http://reviews.llvm.org/D18094



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263446 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-14 16:54:43 +00:00
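A hedged sketch of the case above, using the data-type-only intrinsic mangling of the period (names hypothetical):

    declare <4 x float> @llvm.masked.load.v4f32(<4 x float>*, i32, <4 x i1>, <4 x float>)

    define <4 x float> @all_ones_mask(<4 x float>* %p) {
      ; The mask is constant all-ones, so this becomes a plain 16-byte vector load.
      %v = call <4 x float> @llvm.masked.load.v4f32(<4 x float>* %p, i32 4,
               <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x float> undef)
      ret <4 x float> %v
    }
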
Artur Pilipenko
980df33d17 Support arbitrary addrspace pointers in masked load/store intrinsics
This patch fixes a problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.

The fix is to add the pointer type as another overloaded type to the @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263158 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 20:39:22 +00:00
Sanjay Patel
3ad244cde2 give regression test a meaningful name
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263135 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 17:52:19 +00:00
Sanjay Patel
40847a4c84 [x86, AVX] optimize masked loads with constant masks
Instead of a variable-blend instruction, form a blend with an immediate, because those are always cheaper.

Differential Revision: http://reviews.llvm.org/D17899


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263067 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 22:12:08 +00:00
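A sketch of the constant-mask case (names hypothetical): with the mask known at compile time, the blend with the pass-through operand can use an immediate-controlled blend instead of a variable vblendvps:

    declare <4 x float> @llvm.masked.load.v4f32(<4 x float>*, i32, <4 x i1>, <4 x float>)

    define <4 x float> @const_mask(<4 x float>* %p, <4 x float> %passthru) {
      ; Lanes 0-1 come from memory, lanes 2-3 from %passthru; the lane
      ; selection can be encoded in the blend immediate.
      %v = call <4 x float> @llvm.masked.load.v4f32(<4 x float>* %p, i32 4,
               <4 x i1> <i1 true, i1 true, i1 false, i1 false>, <4 x float> %passthru)
      ret <4 x float> %v
    }
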
Igor Breger
c3bc454e83 AVX512BW: Support the llvm masked vector load/store intrinsics for i8/i16 element types on SKX
Differential Revision: http://reviews.llvm.org/D17913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262803 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-06 12:38:58 +00:00
Igor Breger
64fd08f76f AVX512: Remove VSHRI kmask patterns from the TD file. It is incorrect to use kshiftw to implement VSHRI for v4i1: bits 15-4 are undef, so the upper bits of the v4i1 may not be zeroed. v4i1 should be zero-extended to v16i1 (or any natively supported vector).
Differential Revision: http://reviews.llvm.org/D17763

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262797 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-06 07:46:03 +00:00
Sanjay Patel
a6cab8c59e [x86] add tests for masked loads with constant masks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262758 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-04 23:28:07 +00:00
Sanjay Patel
70cbdcc9e5 [x86] convert masked load of exactly one element to scalar load
This is the load counterpart to the store optimization that was added in:
http://reviews.llvm.org/rL260145



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260325 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 23:44:35 +00:00
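A sketch of the single-lane load case (names hypothetical, intrinsic mangling of the period):

    declare <4 x i32> @llvm.masked.load.v4i32(<4 x i32>*, i32, <4 x i1>, <4 x i32>)

    define <4 x i32> @one_lane_load(<4 x i32>* %p, <4 x i32> %passthru) {
      ; Exactly one mask element is set, so this can become a scalar i32 load
      ; of lane 2 plus an insertelement, with no masked load at all.
      %v = call <4 x i32> @llvm.masked.load.v4i32(<4 x i32>* %p, i32 4,
               <4 x i1> <i1 false, i1 false, i1 true, i1 false>, <4 x i32> %passthru)
      ret <4 x i32> %v
    }
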
Sanjay Patel
ab75a1751c [x86] convert masked store of one element to scalar store
Another opportunity to reduce masked stores: in D16691, we decided not to attempt the 'one mask element is set'
transform in InstCombine, but this should be a win for any AVX machine.

Code comments note that this transform could be extended for other targets / cases.

Differential Revision: http://reviews.llvm.org/D16828



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260145 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-08 21:05:08 +00:00
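The store-side counterpart, again as a hedged sketch (names hypothetical):

    declare void @llvm.masked.store.v4i32(<4 x i32>, <4 x i32>*, i32, <4 x i1>)

    define void @one_lane_store(<4 x i32> %v, <4 x i32>* %p) {
      ; Only lane 0 is set: extract lane 0 and emit a scalar i32 store.
      call void @llvm.masked.store.v4i32(<4 x i32> %v, <4 x i32>* %p, i32 4,
               <4 x i1> <i1 true, i1 false, i1 false, i1 false>)
      ret void
    }
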
Igor Breger
3c3041375c AVX1 : Enable vector masked_load/store to AVX1.
Use the AVX1 FP instructions (vmaskmovps/pd) in place of the AVX2 integer instructions (vpmaskmovd/q).

Differential Revision: http://reviews.llvm.org/D16528

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258675 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-25 10:17:11 +00:00
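A sketch of an integer masked load that AVX1 can now handle (names hypothetical); per the description above, the integer data is handled with the FP vmaskmovps form since AVX1 lacks vpmaskmovd:

    declare <8 x i32> @llvm.masked.load.v8i32(<8 x i32>*, i32, <8 x i1>, <8 x i32>)

    define <8 x i32> @avx1_masked_load(<8 x i32>* %p, <8 x i32> %trigger) {
      %m = icmp slt <8 x i32> %trigger, zeroinitializer
      %v = call <8 x i32> @llvm.masked.load.v8i32(<8 x i32>* %p, i32 4,
               <8 x i1> %m, <8 x i32> undef)
      ret <8 x i32> %v
    }
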
Sanjay Patel
d33ffc66c6 regenerate checks and note some near-term improvements
For the moment, this file takes way too long to run (see inline comments), but
that should be a temporary problem. The fact that the compile time is so slow
for a target that doesn't support maskmov may be a bug worth investigating too.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258629 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 17:52:56 +00:00
Sanjay Patel
588841dc86 fixed to test features, not CPU models
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258568 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 22:20:56 +00:00
Igor Breger
3f202fdf9e AVX512: Change the VPMOVB2M DAG lowering to use the CVT2MASK node instead of TRUNCATE.
Fix TRUNCATE lowering from vector to vector of i1: use the LSB and not the MSB.
Implement the VPMOVB/W/D/Q2M intrinsics.

Differential Revision: http://reviews.llvm.org/D15675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256470 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 13:56:16 +00:00
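A sketch of the truncate semantics being fixed (names hypothetical):

    define <16 x i1> @trunc_to_mask(<16 x i8> %v) {
      ; TRUNCATE to i1 must take each lane's least significant bit.
      ; vpmovb2m tests the sign (most significant) bit, so it is only a
      ; valid lowering when the MSB is known to match the LSB.
      %m = trunc <16 x i8> %v to <16 x i1>
      ret <16 x i1> %m
    }
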
Elena Demikhovsky
56a8a2731e Type legalizer for masked gather and scatter intrinsics.
A full type legalizer that works with all vector lengths, from 2 to 16 elements (i32, i64, float, double).

For example, this intrinsic:
void @llvm.masked.scatter.v2f32(<2 x float> %data, <2 x float*> %ptrs, i32 align, <2 x i1> %mask)
requires type widening for the data and type promotion for the mask.

Differential Revision: http://reviews.llvm.org/D13633



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15 08:40:41 +00:00
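A concrete call matching the signature quoted above (value names hypothetical):

    declare void @llvm.masked.scatter.v2f32(<2 x float>, <2 x float*>, i32, <2 x i1>)

    define void @scatter_two(<2 x float> %data, <2 x float*> %ptrs, <2 x i1> %mask) {
      ; The data and pointer vectors are widened and the mask is
      ; promoted/widened with false lanes so no extra stores occur.
      call void @llvm.masked.scatter.v2f32(<2 x float> %data, <2 x float*> %ptrs,
               i32 4, <2 x i1> %mask)
      ret void
    }
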
Elena Demikhovsky
b06ff9b1e1 AVX-512: Fixed masked load / store instruction selection for KNL.
Patterns were missing for the KNL target for <8 x i32> and <8 x float> masked load/store.

This intrinsic comes with all legal types:
<8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru),
but it still requires lowering, because VMASKMOVPS and VMASKMOVDQU32 work with 512-bit vectors only.

All data operands should be widened to a 512-bit vector, and the mask operand should be widened to v16i1 with zeroes.

Differential Revision: http://reviews.llvm.org/D15265



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254909 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-07 13:39:24 +00:00
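A call of the form quoted above (value names hypothetical):

    declare <8 x float> @llvm.masked.load.v8f32(<8 x float>*, i32, <8 x i1>, <8 x float>)

    define <8 x float> @knl_masked_load(<8 x float>* %p, <8 x i1> %m, <8 x float> %pt) {
      ; On KNL this widens to a 512-bit operation: <16 x float> data and a
      ; v16i1 mask whose upper eight lanes are zero.
      %v = call <8 x float> @llvm.masked.load.v8f32(<8 x float>* %p, i32 4,
               <8 x i1> %m, <8 x float> %pt)
      ret <8 x float> %v
    }
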
Elena Demikhovsky
43be5f580c Pointers in Masked Load, Store, Gather, Scatter intrinsics
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.

Differential Revision: http://reviews.llvm.org/D14150



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 07:17:16 +00:00
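A sketch of a masked load with pointer-typed elements, in the typed-pointer syntax of the period (the mangled suffix here is illustrative, not verified):

    declare <2 x i8*> @llvm.masked.load.v2p0i8(<2 x i8*>*, i32, <2 x i1>, <2 x i8*>)

    define <2 x i8*> @load_pointers(<2 x i8*>* %p, <2 x i1> %m, <2 x i8*> %pt) {
      ; Pointer element types are now accepted alongside integer and FP types.
      %v = call <2 x i8*> @llvm.masked.load.v2p0i8(<2 x i8*>* %p, i32 8,
               <2 x i1> %m, <2 x i8*> %pt)
      ret <2 x i8*> %v
    }
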
Elena Demikhovsky
c57d9bd443 Masked Load/Store optimization for scalar code
When we have to convert masked.load / masked.store to scalar code, we generate a chain of conditional basic blocks.
I added an optimization for constant mask vectors.

Differential Revision: http://reviews.llvm.org/D13855



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250893 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-21 11:50:54 +00:00
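A sketch of the constant-mask case the optimization covers (names hypothetical):

    declare void @llvm.masked.store.v2i64(<2 x i64>, <2 x i64>*, i32, <2 x i1>)

    define void @scalarized_store(<2 x i64> %v, <2 x i64>* %p) {
      ; A variable mask scalarizes to a chain of "if (mask[i]) store lane i"
      ; basic blocks; with this constant mask only lane 1's store is emitted
      ; and no conditional blocks are needed.
      call void @llvm.masked.store.v2i64(<2 x i64> %v, <2 x i64>* %p, i32 8,
               <2 x i1> <i1 false, i1 true>)
      ret void
    }
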
Simon Pilgrim
fa2c240c15 [DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level
The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks.

This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading.

There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.)

Differential Revision: http://reviews.llvm.org/D11518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243831 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-01 10:01:46 +00:00
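A byte-level example of the generalization (names hypothetical):

    define <4 x i32> @and_as_shuffle(<4 x i32> %v) {
      ; Each lane's 0x0000ffff constant keeps the low two bytes and clears
      ; the high two, so the AND can be rewritten as a shuffle that pulls
      ; the cleared bytes from a zero vector.
      %r = and <4 x i32> %v, <i32 65535, i32 65535, i32 65535, i32 65535>
      ret <4 x i32> %r
    }
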
Igor Breger
a1692b30cb AVX-512: Implemented encoding, DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering, encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D11218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243122 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-24 17:24:15 +00:00
Chandler Carruth
0451957993 Revert r242990: "AVX-512: Implemented encoding , DAG lowering and ..."
This commit broke the build. Numerous build bots broken, and it was
blocking my progress so reverting.

It should be trivial to reproduce -- enable the BPF backend and it
should fail when running llvm-tblgen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242992 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-23 08:03:44 +00:00
Igor Breger
cb8fe113a3 AVX-512: Implemented encoding, DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering, encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D11218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242990 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-23 07:39:21 +00:00
Elena Demikhovsky
2d05c885ff Masked gather and scatter intrinsics - enabled codegen for KNL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-03 07:12:25 +00:00
Elena Demikhovsky
e670dc7848 AVX-512, SKX: Enabled masked_load/store operations for this target.
Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors,
which is needed to pass all masked_memop.ll tests for SKX.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231371 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 15:11:35 +00:00
Chandler Carruth
454c3997b4 [x86] Teach the 128-bit vector shuffle lowering routines to take
advantage of the existence of a reasonable blend instruction.

The 256-bit vector shuffle lowering has leveraged the general technique
of decomposed shuffles and blends for quite some time, but this never
made it back into the 128-bit code, and there are a large number of
patterns where this is substantially better. For example, this removes
almost all domain crossing in vector shuffles that involve some blend
and some permutation with SSE4.1 and later. See the massive reduction
in 'shufps' for integer test cases in this commit.

This isn't perfect yet for a few reasons:

1) The v8i16 shuffle lowering continues to plague me. We don't always
   form an unpack-based blend when that would be better. But the wins
   pretty drastically outstrip the losses here.
2) The v16i8 shuffle lowering is just a disaster here. I never went and
   implemented blend support here for some terrible reason. I'll do
   that next probably. I've not updated it for now.

More variations on this technique are coming as well -- we don't
shuffle-into-unpack or shuffle-into-palignr, both of which would also be
profitable.

Note that some test cases grow significantly in the number of
instructions, but I expect them to actually be faster. We use
pshufd+pshufd+blendw instead of a single shufps, but the pshufd's are
very likely to pipeline well (two ports on most modern intel chips) and
the blend is a *very* fast instruction. The domain switch penalty will
essentially always be more than a blend instruction, which is the only
increase in tree height.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229350 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 01:52:02 +00:00
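A small example of a shuffle that is not a pure blend (names hypothetical); lane 0 needs %a's element 1, so the lowering can permute each input with pshufd and finish with a cheap integer blend instead of a domain-crossing shufps:

    define <4 x i32> @permute_then_blend(<4 x i32> %a, <4 x i32> %b) {
      %s = shufflevector <4 x i32> %a, <4 x i32> %b,
               <4 x i32> <i32 1, i32 5, i32 2, i32 7>
      ret <4 x i32> %s
    }
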
Elena Demikhovsky
2785766bc8 Fixed a bug in type legalizer for masked load/store intrinsics.
The problem occurs when, after vectorization, we have the type
<2 x i32>. This type is promoted to <2 x i64> and then requires
additional effort to expand loads and truncate stores.
I added EXPAND / TRUNCATE attributes to the masked load/store
SDNodes. The code now contains additional shuffles.
I've prepared changes to the cost estimation for masked memory
operations; they will be submitted separately.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226808 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-22 12:07:59 +00:00
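A sketch of the problematic type (names hypothetical, intrinsic mangling of the period):

    declare <2 x i32> @llvm.masked.load.v2i32(<2 x i32>*, i32, <2 x i1>, <2 x i32>)

    define <2 x i32> @promoted_load(<2 x i32>* %p, <2 x i1> %m, <2 x i32> %pt) {
      ; <2 x i32> is promoted to <2 x i64>, so the legalized operation needs
      ; an extending masked load (and a truncating masked store on the
      ; store side), per the EXPAND / TRUNCATE attributes described above.
      %v = call <2 x i32> @llvm.masked.load.v2i32(<2 x i32>* %p, i32 4,
               <2 x i1> %m, <2 x i32> %pt)
      ret <2 x i32> %v
    }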