Chris Lattner
cf80e569f6
Move the rest of the PPCTargetLowering::LowerOperation cases out into
...
separate functions, for simplicity and code clarity.
llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
aacabea404
Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
...
functions, which makes the code much cleaner :)
llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
569ea9c6dd
Force non-darwin targets to use a static relo model. This fixes PR734,
...
tested by CodeGen/Generic/vector.ll
llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
e087b8e321
Add a new way to match vector constants, which make it easier to bang bits of
...
different types.
Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1. This compiles:
typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
*P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
*P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
*P3 = vec_abs((vector float)*P3);
}
to:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
vspltisw v0, -1
vslw v0, v0, v0
lvx v1, 0, r3
vand v1, v1, v0
stvx v1, 0, r3
lvx v1, 0, r4
vandc v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vandc v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
instead of (with two constant pool entries):
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
li r6, lo16(LCPI1_0)
lis r7, ha16(LCPI1_0)
li r8, lo16(LCPI1_1)
lis r9, ha16(LCPI1_1)
lvx v0, r7, r6
lvx v1, 0, r3
vand v0, v1, v0
stvx v0, 0, r3
lvx v0, r9, r8
lvx v1, 0, r4
vand v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vand v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
GCC produces (with 2 cp entries):
_test:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc00c
mtspr 256,r0
lis r2,ha16(LC0)
lis r9,ha16(LC1)
la r2,lo16(LC0)(r2)
lvx v0,0,r3
lvx v1,0,r5
la r9,lo16(LC1)(r9)
lwz r12,-4(r1)
lvx v12,0,r2
lvx v13,0,r9
vand v0,v0,v12
stvx v0,0,r3
vspltisw v0,-1
vslw v12,v0,v0
vandc v1,v1,v12
stvx v1,0,r5
lvx v0,0,r4
vand v0,v0,v13
stvx v0,0,r4
mtspr 256,r12
blr
llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner
ce6e988fa6
Rename get_VSPLI_elt -> get_VSPLTI_elt
...
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
602d86f7af
Ensure that zero vectors are always v4i32, which forces them to CSE with
...
each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll
llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Chris Lattner
e12152a64b
Vector function results go into V2 according to GCC. The darwin ABI doc
...
doesn't say where they go :-/
llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
5d1acb831a
Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
...
No functionality change.
llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner
3c6e4a1dc9
properly mark vector selects as expanded to select_cc
...
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner
2ffa288a23
Add VRRC select support
...
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Chris Lattner
8234bfe18e
Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
...
constant pool load.
llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner
e8defcff7d
Change the interface to the predicate that determines if vsplti* can be used.
...
No functionality changes.
llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner
a390188fd4
Make sure to return the result in the right type.
...
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner
c0680ae07e
Match vpku[hw]um(x,x).
...
Convert vsldoi(x,x) to work the same way other (x,x) cases work.
llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
a52d88ee89
Add support for matching vmrg(x,x) patterns
...
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
300076cbd8
Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
...
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
2875bb116e
Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
...
lower it and LLVM to have one fewer intrinsic. This implements
CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner
10fa7be550
Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
...
vperm with a perm mask lvx'd from the constant pool.
llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
d1b47b18ed
Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
...
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng
9e56e97205
Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
...
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner
d1483ca1ad
Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
...
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner
4e99e6dfdd
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
...
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
0128e4d335
Revert accidentally committed hunks.
...
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
57b9e01b3e
Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
...
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner
8ba4723c74
Inform the dag combiner that the predicate compares only return a low bit.
...
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
da4217646a
Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
...
"vspltisb v0, 8" instead of a constant pool load.
llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
336d6646ab
Rearrange code a bit
...
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner
786f782398
Add, sub and shuffle are legal for all vector types
...
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner
e3774da014
note to self: *save* file, then check it in
...
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner
95d358dbdb
Implement an item from the readme, folding vcmp/vcmp. instructions with
...
identical instructions into a single instruction. For example, for:
void test(vector float *x, vector float *y, int *P) {
int v = vec_any_out(*x, *y);
*x = (vector float)vec_cmpb(*x, *y);
*P = v;
}
we now generate:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v0, v1, v0
mfcr r4, 2
stvx v0, 0, r3
rlwinm r3, r4, 27, 31, 31
xori r3, r3, 1
stw r3, 0(r5)
mtspr 256, r2
blr
instead of:
_test:
mfspr r2, 256
oris r6, r2, 57344
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v2, v1, v0
mfcr r4, 2
*** vcmpbfp v0, v1, v0
rlwinm r4, r4, 27, 31, 31
stvx v0, 0, r3
xori r3, r4, 1
stw r3, 0(r5)
mtspr 256, r2
blr
Testcase here: CodeGen/PowerPC/vcmp-fold.ll
llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner
e330741a6c
Lower vector compares to VCMP nodes, just like we lower vector comparison
...
predicates to VCMPo nodes.
llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner
a31d719e0a
Mark INSERT_VECTOR_ELT as expand
...
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Nate Begeman
5a82c8ccbd
Add a few more altivec intrinsics
...
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Chris Lattner
ac98e20cc9
Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
...
same thing and we have a dag node for the former.
llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner
d5da541d42
Tblgen doesn't like multiple SDNode<> definitions that map to the sameenum value. Split them into separate enums.
...
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Nate Begeman
3d518334b9
SelectionDAGISel can now natively handle Switch instructions, in the same
...
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks. The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.
This functionality is currently only enabled on x86, but should be safe for
every target. In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.
llvm-svn: 27156
2006-03-27 01:32:24 +00:00
Chris Lattner
65a455b060
Codegen vector predicate compares.
...
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng
b17bbf8ccb
Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
...
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner
57064915a6
Add some basic patterns for other datatypes
...
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner
0899b16b2d
Codegen things like:
...
<int -1, int -1, int -1, int -1>
and
<int 65537, int 65537, int 65537, int 65537>
Using things like:
vspltisb v0, -1
and:
vspltish v0, 1
instead of using constant pool loads.
This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.
llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Chris Lattner
303dc30593
Disable the i32->float G5 optimization. It is unsafe, as documented in the
...
comment.
This fixes 177.mesa, and McCat/09-vor with the td scheduler.
llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner
ba4966c16c
add support for using vxor to build zero vectors. This implements
...
Regression/CodeGen/PowerPC/vec_zero.ll
llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
f84f3bf95b
When possible, custom lower 32-bit SINT_TO_FP to this:
...
_foo2:
extsw r2, r3
std r2, -8(r1)
lfd f0, -8(r1)
fcfid f0, f0
frsp f1, f0
blr
instead of this:
_foo2:
lis r2, ha16(LCPI2_0)
lis r4, 17200
xoris r3, r3, 32768
stw r3, -4(r1)
stw r4, -8(r1)
lfs f0, lo16(LCPI2_0)(r2)
lfd f1, -8(r1)
fsub f0, f1, f0
frsp f1, f0
blr
This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).
llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner
31a93c7740
These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
...
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner
a498dd25d9
remove dead variable
...
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner
978628896b
Fix a couple of bugs in permute/splat generate, thanks to Nate for actually
...
figuring these out! :)
llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner
dc3605efdb
Add support for generating vspltw, instead of a vperm instruction with a
...
constant pool load. This generates significantly nicer code for splats.
When tblgen gets bugfixed, we can remove the custom selection code.
llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner
8faa2cf693
Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
...
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
Chris Lattner
4b7aa59bbc
fix duplicate definition errors
...
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner
0e56cf0d94
Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
...
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*
llvm-svn: 26887
2006-03-20 01:53:53 +00:00