Chris Lattner
0a12e343e2
Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument
...
handling. This makes the lower argument code significantly simpler (we
only need to handle legal argument types).
Incidentally, this also implements support for vector argument registers,
so long as they are not on the stack.
llvm-svn: 28331
2006-05-16 18:18:50 +00:00
Chris Lattner
44ea12c5f8
Implement an important entry from README_ALTIVEC:
...
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's. Instead, just branch on CR6 directly. :)
For example, for:
void foo2(vector float *A, vector float *B) {
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
}
We now generate:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
instead of:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
cmpwi cr0, r3, 0
beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
This implements CodeGen/PowerPC/vec_br_cmp.ll.
llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner
ce6e988fa6
Rename get_VSPLI_elt -> get_VSPLTI_elt
...
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
e8defcff7d
Change the interface to the predicate that determines if vsplti* can be used.
...
No functionality changes.
llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner
c0680ae07e
Match vpku[hw]um(x,x).
...
Convert vsldoi(x,x) to work the same way other (x,x) cases work.
llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
a52d88ee89
Add support for matching vmrg(x,x) patterns
...
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
300076cbd8
Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
...
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
2875bb116e
Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
...
lower it and LLVM to have one fewer intrinsic. This implements
CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner
10fa7be550
Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
...
vperm with a perm mask lvx'd from the constant pool.
llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
4e99e6dfdd
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
...
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
8ba4723c74
Inform the dag combiner that the predicate compares only return a low bit.
...
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
e330741a6c
Lower vector compares to VCMP nodes, just like we lower vector comparison
...
predicates to VCMPo nodes.
llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner
ac98e20cc9
Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
...
same thing and we have a dag node for the former.
llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner
65a455b060
Codegen vector predicate compares.
...
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng
b17bbf8ccb
Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
...
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner
0899b16b2d
Codegen things like:
...
<int -1, int -1, int -1, int -1>
and
<int 65537, int 65537, int 65537, int 65537>
Using things like:
vspltisb v0, -1
and:
vspltish v0, 1
instead of using constant pool loads.
This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.
llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Chris Lattner
ba4966c16c
add support for using vxor to build zero vectors. This implements
...
Regression/CodeGen/PowerPC/vec_zero.ll
llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
f84f3bf95b
When possible, custom lower 32-bit SINT_TO_FP to this:
...
_foo2:
extsw r2, r3
std r2, -8(r1)
lfd f0, -8(r1)
fcfid f0, f0
frsp f1, f0
blr
instead of this:
_foo2:
lis r2, ha16(LCPI2_0)
lis r4, 17200
xoris r3, r3, 32768
stw r3, -4(r1)
stw r4, -8(r1)
lfs f0, lo16(LCPI2_0)(r2)
lfd f1, -8(r1)
fsub f0, f1, f0
frsp f1, f0
blr
This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).
llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner
4b7aa59bbc
fix duplicate definition errors
...
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner
1cdeda1c5a
Check in some intermediate code that adds a skeleton for matching vsplt*
...
instructions
llvm-svn: 26894
2006-03-20 06:15:45 +00:00
Chris Lattner
0e56cf0d94
Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
...
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*
llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner
789570bafb
Custom lower SCALAR_TO_VECTOR into lve*x.
...
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Evan Cheng
7ec94f2ff7
Added getTargetLowering() to TargetMachine. Refactored targets to support this.
...
llvm-svn: 26742
2006-03-13 23:20:37 +00:00
Chris Lattner
137c02aa60
Compile this:
...
void foo(float a, int *b) { *b = a; }
to this:
_foo:
fctiwz f0, f1
stfiwx f0, 0, r4
blr
instead of this:
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
stw r2, 0(r4)
blr
This implements CodeGen/PowerPC/stfiwx.ll, and also incidentally does the
right thing for GCC bugzilla 26505.
llvm-svn: 26447
2006-03-01 05:50:56 +00:00
Chris Lattner
6ba976b557
Use a target-specific dag-combine to implement CodeGen/PowerPC/fp-int-fp.ll.
...
llvm-svn: 26445
2006-03-01 04:57:39 +00:00
Chris Lattner
6bb2c3e9cd
split register class handling from explicit physreg handling.
...
llvm-svn: 26308
2006-02-22 00:56:39 +00:00
Chris Lattner
a124432746
Updates to match change of getRegForInlineAsmConstraint prototype
...
llvm-svn: 26305
2006-02-21 23:11:00 +00:00
Chris Lattner
abaa15898f
Implement getConstraintType for PPC.
...
llvm-svn: 26042
2006-02-07 20:16:30 +00:00
Chris Lattner
50af3c6eb7
Add the simple PPC integer constraints
...
llvm-svn: 26027
2006-02-07 00:47:13 +00:00
Chris Lattner
892fe31362
add info about the inline asm register constraints for PPC
...
llvm-svn: 25853
2006-01-31 19:20:21 +00:00
Chris Lattner
6a5d2450a3
Use PPCISD::CALL instead of ISD::CALL
...
llvm-svn: 25717
2006-01-27 23:34:02 +00:00
Chris Lattner
59a4f3f637
Make llvm.frame/returnaddr not crash on ppc
...
llvm-svn: 25710
2006-01-27 22:25:06 +00:00
Nate Begeman
d2c6fbef4a
Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for
...
the same functionality. This addresses another piece of bug 680. Next,
on to fixing Alpha VAARG, which I broke last time.
llvm-svn: 25696
2006-01-27 21:09:22 +00:00
Nate Begeman
c29fac7fce
First part of bug 680:
...
Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same
way as everything else.
llvm-svn: 25606
2006-01-25 18:21:52 +00:00
Chris Lattner
a59d6394d2
Give PPCISD:: nodes legible names in dumps.
...
llvm-svn: 25166
2006-01-09 23:52:17 +00:00
Nate Begeman
a114534620
Pattern-match return. Includes gross hack!
...
llvm-svn: 24874
2005-12-20 00:26:01 +00:00
Nate Begeman
1700fe3f71
Prepare support for AltiVec multiply, divide, and sqrt.
...
llvm-svn: 24700
2005-12-13 22:55:22 +00:00
Chris Lattner
68a0fed879
Use new PPC-specific nodes to represent shifts which require the 6-bit
...
amount handling that PPC provides. These are generated by the lowering code
and prevents the dag combiner from assuming (rightfully) that the shifts
don't only look at 5 bits. This fixes a miscompilation of crafty with
the new front-end.
llvm-svn: 24615
2005-12-06 02:10:38 +00:00
Chris Lattner
8d04987a39
Add an initial hack at legalizing GlobalAddress into the appropriate nodes
...
on Darwin to remove smarts from the isel. This is currently disabled by
default (uncomment setOperationAction(ISD::GlobalAddress to enable it).
tblgen needs to become smarter about tglobaladdr nodes and bigger patterns
needed to be added to the .td file. However, we can currently emit stuff like
this: :)
li r2, lo16(L_x$non_lazy_ptr)
lis r3, ha16(L_x$non_lazy_ptr)
lwzx r2, r3, r2
The obvious improvements will follow.
llvm-svn: 24390
2005-11-17 07:30:41 +00:00
Nate Begeman
ee581735d9
Add the ability to lower return instructions to TargetLowering. This
...
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).
llvm-svn: 23802
2005-10-18 23:23:37 +00:00
Nate Begeman
723637974b
More PPC32 -> PPC changes, as well as merging some classes that were
...
redundant after the change.
llvm-svn: 23759
2005-10-16 05:39:50 +00:00
Chris Lattner
0405f3388f
Rename PowerPC*.h to PPC*.h
...
llvm-svn: 23743
2005-10-14 23:51:18 +00:00
Nate Begeman
718cae4eba
Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
...
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.
This speeds up 189.lucas from 81.99 to 32.64 seconds.
llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Chris Lattner
914a0dbba1
Move FCTIWZ handling out of the instruction selectors and into legalization,
...
getting them out of the business of making stack slots.
llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner
f470259b50
implement SELECT_CC fully for the DAG->DAG isel!
...
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner
04b88ca768
Make fsel emission work with both the pattern and dag-dag selectors, by
...
giving it a non-instruction opcode. The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.
llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner
dae9a4899f
add initial support for converting select_cc -> fsel in the legalizer
...
instead of in the backend. This currently handles fsel cases with registers,
but doesn't have the 0.0 and -0.0 optimization enabled yet.
Once this is finished, special hack for fp immediates can go away.
llvm-svn: 23075
2005-08-26 00:52:45 +00:00
Chris Lattner
a4c5954c52
Pull the LLVM -> DAG lowering code out of the pattern selector so that it
...
can be shared with the DAG->DAG selector.
llvm-svn: 22799
2005-08-16 17:14:42 +00:00