llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-05-23 23:46:11 +00:00

Author	SHA1	Message	Date
Chris Lattner	51e1b75fba	Fix some ppc64 issues with vector code. llvm-svn: 29384	2006-07-28 16:45:47 +00:00
Chris Lattner	b4165c39d7	Rename RelocModel::PIC to PIC_, to avoid conflicts with -DPIC. llvm-svn: 29307	2006-07-26 21:12:04 +00:00
Chris Lattner	abaaddc214	Implement Regression/CodeGen/PowerPC/bswap-load-store.ll by folding bswaps into i16/i32 load/stores. llvm-svn: 29089	2006-07-10 20:56:58 +00:00
Chris Lattner	2c3f67f6a7	Implement 64-bit select, bswap, etc. llvm-svn: 28935	2006-06-27 20:14:52 +00:00
Chris Lattner	8569f4042d	PPC doesn't have bit converts to/from i64 llvm-svn: 28932	2006-06-27 18:40:08 +00:00
Chris Lattner	26f2bd4d4b	Implement 64-bit undef, sub, shl/shr, srem/urem llvm-svn: 28929	2006-06-27 18:18:41 +00:00
Chris Lattner	b4a636f966	Use i32 for shift amounts instead of i64. This gets bisort working. llvm-svn: 28927	2006-06-27 17:34:57 +00:00
Chris Lattner	494f476ca7	Implement a bunch of 64-bit cleanliness work. With this, treeadd builds (but doesn't work right). llvm-svn: 28921	2006-06-27 00:04:13 +00:00
Chris Lattner	cbd4d14b24	Improve PPC64 calling convention support llvm-svn: 28919	2006-06-26 22:48:35 +00:00
Chris Lattner	5fa6e47534	Correct returns of 64-bit values, though they seemed to work before... llvm-svn: 28892	2006-06-21 00:34:03 +00:00
Chris Lattner	81845946ff	fix some assumptions that pointers can only be 32-bits. With this, we can now compile: static unsigned long X; void test1() { X = 0; } into: _test1: lis r2, ha16(_X) li r3, 0 stw r3, lo16(_X)(r2) blr Totally amazing :) llvm-svn: 28839	2006-06-16 21:01:35 +00:00
Chris Lattner	fa884ac11b	Rename some subtarget features. A CPU now can have 64-bit instructions, and in 32-bit mode we can choose to optionally use 64-bit registers. llvm-svn: 28824	2006-06-16 17:34:12 +00:00
Evan Cheng	32feafd76c	Type of extract_element index operand should be iPTR. llvm-svn: 28797	2006-06-15 08:18:06 +00:00
Chris Lattner	b231c3d11c	Fix a problem exposed by the local allocator. CALL instructions are not marked as using incoming argument registers, so the local allocator would clobber them between their set and use. To fix this, we give the call instructions a variable number of uses in the CALL MachineInstr itself, so live variables understands the live ranges of these register arguments. llvm-svn: 28744	2006-06-10 01:14:28 +00:00
Chris Lattner	31b150e334	Always reserve space for 8 spilled GPRs. GCC apparently assumes that this space will be available, even if the callee isn't varargs. llvm-svn: 28571	2006-05-30 21:21:04 +00:00
Evan Cheng	de0f25081a	Change RET node to include signedness information of the return values. i.e. RET chain, value1, sign1, value2, sign2, ... llvm-svn: 28510	2006-05-26 23:10:12 +00:00
Evan Cheng	4a74dd0c51	CALL node change (arg / sign pairs instead of just arguments). llvm-svn: 28462	2006-05-25 00:57:32 +00:00
Chris Lattner	f604017e47	Patches to make the LLVM sources more -pedantic clean. Patch provided by Anton Korobeynikov! This is a step towards closing PR786. llvm-svn: 28447	2006-05-24 17:04:05 +00:00
Chris Lattner	bc3be2ff8a	Fix CodeGen/Generic/vector.ll:test_div with altivec. llvm-svn: 28445	2006-05-24 00:15:25 +00:00
Chris Lattner	56862bbd53	Handle SETO* like we handle SET*, restoring behavior after Evan's setcc change. This fixes PowerPC/fnegsel.ll. llvm-svn: 28443	2006-05-24 00:06:44 +00:00
Chris Lattner	2208c3214c	Make PPC call lowering more aggressive, making the isel matching code simple enough to be autogenerated. llvm-svn: 28354	2006-05-17 19:00:46 +00:00
Chris Lattner	03c70b7f27	Switch PPC over to a call-selection model where the lowering code creates the copyto/fromregs instead of making the PPCISD::CALL selection code create them. This vastly simplifies the selection code, and moves the ABI handling parts into one place. llvm-svn: 28346	2006-05-17 06:01:33 +00:00
Chris Lattner	348883611c	3 changes, 2 of which are cleanup one of which changes codegen: 1. Rearrange code a bit so that the special case doesn't require indenting lots of code. 2. Add comments describing PPC calling convention. 3. Only round up to 56-bytes of stack space for an outgoing call if the callee is varargs. This saves a bit of stack space. llvm-svn: 28342	2006-05-17 00:15:40 +00:00
Chris Lattner	a36579803f	implement passing/returning vector regs to calls, at least non-varargs calls. llvm-svn: 28341	2006-05-16 23:54:25 +00:00
Chris Lattner	b5271a0f4c	Instead of implementing LowerCallTo directly, let the default impl produce an ISD::CALL node, then custom lower that. This means that we only have to handle LEGAL call operands/results, not every possible type. This allows us to simplify the call code, shrinking it by about 1/3. llvm-svn: 28339	2006-05-16 22:56:08 +00:00
Chris Lattner	40d1eaad0a	Simplify the argument counting logic by only incrementing the index. llvm-svn: 28335	2006-05-16 18:58:15 +00:00
Chris Lattner	0ae068ed8f	Simplify the dead argument handling code. llvm-svn: 28334	2006-05-16 18:54:32 +00:00
Chris Lattner	fbbe542235	Vector args passed in registers don't reserve stack space. llvm-svn: 28333	2006-05-16 18:51:52 +00:00
Chris Lattner	0a12e343e2	Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument handling. This makes the lower argument code significantly simpler (we only need to handle legal argument types). Incidentally, this also implements support for vector argument registers, so long as they are not on the stack. llvm-svn: 28331	2006-05-16 18:18:50 +00:00
Chris Lattner	199f3f6af8	Fit in 80 cols llvm-svn: 28311	2006-05-16 04:20:24 +00:00
Chris Lattner	adcb0582d8	Remove dead var, fix bad override. llvm-svn: 28264	2006-05-12 21:09:57 +00:00
Chris Lattner	e3de67fae2	Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll llvm-svn: 28017	2006-04-28 21:56:10 +00:00
Nate Begeman	7ed816f900	JumpTable support! What this represents is working asm and jit support for x86 and ppc for 100% dense switch statements when relocations are non-PIC. This support will be extended and enhanced in the coming days to support PIC, and less dense forms of jump tables. llvm-svn: 27947	2006-04-22 18:53:45 +00:00
Chris Lattner	47a41ae889	Fix a crash on: void foo2(vector float A, vector float B) { vector float C = (vector float)vec_cmpeq(A, B); if (!vec_any_eq(A, B)) B = (vector float){0,0,0,0}; A = C; } llvm-svn: 27808	2006-04-18 18:28:22 +00:00
Chris Lattner	2bd91746e1	pretty print node name llvm-svn: 27806	2006-04-18 18:05:58 +00:00
Chris Lattner	44ea12c5f8	Implement an important entry from README_ALTIVEC: If an altivec predicate compare is used immediately by a branch, don't use a (serializing) MFCR instruction to read the CR6 register, which requires a compare to get it back to CR's. Instead, just branch on CR6 directly. :) For example, for: void foo2(vector float A, vector float B) { if (!vec_any_eq(A, B)) *B = (vector float){0,0,0,0}; } We now generate: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 bne cr6, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr instead of: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 cmpwi cr0, r3, 0 beq cr0, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr This implements CodeGen/PowerPC/vec_br_cmp.ll. llvm-svn: 27804	2006-04-18 17:59:36 +00:00
Chris Lattner	e90fdf3b98	Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing even/odd halves. Thanks to Nate telling me what's what. llvm-svn: 27793	2006-04-18 04:28:57 +00:00
Chris Lattner	5951b60cb4	Implement v16i8 multiply with this code: vmuloub v5, v3, v2 vmuleub v2, v3, v2 vperm v2, v2, v5, v4 This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are 6.79x faster than before. Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with GCC. Remove the 'integer multiplies' todo from the README file. llvm-svn: 27792	2006-04-18 03:57:35 +00:00
Chris Lattner	4d84b56e64	Lower v8i16 multiply into this code: li r5, lo16(LCPI1_0) lis r6, ha16(LCPI1_0) lvx v4, r6, r5 vmulouh v5, v3, v2 vmuleuh v2, v3, v2 vperm v2, v2, v5, v4 where v4 is: LCPI1_0: ; <16 x ubyte> .byte 2 .byte 3 .byte 18 .byte 19 .byte 6 .byte 7 .byte 22 .byte 23 .byte 10 .byte 11 .byte 26 .byte 27 .byte 14 .byte 15 .byte 30 .byte 31 This is 5.07x faster on the G5 (measured) than lowering to scalar code + loads/stores. llvm-svn: 27789	2006-04-18 03:43:48 +00:00
Chris Lattner	613d7fda64	Custom lower v4i32 multiplies into a cute sequence, instead of having legalize scalarize the sequence into 4 mullw's and a bunch of load/store traffic. This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements PowerPC/vec_mul.ll llvm-svn: 27788	2006-04-18 03:24:30 +00:00
Chris Lattner	f2347c31b4	Make sure to check splats of every constant we can, handle splat(31) by being a bit more clever, add support for odd splats from -31 to -17. llvm-svn: 27764	2006-04-17 18:09:22 +00:00
Chris Lattner	cc4222d95b	Teach the ppc backend to use rol and vsldoi to generate splatted constants. This implements vec_constants.ll:test_vsldoi and test_rol llvm-svn: 27760	2006-04-17 17:55:10 +00:00
Chris Lattner	2d8d6c9feb	Make some code more general, adding support for constant formation of several new patterns. llvm-svn: 27754	2006-04-17 06:58:41 +00:00
Chris Lattner	9dd4ebffca	Learn how to make odd splatted constants in range [17,29]. This implements PowerPC/vec_constants.ll:test_29. llvm-svn: 27752	2006-04-17 06:07:44 +00:00
Chris Lattner	72a67a5b1f	Pull some code out into a helper function. Efficiently codegen even splats in the range [-32,30]. This allows us to codegen <30,30,30,30> as: vspltisw v0, 15 vadduwm v2, v0, v0 instead of as a cp load. llvm-svn: 27750	2006-04-17 06:00:21 +00:00
Chris Lattner	5367a73dec	Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle, if it can be implemented in 3 or fewer discrete altivec instructions, codegen it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll llvm-svn: 27748	2006-04-17 05:28:54 +00:00
Chris Lattner	d86516991a	Implement a TODO: have the legalizer canonicalize a bunch of operations to one type (v4i32) so that we don't have to write patterns for each type, and so that more CSE opportunities are exposed. llvm-svn: 27731	2006-04-16 01:37:57 +00:00
Chris Lattner	f4126f0db7	Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors. Remove some done items from the todo list. llvm-svn: 27729	2006-04-16 01:01:29 +00:00
Chris Lattner	44245f11c3	Fix a crash when faced with a shuffle vector that has an undef in its mask. llvm-svn: 27726	2006-04-15 23:48:05 +00:00
Chris Lattner	5c9d357d7c	Allow undef in a shuffle mask llvm-svn: 27714	2006-04-14 23:19:08 +00:00
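
The v8i16 multiply lowering in commit 4d84b56e64 combines the even and odd halfword products with a `vperm` whose control vector is the `LCPI1_0` byte list quoted in that message. The following is a minimal Python sketch of why that mask works, not the backend's code. It assumes AltiVec's big-endian element numbering, models `vmuleuh`/`vmulouh` as 32-bit products of the even/odd halfword elements, and models `vperm` as a byte select from the 32-byte concatenation of its two inputs. All function names are hypothetical.

```python
import struct

# Permute control vector quoted in the commit message (LCPI1_0).
MASK = [2, 3, 18, 19, 6, 7, 22, 23, 10, 11, 26, 27, 14, 15, 30, 31]


def mul_even_u16(a, b):
    """Model of vmuleuh: 32-bit products of halfword elements 0, 2, 4, 6."""
    return [a[i] * b[i] for i in range(0, 8, 2)]


def mul_odd_u16(a, b):
    """Model of vmulouh: 32-bit products of halfword elements 1, 3, 5, 7."""
    return [a[i] * b[i] for i in range(1, 8, 2)]


def words_to_bytes(words):
    """Lay out four 32-bit words as 16 bytes in big-endian order."""
    return struct.pack(">4I", *words)


def vperm(x, y, mask):
    """Model of vperm: result byte i is byte (mask[i] & 0x1F) of x followed by y."""
    src = x + y
    return bytes(src[m & 0x1F] for m in mask)


def mul_v8i16(a, b):
    """Multiply two vectors of eight unsigned 16-bit values, keeping the low 16 bits."""
    even = words_to_bytes(mul_even_u16(a, b))
    odd = words_to_bytes(mul_odd_u16(a, b))
    # Mask bytes 2,3 pick the low half of even product 0; bytes 18,19 pick the
    # low half of odd product 0; and so on, interleaving the results in order.
    return list(struct.unpack(">8H", vperm(even, odd, MASK)))
```

Under these assumptions, `mul_v8i16([3] * 8, [5] * 8)` yields eight copies of 15, and in general each lane equals the product truncated to 16 bits.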


550 Commits