3489 Commits

Bruno Cardoso Lopes
bb491bd56c Begin to support some vector operations for AVX 256-bit instructions. The long
term goal here is to be able to match enough of vector_shuffle and build_vector
so that all AVX intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step: support
building zeroed vectors.

llvm-svn: 110897
2010-08-12 02:06:36 +00:00
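
A minimal IR sketch of the kind of test this enables (function name hypothetical): a zeroed 256-bit build_vector should now be selectable as a single AVX zeroing idiom rather than being scalarized.

define <4 x i64> @zero256() nounwind {
entry:
  ret <4 x i64> zeroinitializer
}
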
Devang Patel
66fc7d88ae This is an x86-only test.
llvm-svn: 110887
2010-08-12 00:17:38 +00:00
Bruno Cardoso Lopes
2051068483 Add testcases for all AVX 256-bit intrinsics added in the last couple days
llvm-svn: 110854
2010-08-11 21:12:09 +00:00
Bruno Cardoso Lopes
fa19084e79 Reapply r109881 using a more strict command line for llc.
llvm-svn: 110833
2010-08-11 17:39:23 +00:00
Jim Grosbach
401709255b fix silly typo
llvm-svn: 110831
2010-08-11 17:32:46 +00:00
Jim Grosbach
be1b6086b3 Add a target triple, as the runtime library invocation varies a bit by
platform. It's apparently "bl __muldf3" on Linux, for example. Since that's
not what we're checking here, it's more robust to just force a triple. We
just want to check that the inline FP instructions are only generated
on CPUs that have them.

llvm-svn: 110830
2010-08-11 17:31:12 +00:00
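
A sketch of the pattern in question, assuming a Darwin triple and VFP2 so the FP multiply stays inline (RUN line and CHECK pattern are illustrative, not the actual test):

; RUN: llc < %s -mtriple=arm-apple-darwin -mattr=+vfp2 | FileCheck %s
; CHECK-NOT: __muldf3
define double @mul(double %a, double %b) nounwind {
entry:
  %r = fmul double %a, %b
  ret double %r
}
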
Evan Cheng
c25cd5a82e Fix test and re-enable it.
llvm-svn: 110829
2010-08-11 17:25:51 +00:00
Dan Gohman
afbb9c2f2e Temporarily disable some failing tests, until they can be
properly investigated.

llvm-svn: 110825
2010-08-11 16:36:07 +00:00
Jim Grosbach
1128a47289 Cortex-M4 has floating point support, but only single precision.
llvm-svn: 110810
2010-08-11 15:44:15 +00:00
Dan Gohman
7b88985ebd Temporarily disable some failing tests, until they can be
properly investigated.

llvm-svn: 110808
2010-08-11 15:09:00 +00:00
Bill Wendling
f10d5c00fc Consider this code snippet:
float t1(int argc) {
  return (argc == 1123) ? 1.234f : 2.38213f;
}

We would generate truly awful code on ARM (those with a weak stomach should look
away):

_t1:
  movw   r1, #1123
  movs   r2, #1
  movs   r3, #0
  cmp    r0, r1
  mov.w  r0, #0
  it     eq
  moveq  r0, r2
  movs   r1, #4
  cmp    r0, #0
  it     ne
  movne  r3, r1
  adr    r0, #LCPI1_0
  ldr    r0, [r0, r3]
  bx     lr

The problem was that legalization was creating a cascade of SELECT_CC nodes for
the comparison of "argc == 1123", which was fed into a SELECT node for the ?:
statement, which was itself converted to a SELECT_CC node. This is because the
ARM back-end doesn't have custom lowering for SELECT nodes, so it used the
default "Expand".

I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this
testcase, but can obviously be expanded to include more cases.

Now we generate this, which looks optimal to me:

_t1:
  movw   r1, #1123
  movs   r2, #0
  cmp    r0, r1
  adr    r0, #LCPI0_0
  it     eq
  moveq  r2, #4
  ldr    r0, [r0, r2]
  bx     lr
  .align  2
LCPI0_0:
  .long   1075344593  @ float 2.382130e+00
  .long   1067316150  @ float 1.234000e+00

llvm-svn: 110799
2010-08-11 08:43:16 +00:00
Evan Cheng
f8604b772e Report an error if codegen tries to instantiate an ARM target when the CPU doesn't support it, e.g. Cortex-M* processors.
llvm-svn: 110798
2010-08-11 07:17:46 +00:00
Evan Cheng
273160895e Add ARM Archv6M and let it imply FeatureDB (having dmb, etc.).
llvm-svn: 110795
2010-08-11 06:51:54 +00:00
Evan Cheng
e5bab36c75 Add Cortex-M0 support. It's an ARMv6m device (no ARM mode) with some 32-bit
instructions: dmb, dsb, isb, msr, and mrs.

llvm-svn: 110786
2010-08-11 06:30:38 +00:00
Evan Cheng
5fca4ca5f9 - Add subtarget feature -mattr=+db which determines whether an ARM CPU has the
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
  instructions).
- Added tests for memory barrier codegen.

llvm-svn: 110785
2010-08-11 06:22:01 +00:00
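
A hedged sketch of what such a memory-barrier codegen test could look like, using the llvm.memory.barrier intrinsic of this era (the RUN and CHECK lines are illustrative, not copied from the actual tests):

; RUN: llc < %s -march=arm -mattr=+db | FileCheck %s
; CHECK: dmb
declare void @llvm.memory.barrier(i1, i1, i1, i1, i1) nounwind

define void @full_barrier() nounwind {
entry:
  ; all four ordering bits plus the device bit set: a full barrier
  call void @llvm.memory.barrier(i1 true, i1 true, i1 true, i1 true, i1 true)
  ret void
}
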
Bill Wendling
8c5ac0d30c Update test to match output of optimize compares for ARM.
llvm-svn: 110765
2010-08-11 01:05:02 +00:00
Bill Wendling
37ac7cfa7d The optimize comparisons pass removes the "cmp" instruction this is checking for.
llvm-svn: 110739
2010-08-10 22:16:05 +00:00
Evan Cheng
d9a1b0d046 Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object.
Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions.

llvm-svn: 110707
2010-08-10 19:30:19 +00:00
Daniel Dunbar
872e84afb5 Revert r110655, "Fix ARM hasFP() semantics. It should return true whenever FP
register is", as it breaks a couple of test-suite tests.

llvm-svn: 110701
2010-08-10 18:32:02 +00:00
Jakob Stoklund Olesen
99402e857d Fix test for more architectures. Patch by Tobias Grosser.
llvm-svn: 110685
2010-08-10 16:48:24 +00:00
Tobias Grosser
766f219db9 Fix failing testcase.
Those look like typos to me.

llvm-svn: 110664
2010-08-10 09:54:29 +00:00
Devang Patel
84f48b5483 Handle TAG_constant for integers.
llvm-svn: 110656
2010-08-10 07:11:13 +00:00
Evan Cheng
3d47dbe761 Fix ARM hasFP() semantics. It should return true whenever the FP register is
reserved, not available for general allocation. This eliminates all the
extra checks for Darwin.

This change also fixes the use of FP to access frame indices in leaf
functions and cleans up some confusing code in epilogue emission.

llvm-svn: 110655
2010-08-10 06:26:49 +00:00
Kalle Raiskila
e2c0e66ff1 Have SPU handle halfvec stores aligned by 8 bytes.
llvm-svn: 110576
2010-08-09 16:33:00 +00:00
Dale Johannesen
23f9086dd3 Use sdmem and sse_load_f64 (etc.) for the vector
form of CMPSD (etc.). Matching a 128-bit memory
operand is wrong; the instruction uses only 64 bits
(same as ADDSD etc.). 8193553.

llvm-svn: 110491
2010-08-07 00:33:42 +00:00
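
For illustration, a sketch in the IR style of the period (names hypothetical) where cmpsd's memory operand supplies only the low 64 bits, matching sse_load_f64 rather than a full 128-bit load:

define <2 x double> @cmp_mem(<2 x double> %a, double* %p) nounwind {
entry:
  ; only 64 bits are read from memory, same as ADDSD would read
  %v = load double* %p
  %i = insertelement <2 x double> undef, double %v, i32 0
  %r = call <2 x double> @llvm.x86.sse2.cmp.sd(<2 x double> %a, <2 x double> %i, i8 0)
  ret <2 x double> %r
}
declare <2 x double> @llvm.x86.sse2.cmp.sd(<2 x double>, <2 x double>, i8) nounwind readnone
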
Rafael Espindola
6d53fded19 Fix the EABI calling convention when a 64-bit value shadows r3.
Without this what was happening was:

* R3 is not marked as "used"
* ARM backend thinks it has to save it to the stack because of vaarg
* Offset computation correctly ignores it
* Offsets are wrong

llvm-svn: 110446
2010-08-06 15:35:32 +00:00
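
A reduced sketch of the failing shape, in the IR style of the period (function name hypothetical): under the EABI the i64 argument is passed in r2:r3, so r3 must not also be counted into the varargs register-save area.

declare void @llvm.va_start(i8*)
declare void @llvm.va_end(i8*)

define void @f(i32 %a, i64 %b, ...) nounwind {
entry:
  %ap = alloca i8*
  %ap1 = bitcast i8** %ap to i8*
  ; %b already occupies r2:r3; saving r3 again skews the offsets
  call void @llvm.va_start(i8* %ap1)
  call void @llvm.va_end(i8* %ap1)
  ret void
}
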
Eric Christopher
cf17d8dfa7 Add an option to always emit realignment code for a particular module.
llvm-svn: 110404
2010-08-05 23:57:43 +00:00
Devang Patel
9801232716 Move x86 specific tests into test/CodeGen/X86.
llvm-svn: 110372
2010-08-05 20:25:37 +00:00
Dan Gohman
d108d2b2f8 Move x86-specific tests out of test/Transforms/LoopStrengthReduce and
into test/CodeGen/X86, so that they aren't run when the x86 target is
not enabled.

Fix uglygep.ll to not be x86-specific.

llvm-svn: 110343
2010-08-05 17:04:15 +00:00
Daniel Dunbar
c93cd33f41 tests: CodeGen/X86/GC tests require X86.
llvm-svn: 110338
2010-08-05 15:45:33 +00:00
Bill Wendling
446a54d234 The lower invoke pass needs to have unreachable code elimination run after it
because it can create unreachable code. This fixes a MinGW buildbot test failure.

llvm-svn: 110279
2010-08-04 23:36:02 +00:00
Eli Friedman
401dbe036d PR7814: Truncates cannot be ignored for signed comparisons.
llvm-svn: 110268
2010-08-04 22:40:58 +00:00
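
A small IR example (hypothetical names) of why the truncate matters: for %x = 128, the truncated value is -128, so the signed compare is true, while the untruncated i16 compare would be false.

define i1 @trunc_slt(i16 %x) nounwind {
entry:
  %t = trunc i16 %x to i8
  ; with %x = 128: %t = -128, so the result is true;
  ; dropping the trunc would compare 128 < 0 and yield false
  %c = icmp slt i8 %t, 0
  ret i1 %c
}
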
Bill Wendling
46dea7e086 Testcase for r110248.
llvm-svn: 110249
2010-08-04 21:56:30 +00:00
Stuart Hastings
003c3778ff call-imm.ll test case regex fix. Patch by Dimitry Andric!
llvm-svn: 110199
2010-08-04 15:31:35 +00:00
Kalle Raiskila
ce1e4d80cb Make the SPU backend handle insertelement and
store for "half vectors".

llvm-svn: 110198
2010-08-04 13:59:48 +00:00
Bob Wilson
6a2437480a Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA
(absolute difference with accumulate) intrinsics.  Radar 8228576.

llvm-svn: 110170
2010-08-04 00:12:08 +00:00
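
A sketch of the pattern being combined, using the signed-vabd NEON intrinsic (function name hypothetical): a vabd call feeding an add should now fold into a single vaba.

declare <8 x i8> @llvm.arm.neon.vabds.v8i8(<8 x i8>, <8 x i8>) nounwind readnone

define <8 x i8> @vaba_fold(<8 x i8> %acc, <8 x i8> %a, <8 x i8> %b) nounwind {
entry:
  %d = call <8 x i8> @llvm.arm.neon.vabds.v8i8(<8 x i8> %a, <8 x i8> %b)
  ; vabd followed by add should now become a single vaba.s8
  %r = add <8 x i8> %d, %acc
  ret <8 x i8> %r
}
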
Jakob Stoklund Olesen
058d1fc5bd OK, that's it. This test is going away now. But don't worry, I am taking it to a
nice farm in the country where it can play with other tests. And bunnies.

It is not clear what is being tested, and the revision history shows a bunch of
random changes to the expected instruction count. Clearly, we are just fudging
it to pass whenever it fails.

llvm-svn: 110118
2010-08-03 17:21:14 +00:00
Kalle Raiskila
014c93befb More SPU v2f32 stuff added: insertelement and shuffle.
llvm-svn: 110038
2010-08-02 11:22:10 +00:00
Kalle Raiskila
766fd434df Add preliminary v2f32 support for SPU. Like with v2i32, we just
duplicate the instructions and operate on half vectors. 

Also reorder code in SPUInstrInfo.td for better coherency.

llvm-svn: 110037
2010-08-02 10:25:47 +00:00
Kalle Raiskila
21615cb06e Add preliminary v2i32 support for SPU backend. As there are no
such registers in SPU, this support boils down to "emulating" 
them by duplicating instructions on the general purpose registers. 

This adds the most basic operations on v2i32: passing parameters,
addition, subtraction, multiplication and a few others.

llvm-svn: 110035
2010-08-02 08:54:39 +00:00
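
A minimal example of the sort of operation this enables (function name hypothetical); the backend lowers it by duplicating the scalar instruction over both halves in general purpose registers:

define <2 x i32> @add_v2i32(<2 x i32> %a, <2 x i32> %b) nounwind {
entry:
  %r = add <2 x i32> %a, %b
  ret <2 x i32> %r
}
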
Eli Friedman
40cb7d9994 PR7781: Fix incorrect shifting in PPCTargetLowering::LowerBUILD_VECTOR.
llvm-svn: 109998
2010-08-02 00:18:19 +00:00
Eli Friedman
3c5289c381 PR7774: Fix undefined shifts in Alpha backend. As a bonus, this actually
improves the generated code in some cases.

llvm-svn: 109985
2010-08-01 21:13:28 +00:00
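
The IR-level analogue of the hazard, for illustration only (the actual bug was an undefined shift in the backend's own C++ code): a shift amount equal to or exceeding the bit width gives an undefined result.

define i64 @oversized_shift(i64 %x) nounwind {
entry:
  ; shift amount >= bit width: the result is undefined, in IR as in C++
  %r = shl i64 %x, 64
  ret i64 %r
}
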
Bob Wilson
43273fe746 Revert new AVX intrinsic tests. They are breaking buildbots and Bruno is
away from a computer now.
--- Reverse-merging r109881 into '.':
D    test/CodeGen/X86/avx-intrinsics-x86.ll
D    test/CodeGen/X86/avx-intrinsics-x86_64.ll

llvm-svn: 109959
2010-07-31 22:36:03 +00:00
Bruno Cardoso Lopes
f6ed26ef55 A *bunch* of tests for AVX intrinsics
llvm-svn: 109881
2010-07-30 19:57:56 +00:00
Eli Friedman
bea7c851cf Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.

llvm-svn: 109854
2010-07-30 06:44:31 +00:00
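
A sketch of the comparison being optimized (hypothetical names): the shl/ashr pair is a sign_extend_inreg from i8, so the result lies in [-128, 127] and a compare against an out-of-range constant folds away, provided the constant's range is checked correctly.

define i1 @sext_inreg_cmp(i32 %x) nounwind {
entry:
  %s = shl i32 %x, 24
  %t = ashr i32 %s, 24          ; sign_extend_inreg i32 %x from i8
  ; %t is always in [-128, 127], so this compare is constant false
  %c = icmp sgt i32 %t, 200
  ret i1 %c
}
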
Jim Grosbach
1718345a30 Many Thumb2 instructions can reference the full ARM register set (i.e.,
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0,r9,r0,sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499

llvm-svn: 109842
2010-07-30 02:41:01 +00:00
Dale Johannesen
717fbb2b32 Implement vector constants which are splats of
integers with mov + vdup.  8003375.  This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).

llvm-svn: 109799
2010-07-29 20:10:08 +00:00
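
An illustrative splat (hypothetical): 0x00010001 is not encodable as a VMOV immediate, so it can instead be materialized with a core-register mov followed by a vdup.32.

define <4 x i32> @splat_65537() nounwind {
entry:
  ; materialize 65537 in a core register (movw/movt), then vdup.32
  ret <4 x i32> <i32 65537, i32 65537, i32 65537, i32 65537>
}
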
Nate Begeman
133820e806 Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Nate Begeman
068e932975 ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
Nate Begeman
15fe179ecb Fix a crash in the dag combiner caused by ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself
recursively: the recursive call can return a SCALAR_TO_VECTOR node, but the caller assumed the input was always a BUILD_VECTOR.

llvm-svn: 109519
2010-07-27 18:02:18 +00:00