Commit Graph

4714 Commits

Author SHA1 Message Date
Duncan Sands
47bcca5cea This would be better done as an executable test.
llvm-svn: 46493
2008-01-29 06:04:54 +00:00
Duncan Sands
84bc852b52 After recent changes we fail to optimize this test
sufficiently to have it pass.  I'm removing it from
the testsuite and adding it to PR452 instead.

llvm-svn: 46492
2008-01-29 05:57:23 +00:00
Devang Patel
86ff705c22 Filter loops that subtract induction variables.
These loops are not yet handled.

Fix PR 1912.

llvm-svn: 46484
2008-01-29 02:20:41 +00:00
Scott Michel
dc780aeb57 Overhaul Cell SPU's addressing mode internals so that there are now
only two addressing mode nodes, SPUaform and SPUindirect (vice the
three previous ones, SPUaform, SPUdform and SPUxform). This improves
code somewhat because we now avoid using reg+reg addressing when
it can be avoided. It also simplifies the address selection logic,
which was the main point for doing this.

Also, for various global variables that would be loaded using SPU's
A-form addressing, prefer D-form offs[reg] addressing, keeping the
base in a register if the variable is used more than once.

llvm-svn: 46483
2008-01-29 02:16:57 +00:00
Devang Patel
51fde22367 New test.
llvm-svn: 46479
2008-01-29 01:10:04 +00:00
Bill Wendling
839e21bce4 Add test to make sure that #pragma mark/error doesn't error if there are
unbalanced quotes.

llvm-svn: 46476
2008-01-29 00:41:29 +00:00
Duncan Sands
93f785a638 Pure/const functions with ByVal parameters cannot
be marked readonly either.

llvm-svn: 46456
2008-01-28 19:25:47 +00:00
Chris Lattner
20854cf4e7 this test is now compiled into the right thing.
llvm-svn: 46454
2008-01-28 17:38:46 +00:00
Duncan Sands
ecab334ce0 Make this more likely to be passed byval.
llvm-svn: 46451
2008-01-28 10:35:11 +00:00
Nick Lewycky
6b070b1b93 Handle some more combinations of extend and icmp. Fixes PR1940.
llvm-svn: 46431
2008-01-28 03:48:02 +00:00
Chris Lattner
359756ea4b Fix PR1932 by disabling an xform invalid for fdiv.
llvm-svn: 46429
2008-01-28 00:58:18 +00:00
Chris Lattner
7250586ec9 Fix PR1938 by forcing the code that uses an undefined value to branch one
way or the other.  Rewriting the code itself prevents subsequent analysis
passes from making contradictory conclusions about the code that could 
cause an infeasible path to be made feasible.

llvm-svn: 46427
2008-01-28 00:32:30 +00:00
Chris Lattner
26a8116f49 Update this test. Due to dag combiner improvements, we now compile
f7/f11 to:

_f7:
	eor r0, r0, #2, 2 @ -2147483648
	bx lr
_f11:
	bic r0, r0, #2, 2 @ -2147483648
	bx lr

instead of:

_f7:
	fmsr s0, r0
	fnegs s0, s0
	fmrs r0, s0
	bx lr

_f11:
	fmsr s0, r0
	fabss s0, s0
	fmrs r0, s0
	bx lr

llvm-svn: 46423
2008-01-27 23:26:37 +00:00
Nick Lewycky
cd28ef8950 Be more careful modifying the use_list while also iterating through it.
llvm-svn: 46417
2008-01-27 18:35:00 +00:00
Duncan Sands
e77256b325 Revert r46393: readonly/readnone functions are no
longer allowed to write through byval arguments.

llvm-svn: 46416
2008-01-27 18:12:58 +00:00
Chris Lattner
2ab1fd3824 Implement some dag combines that allow doing fneg/fabs/fcopysign in integer
registers if used by a bitconvert or using a bitconvert.  This allows us to
avoid constant pool loads and use cheaper integer instructions when the
values come from or end up in integer regs anyway.  For example, we now 
compile CodeGen/X86/fp-in-intregs.ll to:

_test1:
	movl	$2147483648, %eax
	xorl	4(%esp), %eax
	ret
_test2:
	movl	$1065353216, %eax
	orl	4(%esp), %eax
	andl	$3212836864, %eax
	ret

Instead of:
_test1:
	movss	4(%esp), %xmm0
	xorps	LCPI2_0, %xmm0
	movd	%xmm0, %eax
	ret
_test2:
	movss	4(%esp), %xmm0
	andps	LCPI3_0, %xmm0
	movss	LCPI3_1, %xmm1
	andps	LCPI3_2, %xmm1
	orps	%xmm0, %xmm1
	movd	%xmm1, %eax
	ret

bitconverts can happen due to various calling conventions that require
fp values to passed in integer regs in some cases, e.g. when returning
a complex.

llvm-svn: 46414
2008-01-27 17:42:27 +00:00
Bill Wendling
0e2b8c2c45 The CorrelatedExpressions pass is now no more.
llvm-svn: 46409
2008-01-27 06:13:32 +00:00
Chris Lattner
aa553aa0c1 Fold fptrunc(add (fpextend x), (fpextend y)) -> add(x,y), as GCC does.
llvm-svn: 46406
2008-01-27 05:29:54 +00:00
Chris Lattner
e66aea6532 New test to verify that "merging 4 loads into a vec load" continues to work and
continues to infer alignment info.

llvm-svn: 46403
2008-01-26 20:06:45 +00:00
Chris Lattner
682346a7b0 Infer alignment of loads and increase their alignment when we can tell they are
from the stack.  This allows us to compile stack-align.ll to:

_test:
	movsd	LCPI1_0, %xmm0
	movapd	%xmm0, %xmm1
***	andpd	4(%esp), %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

instead of:

_test:
	movsd	LCPI1_0, %xmm0
**	movsd	4(%esp), %xmm1
**	andpd	%xmm0, %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

llvm-svn: 46401
2008-01-26 19:45:50 +00:00
Chris Lattner
f0c3240135 remove a useless xfailed test.
llvm-svn: 46400
2008-01-26 19:35:46 +00:00
Duncan Sands
9fae964ef7 Invert this test, because it is wrong if we allow
readonly functions to use byval parameters as local
storage (how much do we want this?).

llvm-svn: 46399
2008-01-26 12:33:01 +00:00
Bill Wendling
7b83688c73 If there's no instructions being emitted on X86 for a function, emit a
nop. Emit the nop directly for PPC.

llvm-svn: 46398
2008-01-26 09:03:52 +00:00
Bill Wendling
26fb9335f5 Need to convert to LLVM code and not C.
llvm-svn: 46397
2008-01-26 06:56:08 +00:00
Bill Wendling
3e622b88b6 Rename the .c to .ll
llvm-svn: 46396
2008-01-26 06:53:40 +00:00
Bill Wendling
7151e8d92c Move testcase to the code gen directory.
llvm-svn: 46395
2008-01-26 06:53:06 +00:00
Duncan Sands
792234c366 Create an explicit copy for byval parameters even
when inlining a readonly function.

llvm-svn: 46393
2008-01-26 06:41:49 +00:00
Bill Wendling
1e56a2ffb6 If we have a function like this:
void bork() {
  int *address = 0;
  *address = 0;
}

It's compiled into LLVM code that looks like this:

define void @bork() noreturn nounwind  {
entry:
        unreachable
}

This is bad on some platforms (like PPC) because it will generate the label for
the function but no body. The label could end up being associated with some
non-code related stuff, like a section. This places a "trap" instruction if the
SimplifyCFG pass removed all code from the function leaving only one
"unreachable" instruction.

llvm-svn: 46387
2008-01-26 01:43:44 +00:00
Devang Patel
c40820e322 Add another testcase.
llvm-svn: 46385
2008-01-26 01:21:48 +00:00
Chris Lattner
53a98f46fd Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to
delete a node even if it was not dead in some cases.  Instead, just add it to
the worklist.  Also, make sure to use the CombineTo methods, as it was doing
things that were unsafe: the top level combine loop could touch dangling memory.

This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll

llvm-svn: 46384
2008-01-26 01:09:19 +00:00
Evan Cheng
e62e9a8d96 New test case.
llvm-svn: 46382
2008-01-26 00:35:43 +00:00
Chris Lattner
c2df169459 add a testcase for a bug Duncan pointed out.
llvm-svn: 46372
2008-01-25 22:36:24 +00:00
Duncan Sands
ced29554f7 Test for PR1942.
llvm-svn: 46357
2008-01-25 17:36:44 +00:00
Owen Anderson
a4ff15c69f DeadStoreElimination can treat byval parameters as if there were alloca's for the purpose of removing end-of-function stores.
llvm-svn: 46351
2008-01-25 10:10:33 +00:00
Chris Lattner
79076fdf2a Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows
us to compile:

double test(double X) {
  return copysign(0.0, X);
}

into:

_test:
	andpd	LCPI1_0(%rip), %xmm0
	ret

instead of:
_test:
	pxor	%xmm1, %xmm1
	andpd	LCPI1_0(%rip), %xmm1
	movapd	%xmm0, %xmm2
	andpd	LCPI1_1(%rip), %xmm2
	movapd	%xmm1, %xmm0
	orpd	%xmm2, %xmm0
	ret

llvm-svn: 46344
2008-01-25 05:46:26 +00:00
Devang Patel
587591ba1b New test.
llvm-svn: 46333
2008-01-24 23:55:34 +00:00
Chris Lattner
cd5013eb2f Teach basicaa that 'byval' arguments define a new memory location that
can't be aliased to other known objects.  This allows us to know that byval 
pointer args don't alias globals, etc.

llvm-svn: 46315
2008-01-24 18:00:32 +00:00
Chris Lattner
16a8f126d3 Significantly simplify and improve handling of FP function results on x86-32.
This case returns the value in ST(0) and then has to convert it to an SSE
register.  This causes significant codegen ugliness in some cases.  For 
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:

_bar:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.

Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always 
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.

This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case.  This gives 
us this code for the example above:

_bar:
	subl	$12, %esp
	call	L_foo$stub
	addl	$12, %esp
	ret

The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert).  This is gross, but
less gross than the code it is replacing :)

This also allows us to generate better code in several other cases.  For 
example on fp-stack-ret-conv.ll, we now generate:

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstps	8(%esp)
	movl	16(%esp), %eax
	cvtss2sd	8(%esp), %xmm0
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

where before we produced (incidentally, the old bad code is identical to what
gcc produces):

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	cvtsd2ss	(%esp), %xmm0
	cvtss2sd	%xmm0, %xmm0
	movl	16(%esp), %eax
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

Note that we generate slightly worse code on pr1505b.ll due to a scheduling 
deficiency that is unrelated to this patch.

llvm-svn: 46307
2008-01-24 08:07:48 +00:00
Chris Lattner
214c11ee6f take these with a pr #
llvm-svn: 46303
2008-01-24 06:35:44 +00:00
Evan Cheng
91089e6d66 Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type.
llvm-svn: 46286
2008-01-23 23:17:41 +00:00
Evan Cheng
d436c2e724 SSE varargs arguments are passed in memory.
llvm-svn: 46262
2008-01-22 23:26:53 +00:00
Chris Lattner
973072dc77 update this test to pass with duncan's change.
llvm-svn: 46246
2008-01-22 05:31:58 +00:00
Nick Lewycky
78780f175b Multiply can be evaluated in a different type, so long as the target type has
a smaller bitwidth.

llvm-svn: 46244
2008-01-22 05:08:48 +00:00
Devang Patel
6fae526290 New test.
llvm-svn: 46220
2008-01-21 22:15:58 +00:00
Devang Patel
6d3139addd New test.
llvm-svn: 46209
2008-01-21 19:28:13 +00:00
Dale Johannesen
7807e86260 Implement flt_rounds for PowerPC.
llvm-svn: 46174
2008-01-18 19:55:37 +00:00
Chris Lattner
49fd213770 remove extraneous &&'s from tests, as Scott is apparently not going to.
llvm-svn: 46173
2008-01-18 19:53:43 +00:00
Dale Johannesen
b2d9e41233 Test is correct again for the moment.
llvm-svn: 46172
2008-01-18 19:53:31 +00:00
Chris Lattner
febc7ea9bf Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to:
_test:
	fctiwz f0, f1
	stfiwx f0, 0, r4
	blr 

instead of:

_test:
	fctiwz f0, f1
	stfd f0, -8(r1)
	nop
	nop
	lwz r2, -4(r1)
	stb r2, 0(r4)
	blr 

The former is not correct (stores 4 bytes, not 1).

llvm-svn: 46161
2008-01-18 16:54:56 +00:00
Scott Michel
506e61bad1 Forward progress: crtbegin.c now compiles successfully!
Fixed CellSPU's A-form (local store) address mode, so that all globals,
externals, constant pool and jump table symbols are now wrapped within
a SPUISD::AFormAddr pseudo-instruction. This now identifies all local
store memory addresses, although it requires a bit of legerdemain during
instruction selection to properly select loads to and stores from local
store, properly generating "LQA" instructions.

Also added mul_ops.ll test harness for exercising integer multiplication.

llvm-svn: 46142
2008-01-17 20:38:41 +00:00