5127 Commits

Author SHA1 Message Date
Anton Korobeynikov
244a615291 Disable stack realignment for these tests
llvm-svn: 50172
2008-04-23 18:25:44 +00:00
Anton Korobeynikov
1898ce20e2 Fix test because ABI stack alignment dropped to 'normal' value
llvm-svn: 50171
2008-04-23 18:25:16 +00:00
Anton Korobeynikov
15c5a2ce26 Fix test, instruction count is valid only if stack is not realigned
llvm-svn: 50170
2008-04-23 18:24:48 +00:00
Chris Lattner
721ea7ca10 Rewrite multiple return value handling in SCCP. Before, the -sccp pass
would turn every getresult instruction into undef.  This helps with
rdar://5778210

llvm-svn: 50140
2008-04-23 05:38:20 +00:00
Chris Lattner
0dd624d232 remove this testcase. It isn't testing loop rotate, it is testing all
of -std-compile-opts and is now failing because other passes are generating
IR that looks different to input of loop rotate.  Devang, please 
introduce a testcase that only runs loop rotate.

llvm-svn: 50136
2008-04-23 05:36:04 +00:00
Chris Lattner
5a4e46d886 returning an empty multiple return list is not valid.
llvm-svn: 50135
2008-04-23 05:29:14 +00:00
Chris Lattner
be858fc296 make this test more interesting.
llvm-svn: 50128
2008-04-23 03:49:32 +00:00
Chris Lattner
d059ac2e32 distill down the essence of this test.
llvm-svn: 50125
2008-04-23 03:03:42 +00:00
Dale Johannesen
547a55caf1 new test
llvm-svn: 50123
2008-04-23 01:22:22 +00:00
Evan Cheng
680839e258 Don't do: "(X & 4) >> 1 == 2 --> (X & 4) == 4" if there is more than one use of the shift result.
llvm-svn: 50118
2008-04-23 00:38:06 +00:00
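As a rough illustration of the fold named in this commit (not the actual combiner code), the following Python sketch checks by brute force that the two comparisons agree for every 8-bit value. The function names are hypothetical. The commit only restricts when the fold fires; presumably, when the shift result has other uses, the shift has to stay anyway, so rewriting the compare saves nothing.

```python
# Hypothetical helpers modelling the two forms of the comparison.

def shifted_compare(x: int) -> bool:
    # Original form: mask bit 2, shift it down by one, compare with 2.
    return ((x & 4) >> 1) == 2


def folded_compare(x: int) -> bool:
    # Folded form: compare the masked value directly, no shift needed.
    return (x & 4) == 4


# Exhaustively compare both forms over all 8-bit inputs.
forms_agree = all(shifted_compare(x) == folded_compare(x) for x in range(256))
```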
Chris Lattner
e304ae5621 Start doing the significantly useful part of jump threading: handle cases
where a comparison has a phi input and that phi is a constant.  For example,
stuff like:

  Threading edge through bool from 'bb2149' to 'bb2231' with cost: 1, across block:
bb2237:		; preds = %bb2231, %bb2149
	%tmp2328.rle = phi i32 [ %tmp2232, %bb2231 ], [ %tmp2232439, %bb2149 ]		; <i32> [#uses=2]
	%done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ]		; <i32> [#uses=1]
	%tmp2239 = icmp eq i32 %done.0, 0		; <i1> [#uses=1]
	br i1 %tmp2239, label %bb2231, label %bb2327

or

bb38.i298:		; preds = %bb33.i295, %bb1693
	%tmp39.i296.rle = phi %struct.ibox* [ null, %bb1693 ], [ %tmp39.i296.rle1109, %bb33.i295 ]		; <%struct.ibox*> [#uses=2]
	%minspan.1.i291.reg2mem.1 = phi i32 [ 32000, %bb1693 ], [ %minspan.0.i288, %bb33.i295 ]		; <i32> [#uses=1]
	%tmp40.i297 = icmp eq %struct.ibox* %tmp39.i296.rle, null		; <i1> [#uses=1]
	br i1 %tmp40.i297, label %implfeeds.exit311, label %bb43.i301

This triggers thousands of times in spec.

llvm-svn: 50110
2008-04-22 21:40:39 +00:00
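A minimal Python sketch of the idea described in this commit, under the assumption that a block can be modelled as a phi feeding an equality compare that feeds a conditional branch. It is not the pass's implementation; `thread_compare` and its parameters are hypothetical names. For each predecessor whose incoming phi value is a constant, the compare result is known, so the edge can go straight to the matching successor.

```python
from typing import Dict, Optional


def thread_compare(
    phi_incoming: Dict[str, Optional[int]],
    compare_const: int,
    true_dest: str,
    false_dest: str,
) -> Dict[str, str]:
    """Map each predecessor with a constant phi input to its known successor.

    phi_incoming maps predecessor name -> constant value, or None when the
    incoming value is not a constant (such edges cannot be threaded here).
    """
    threaded = {}
    for pred, value in phi_incoming.items():
        if value is None:
            continue  # outcome of the compare is unknown on this edge
        threaded[pred] = true_dest if value == compare_const else false_dest
    return threaded


# Models block bb2237 from the commit message:
#   %done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ]
#   %tmp2239 = icmp eq i32 %done.0, 0
#   br i1 %tmp2239, label %bb2231, label %bb2327
edges = thread_compare({"bb2231": None, "bb2149": 0}, 0, "bb2231", "bb2327")
```

With these inputs the only threadable edge is the one from bb2149, which lands on bb2231, matching the "Threading edge through bool from 'bb2149' to 'bb2231'" line quoted in the message.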
Chris Lattner
c59cf9c8da Dig through multiple levels of AND to thread jumps if needed.
llvm-svn: 50106
2008-04-22 20:46:09 +00:00
Chris Lattner
dcbc6443ae Teach jump threading to thread through blocks like:
br (and X, phi(Y, Z, false)), label L1, label L2

This triggers once on 252.eon and 6 times on 176.gcc.  Blocks 
in question often look like this:

bb262:		; preds = %bb261, %bb248
	%iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ]		; <i1> [#uses=4]
	%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null		; <i1> [#uses=1]
	%bothcond = or i1 %iftmp.251.0, %tmp270		; <i1> [#uses=1]
	br i1 %bothcond, label %bb288, label %bb273

In this case, it is clear that it doesn't matter if tmp.0.i is null when coming from bb261.  When coming from bb248, it is all that matters.


Another random example:

check_asm_operands.exit:		; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413
	%tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ]		; <i1> [#uses=1]
	call void @llvm.stackrestore( i8* %savedstack ) nounwind 
	%tmp4389 = icmp eq i32 %added_sets_1.0, 0		; <i1> [#uses=1]
	%tmp4394 = icmp eq i32 %added_sets_2.0, 0		; <i1> [#uses=1]
	%bothcond80 = and i1 %tmp4389, %tmp4394		; <i1> [#uses=1]
	%bothcond81 = and i1 %bothcond80, %tmp.0.i420		; <i1> [#uses=1]
	br i1 %bothcond81, label %bb4398, label %bb4397

Here is the case from 252.eon:

bb290.i.i:		; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110
	%myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ]		; <i1> [#uses=2]
	%i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ]		; <i32> [#uses=3]
	%tmp292.i.i = load i8* %tmp16.i.i100, align 1		; <i8> [#uses=1]
	%tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0		; <i1> [#uses=1]
	%bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i		; <i1> [#uses=1]
	br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i
  Factoring out 3 common predecessors.

On the path from any blocks other than bb23.i57.i.i, the load and compare 
are dead.

llvm-svn: 50096
2008-04-22 07:05:46 +00:00
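The reasoning in this commit can be sketched in a few lines of Python, assuming the branch condition is an `and` or `or` whose one operand is a phi of i1 values. This is only a model of the Boolean argument, not the pass itself, and `known_branch_outcomes` is a hypothetical name. A `false` phi input forces an `and` to false and a `true` input forces an `or` to true, whatever the other operand is.

```python
from typing import Dict, Optional


def known_branch_outcomes(
    op: str, phi_incoming: Dict[str, Optional[bool]]
) -> Dict[str, bool]:
    """Return predecessors for which the branch condition is already decided.

    op is "and" or "or". phi_incoming maps predecessor name -> True, False,
    or None when the incoming value is not a constant.
    """
    # The absorbing value decides the result regardless of the other operand.
    absorbing = {"and": False, "or": True}[op]
    return {
        pred: absorbing
        for pred, value in phi_incoming.items()
        if value is absorbing
    }


# Block bb262: %bothcond = or i1 %iftmp.251.0, %tmp270
bb262 = known_branch_outcomes("or", {"bb261": True, "bb248": False})

# Block check_asm_operands.exit: %bothcond81 = and i1 %bothcond80, %tmp.0.i420
check_asm = known_branch_outcomes(
    "and",
    {
        "bb6.i413": True,
        "bb12.i": True,
        "bb30.i": True,
        "check_asm_operands.exit.thr_comm": False,
    },
)
```

For bb262 the condition is known to be true when arriving from bb261, which is the case the message describes as not depending on whether tmp.0.i is null.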
Chris Lattner
4638234905 add a basic testcase.
llvm-svn: 50093
2008-04-22 06:35:14 +00:00
Nick Lewycky
1b583954ad Start removing 'unwinds to' support from mainline in preparation for 2.3.
llvm-svn: 50086
2008-04-22 05:16:02 +00:00
Chris Lattner
14be19cf1e optimize "p != gep p, ..." better. This allows us to compile
getelementptr-seteq.ll into:

define i1 @test(i64 %X, %S* %P) {
	%C = icmp eq i64 %X, -1		; <i1> [#uses=1]
	ret i1 %C
}

instead of:

define i1 @test(i64 %X, %S* %P) {
	%A.idx.mask = and i64 %X, 4611686018427387903		; <i64> [#uses=1]
	%C = icmp eq i64 %A.idx.mask, 4611686018427387903		; <i1> [#uses=1]
	ret i1 %C
}

And fixes the second half of PR2235.  This speeds up the insertion sort
case by 45%, from 1.12s to 0.77s.  In practice, this will significantly
speed up for loops structured like:

for (double *P = Base + N; P != Base; --P)
  ...

Which happens frequently for C++ iterators.

llvm-svn: 50079
2008-04-22 02:53:33 +00:00
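One way to see why a pointer comparison such as `P != Base` in the loop above can reduce to a test on the index alone is sketched below in Python. The sketch assumes 64-bit addresses, 8-byte elements, and that the address computation does not wrap; those are assumptions of this illustration, not statements about how the transform is justified in the compiler. The names are hypothetical.

```python
ELEM_SIZE = 8                # assumed sizeof(double)
ADDR_MASK = (1 << 64) - 1    # assumed 64-bit address space


def pointer_differs(base: int, n: int) -> bool:
    # Compare Base + n against Base using wrapping 64-bit address arithmetic.
    return ((base + ELEM_SIZE * n) & ADDR_MASK) != (base & ADDR_MASK)


def index_nonzero(n: int) -> bool:
    # The simplified test: only the index is inspected.
    return n != 0


base = 0x1000

# For small indices, where nothing wraps, the two tests agree.
agree = all(
    pointer_differs(base, n) == index_nonzero(n) for n in range(-1024, 1025)
)

# An index large enough to wrap the address space breaks the equivalence,
# which is why the no-wrap assumption matters for this sketch.
wraps = pointer_differs(base, 1 << 61)
```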
Dan Gohman
93b5be1824 Implement an x86-64 ABI detail of passing structs by hidden first
argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.

Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.

llvm-svn: 50075
2008-04-21 23:59:07 +00:00
Duncan Sands
717a1e09aa Make these structs larger to ensure that they
are returned by struct return.

llvm-svn: 50038
2008-04-21 08:17:05 +00:00
Duncan Sands
cfb3631483 Make the struct bigger, to ensure it is returned
by struct return.

llvm-svn: 50037
2008-04-21 08:12:03 +00:00
Owen Anderson
bc6046416f Refactor memcpyopt based on Chris' suggestions. Consolidate several functions
and simplify code that was fallout from the separation of memcpyopt and gvn.

llvm-svn: 50034
2008-04-21 07:45:10 +00:00
Chris Lattner
2c5b96fbee A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2.
llvm-svn: 49986
2008-04-20 05:52:46 +00:00
Chris Lattner
8503e2d236 Not all x86-64 machines have sse3 apparently.
llvm-svn: 49985
2008-04-20 05:47:56 +00:00
Chris Lattner
a9d8d647ca rename *.llx -> *.ll, last batch.
llvm-svn: 49971
2008-04-19 22:32:52 +00:00
Chris Lattner
c310b1f1f3 rename *.llx -> *.ll
llvm-svn: 49970
2008-04-19 22:29:10 +00:00
Chris Lattner
63bd1df323 rename *.llx -> *.ll
llvm-svn: 49969
2008-04-19 22:26:29 +00:00
Chris Lattner
8cde1e71f0 Implement PR2206.
llvm-svn: 49967
2008-04-19 22:17:26 +00:00
Chris Lattner
1303e72c66 refactor handling of symbolic constant folding, picking up
a few new cases (see Integer/a1.ll), but not anything that
would happen in practice.

llvm-svn: 49965
2008-04-19 21:58:19 +00:00
Evan Cheng
f583b3feb6 64-bit atomic operations.
llvm-svn: 49949
2008-04-19 02:30:38 +00:00
Dan Gohman
ac2fac937c Teach llvm-as to accept function types with multiple return types.
llvm-svn: 49945
2008-04-19 00:24:39 +00:00
Evan Cheng
073659986f Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by an insert_subreg or extract_subreg.
llvm-svn: 49843
2008-04-17 07:58:04 +00:00
Owen Anderson
cd1b9c4b43 Make GVN able to remove unnecessary calls to read-only functions again.
llvm-svn: 49842
2008-04-17 05:36:50 +00:00
Evan Cheng
7c2c3333ca Fix a sub-register index propagation bug.
llvm-svn: 49832
2008-04-17 00:06:42 +00:00
Evan Cheng
e2e899b5c2 Don't forget about sub-register indices when rematting instructions.
llvm-svn: 49830
2008-04-16 23:44:44 +00:00
Evan Cheng
44a0a0c8ee After reading memory that's already freed.
llvm-svn: 49810
2008-04-16 20:24:25 +00:00
Evan Cheng
4b16ea6247 Really test what's intended.
llvm-svn: 49802
2008-04-16 18:21:55 +00:00
Evan Cheng
6d05ce493b Rewrite LiveVariable liveness computation. The new implementation is much simplified. It eliminated the nasty recursive routines and removed the partial def / use bookkeeping. There is also potential for performance improvement by replacing the conservative handling of partial physical register definitions. The code is currently disabled until live interval analysis is taught the new scheme.
This patch also fixed a couple of nasty corner cases.

llvm-svn: 49784
2008-04-16 09:46:40 +00:00
Owen Anderson
64fc7a4268 XFAIL this test for the moment. The real solution is to prevent ADCE
from transforming loops and adding a separate loop pass for removing
loops with known trip counts.  Until that happens, ADCE is miscompiling this code.

llvm-svn: 49769
2008-04-16 04:25:42 +00:00
Dan Gohman
be8f2b452b Add support for the form of the SSE41 extractps instruction that
puts its result in a 32-bit GPR.

llvm-svn: 49762
2008-04-16 02:32:24 +00:00
Dan Gohman
cf79877623 Recreate the size SDNode instead of reusing the old one in the x86
memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.

llvm-svn: 49761
2008-04-16 01:32:32 +00:00
Dan Gohman
7d27552962 Add movd instructions to move from MMX registers
to 64-bit GPR registers on x86-64.

llvm-svn: 49757
2008-04-15 23:55:07 +00:00
Dale Johannesen
45e14f7753 Don't assume a tail call can't reference a byval
argument to the outer function, this isn't correct.

llvm-svn: 49731
2008-04-15 17:41:34 +00:00
Dan Gohman
3b99b3c807 Treat EntryToken nodes as "passive" so that they aren't added to the
ScheduleDAG; they don't correspond to any actual instructions so they
don't need to be scheduled.

This fixes a bug where the EntryToken was being scheduled multiple
times in some cases, though it ended up not causing any trouble because 
EntryToken doesn't expand into anything. With this fixed the schedulers
reliably schedule the expected number of units, so we can check this
with an assertion.

This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it
ends up getting scheduled differently in a trivial way, though it was
enough to fool the prcontext+grep that the test does.

llvm-svn: 49701
2008-04-15 01:22:18 +00:00
Dan Gohman
cce2b42edc Upgrade these tests for the current intrinsic prototypes.
llvm-svn: 49669
2008-04-14 18:19:18 +00:00
Dale Johannesen
d9a9c746d8 Remove -unwind-tables-optional everywhere, since
this is now the default.

llvm-svn: 49667
2008-04-14 17:56:54 +00:00
Owen Anderson
a6d1d8dec2 The functionality being tested was removed because it was horribly unsafe.
llvm-svn: 49610
2008-04-13 09:51:06 +00:00
Arnold Schwaighofer
82af0e6a43 This patch corrects the handling of byval arguments for tailcall
optimized x86-64 (and x86) calls so that they work (... at least for
my test cases).

Should fix the following problems:

Problem 1: When I introduced the optimized handling of arguments for
tail called functions (using a sequence of copyto/copyfrom virtual
registers instead of always lowering to top of the stack) I did not
handle byval arguments correctly, i.e. they did not work at all :).

Problem 2: On x86-64 after the arguments of the tail called function
are moved to their registers (which include ESI/RSI etc), tail call
optimization performs byval lowering which causes xSI, xDI, xCX
registers to be overwritten. This is handled in this patch by moving
the arguments to virtual registers first and after the byval lowering
the arguments are moved from those virtual registers back to
RSI/RDI/RCX.

llvm-svn: 49584
2008-04-12 18:11:06 +00:00
Dan Gohman
15edbf989f Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.

Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.

This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.

Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.

This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.

llvm-svn: 49572
2008-04-12 04:36:06 +00:00
Dan Gohman
41f9d24d52 Fix a bug that prevented x86-64 from using rep.movsq for
8-byte-aligned data.

llvm-svn: 49571
2008-04-12 02:35:39 +00:00
Evan Cheng
6e52146f16 If a PHI node has a single implicit_def source, replace it with an implicit_def instead of a copy.
llvm-svn: 49543
2008-04-11 17:54:45 +00:00
Owen Anderson
15e930588a Add testcase for PR2213.
llvm-svn: 49517
2008-04-11 05:13:32 +00:00