4000 Commits

Author SHA1 Message Date
Chris Lattner
e304ae5621 Start doing the significantly useful part of jump threading: handle cases
where a comparison has a phi input and that phi is a constant.  For example,
stuff like:

  Threading edge through bool from 'bb2149' to 'bb2231' with cost: 1, across block:
bb2237:		; preds = %bb2231, %bb2149
	%tmp2328.rle = phi i32 [ %tmp2232, %bb2231 ], [ %tmp2232439, %bb2149 ]		; <i32> [#uses=2]
	%done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ]		; <i32> [#uses=1]
	%tmp2239 = icmp eq i32 %done.0, 0		; <i1> [#uses=1]
	br i1 %tmp2239, label %bb2231, label %bb2327

or

bb38.i298:		; preds = %bb33.i295, %bb1693
	%tmp39.i296.rle = phi %struct.ibox* [ null, %bb1693 ], [ %tmp39.i296.rle1109, %bb33.i295 ]		; <%struct.ibox*> [#uses=2]
	%minspan.1.i291.reg2mem.1 = phi i32 [ 32000, %bb1693 ], [ %minspan.0.i288, %bb33.i295 ]		; <i32> [#uses=1]
	%tmp40.i297 = icmp eq %struct.ibox* %tmp39.i296.rle, null		; <i1> [#uses=1]
	br i1 %tmp40.i297, label %implfeeds.exit311, label %bb43.i301

This triggers thousands of times in spec.

llvm-svn: 50110
2008-04-22 21:40:39 +00:00
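
The decision described in the commit above can be sketched with a toy model. This is only an illustration under stated assumptions: PhiInput, NoThread, threadedTarget and usageExample are hypothetical names, blocks are represented by plain integers, and none of this is the pass's actual code.

```cpp
#include <cassert>

// Hypothetical toy model of one phi operand (not LLVM's PHINode API).
struct PhiInput {
  bool isConstant;  // true when the value arriving on this edge is a constant
  int value;        // the constant; only meaningful when isConstant is true
};

// Sentinel meaning "this edge cannot be threaded".
const int NoThread = -1;

// Models "br (icmp eq phi, rhs), trueDest, falseDest" for a single incoming
// edge. Returns the block the edge could jump to directly, or NoThread when
// the incoming value is not a constant and the compare must still execute.
int threadedTarget(PhiInput in, int rhs, int trueDest, int falseDest) {
  if (!in.isConstant)
    return NoThread;
  return in.value == rhs ? trueDest : falseDest;
}

// Usage, with block numbers borrowed from the bb2237 dump in the commit.
void usageExample() {
  // From bb2149 the phi is 0, so "icmp eq 0, 0" is true: go to bb2231.
  assert(threadedTarget(PhiInput{true, 0}, 0, 2231, 2327) == 2231);
  // From bb2231 the phi is %done.2, which is not a constant.
  assert(threadedTarget(PhiInput{false, 0}, 0, 2231, 2327) == NoThread);
}
```

The first check mirrors the "Threading edge through bool from 'bb2149' to 'bb2231'" line in the commit message.
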
Chris Lattner
c59cf9c8da Dig through multiple levels of AND to thread jumps if needed.
llvm-svn: 50106
2008-04-22 20:46:09 +00:00
Chris Lattner
dcbc6443ae Teach jump threading to thread through blocks like:
br (and X, phi(Y, Z, false)), label L1, label L2

This triggers once on 252.eon and 6 times on 176.gcc.  Blocks 
in question often look like this:

bb262:		; preds = %bb261, %bb248
	%iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ]		; <i1> [#uses=4]
	%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null		; <i1> [#uses=1]
	%bothcond = or i1 %iftmp.251.0, %tmp270		; <i1> [#uses=1]
	br i1 %bothcond, label %bb288, label %bb273

In this case, it is clear that it doesn't matter if tmp.0.i is null when coming from bb261.  When coming from bb248, it is all that matters.


Another random example:

check_asm_operands.exit:		; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413
	%tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ]		; <i1> [#uses=1]
	call void @llvm.stackrestore( i8* %savedstack ) nounwind 
	%tmp4389 = icmp eq i32 %added_sets_1.0, 0		; <i1> [#uses=1]
	%tmp4394 = icmp eq i32 %added_sets_2.0, 0		; <i1> [#uses=1]
	%bothcond80 = and i1 %tmp4389, %tmp4394		; <i1> [#uses=1]
	%bothcond81 = and i1 %bothcond80, %tmp.0.i420		; <i1> [#uses=1]
	br i1 %bothcond81, label %bb4398, label %bb4397

Here is the case from 252.eon:

bb290.i.i:		; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110
	%myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ]		; <i1> [#uses=2]
	%i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ]		; <i32> [#uses=3]
	%tmp292.i.i = load i8* %tmp16.i.i100, align 1		; <i8> [#uses=1]
	%tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0		; <i1> [#uses=1]
	%bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i		; <i1> [#uses=1]
	br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i
  Factoring out 3 common predecessors.

On the path from any blocks other than bb23.i57.i.i, the load and compare 
are dead.

llvm-svn: 50096
2008-04-22 07:05:46 +00:00
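
The reasoning in the commit above (a constant phi input can decide an and/or condition on its own) can be sketched as follows. This is a toy model under stated assumptions: Op, Branch, foldOnEdge and usageExample are hypothetical names, not part of the pass.

```cpp
#include <cassert>

enum class Op { And, Or };
enum class Branch { Unknown, TakenTrue, TakenFalse };

// Models "br (op X, phi), L1, L2" for one incoming edge on which the phi is
// the constant phiConst. X is unknown, so only the absorbing constant of each
// operator decides the branch:
//   and X, false == false     or X, true == true
Branch foldOnEdge(Op op, bool phiConst) {
  if (op == Op::And && !phiConst)
    return Branch::TakenFalse;
  if (op == Op::Or && phiConst)
    return Branch::TakenTrue;
  return Branch::Unknown;  // the result still depends on X
}

// Usage, following the bb262 dump in the commit.
void usageExample() {
  // From bb261 the phi is true and the operator is "or": always bb288.
  assert(foldOnEdge(Op::Or, true) == Branch::TakenTrue);
  // From bb248 the phi is false: the null check on tmp.0.i is all that matters.
  assert(foldOnEdge(Op::Or, false) == Branch::Unknown);
}
```

In the 252.eon block the operator is "and", so the edge from bb23.i57.i.i (phi is false) is the one this model can decide.
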
Chris Lattner
003d69adef refactor some code, no functionality change.
llvm-svn: 50094
2008-04-22 06:36:15 +00:00
Chris Lattner
8837037473 remove dead code.
llvm-svn: 50080
2008-04-22 03:21:48 +00:00
Chris Lattner
14be19cf1e optimize "p != gep p, ..." better. This allows us to compile
getelementptr-seteq.ll into:

define i1 @test(i64 %X, %S* %P) {
	%C = icmp eq i64 %X, -1		; <i1> [#uses=1]
	ret i1 %C
}

instead of:

define i1 @test(i64 %X, %S* %P) {
	%A.idx.mask = and i64 %X, 4611686018427387903		; <i64> [#uses=1]
	%C = icmp eq i64 %A.idx.mask, 4611686018427387903		; <i1> [#uses=1]
	ret i1 %C
}

And fixes the second half of PR2235.  This speeds up the insertion sort
case by 45%, from 1.12s to 0.77s.  In practice, this will significantly
speed up for loops structured like:

for (double *P = Base + N; P != Base; --P)
  ...

Which happens frequently for C++ iterators.

llvm-svn: 50079
2008-04-22 02:53:33 +00:00
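
For reference, a complete function with the loop shape quoted in the commit above might look like this. It is a minimal sketch: sumBackward and usageExample are hypothetical names, and the loop body is invented for illustration. The exit test compares a pointer derived from Base against Base itself, which is presumably the "p != gep p, ..." pattern the commit refers to.

```cpp
#include <cassert>
#include <cstddef>

// Sums N doubles by walking a pointer backward from one past the end.
double sumBackward(const double *Base, std::size_t N) {
  double Sum = 0.0;
  for (const double *P = Base + N; P != Base; --P)
    Sum += P[-1];  // P points one past the element being read
  return Sum;
}

void usageExample() {
  const double Data[] = {1.0, 2.0, 3.5};
  assert(sumBackward(Data, 3) == 6.5);
  assert(sumBackward(Data, 0) == 0.0);  // empty range: the loop never runs
}
```
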
Chris Lattner
6e88cf849b fix grammar-o, thanks to Duncan for noticing.
llvm-svn: 50047
2008-04-21 18:25:01 +00:00
Owen Anderson
b171c54227 Remove unneeded #include's.
llvm-svn: 50035
2008-04-21 07:47:38 +00:00
Owen Anderson
bc6046416f Refactor memcpyopt based on Chris' suggestions. Consolidate several functions
and simplify code that was fallout from the separation of memcpyopt and gvn.

llvm-svn: 50034
2008-04-21 07:45:10 +00:00
Chris Lattner
ca82082080 don't assume that the argument passed to fprintf("%s" is a string. This
fixes a crash in opt on 433.milc.

llvm-svn: 50023
2008-04-21 03:18:33 +00:00
Chris Lattner
9a21fbcf81 Use the new SplitBlockPredecessors to implement a todo.
llvm-svn: 50022
2008-04-21 02:57:57 +00:00
Chris Lattner
57e11a167d Move SplitBlockPredecessors out of loopsimplify into BasicBlockUtils.h
as a global helper function.  At the same time, switch it from taking
a vector of predecessors to an arbitrary sequential input.  This allows
us to switch LoopSimplify to use a SmallVector for various temporary
vectors that it passed into SplitBlockPredecessors.

llvm-svn: 50020
2008-04-21 01:28:02 +00:00
Chris Lattner
ba4567c335 Move domtree/frontier updating earlier, allowing us to use it to update phi
nodes, removing a hack.

llvm-svn: 50019
2008-04-21 01:05:08 +00:00
Chris Lattner
53fa4554be Factor dominator tree and frontier updating into SplitBlockPredecessors
instead of doing it after every call.

llvm-svn: 50018
2008-04-21 00:54:38 +00:00
Chris Lattner
34894abd7f fit some more code in 80 cols.
llvm-svn: 50016
2008-04-21 00:25:49 +00:00
Chris Lattner
7da61d5766 simplify code, fit in 80 cols.
llvm-svn: 50015
2008-04-21 00:23:14 +00:00
Chris Lattner
45b3ace5a3 fit in 80 cols
llvm-svn: 50014
2008-04-21 00:19:16 +00:00
Chris Lattner
56fbe56c25 finish the first cut of a jump threading pass implementation.
llvm-svn: 50006
2008-04-20 22:39:42 +00:00
Chris Lattner
4037996a0c replace a slow and verbose version of Instruction::isUsedOutsideOfBlock with
a call to Instruction::isUsedOutsideOfBlock.

llvm-svn: 50005
2008-04-20 22:18:22 +00:00
Chris Lattner
c714a8cee8 we can only thread blocks when there is a pred we can determine the succ of.
llvm-svn: 50003
2008-04-20 21:18:09 +00:00
Chris Lattner
4fcfb808fb improve comments, infrastructure, and add some validity checks for threading.
Add a cost function.

llvm-svn: 50002
2008-04-20 21:13:06 +00:00
Chris Lattner
c8c74f39db Add a new Jump Threading pass, which will handle cases
such as those in PR2235.  Right now the pass is not very
effective. :)

llvm-svn: 50000
2008-04-20 20:35:01 +00:00
Torok Edwin
e038c595c1 g++-4.3 build-fix: CHAR_BIT requires <climits>.
llvm-svn: 49989
2008-04-20 08:33:11 +00:00
Chris Lattner
f390d62b7f Switch to using Simplified ConstantFP::get API.
llvm-svn: 49977
2008-04-20 00:41:09 +00:00
Chris Lattner
d299f7b8cf Allow argpromote to promote struct arguments with a specified number
of elements.  Patch by Matthijs Kooijman!

llvm-svn: 49962
2008-04-19 19:50:01 +00:00
Owen Anderson
cd1b9c4b43 Make GVN able to remove unnecessary calls to read-only functions again.
llvm-svn: 49842
2008-04-17 05:36:50 +00:00
Scott Michel
31b3639a2d Remove unused variable
llvm-svn: 49838
2008-04-17 01:30:44 +00:00
Scott Michel
4b37c88f48 Workaround for PR2207, in which pred_iterator assert gets triggered due to a
wee problem in Xcode 2.[45]/gcc 4.0.1.

llvm-svn: 49831
2008-04-16 23:46:39 +00:00
Chuck Rose III
fbfb612c4e VisualStudio project files updated. #include <algorithm> added to make VisualStudio happy. Also had to undefine setjmp because of #include <csetjmp> turning setjmp into _setjmp in VisualStudio.
llvm-svn: 49743
2008-04-15 21:27:11 +00:00
Dan Gohman
77049e31b6 Remove unnecessary <sstream> includes.
llvm-svn: 49681
2008-04-14 20:40:47 +00:00
Dan Gohman
3e7d0f3882 Minor whitespace and comment cleanups.
llvm-svn: 49671
2008-04-14 18:26:16 +00:00
Owen Anderson
8aaa632351 Revert r49614. As Dan pointed out, some of these aren't correct.
llvm-svn: 49657
2008-04-14 17:38:21 +00:00
Owen Anderson
b54defaff0 Replace calls of the form V1->setName(V2->getName()) with V1->takeName(V2),
which is significantly more efficient.

llvm-svn: 49614
2008-04-13 19:15:17 +00:00
Owen Anderson
f55bae07b7 Fix PR2213 by simultaneously making GVN more aggressive with the return values
of calls and less aggressive with non-readnone calls.

llvm-svn: 49516
2008-04-11 05:11:49 +00:00
Dan Gohman
318d9a6605 Teach InstCombine's ComputeMaskedBits to handle pointer expressions
in addition to integer expressions. Rewrite GetOrEnforceKnownAlignment
as a ComputeMaskedBits problem, moving all of its special alignment
knowledge to ComputeMaskedBits as low-zero-bits knowledge.

Also, teach ComputeMaskedBits a few basic things about Mul and PHI
instructions.

This improves ComputeMaskedBits-based simplifications in a few cases,
but more noticeably it significantly improves instcombine's alignment
detection for loads, stores, and memory intrinsics.

llvm-svn: 49492
2008-04-10 18:43:06 +00:00
Chris Lattner
0bd4d7eef1 Disable an xform we've had for a long time, pow(x,0.5) -> sqrt.
This is not safe for all inputs.

llvm-svn: 49458
2008-04-10 02:07:51 +00:00
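
The commit above does not say which inputs are unsafe. Two cases where the functions are specified to differ, assuming a math library that follows C Annex F (IEEE 754) and no fast-math flags, are -0.0 and negative infinity. The sketch below checks both; HalfPower, halfPower and usageExample are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Both ways of computing "x to the power one half", side by side.
struct HalfPower {
  double viaPow;
  double viaSqrt;
};

HalfPower halfPower(double x) {
  HalfPower r = {std::pow(x, 0.5), std::sqrt(x)};
  return r;
}

void usageExample() {
  const double inf = std::numeric_limits<double>::infinity();

  HalfPower zero = halfPower(-0.0);
  assert(!std::signbit(zero.viaPow));  // pow(-0.0, 0.5) is +0.0
  assert(std::signbit(zero.viaSqrt));  // sqrt(-0.0) is -0.0

  HalfPower negInf = halfPower(-inf);
  assert(std::isinf(negInf.viaPow) && negInf.viaPow > 0);  // +infinity
  assert(std::isnan(negInf.viaSqrt));                      // NaN

  // For an ordinary positive input the two agree.
  HalfPower four = halfPower(4.0);
  assert(four.viaPow == 2.0 && four.viaSqrt == 2.0);
}
```
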
Chris Lattner
be01a5f699 Generalize getUnaryFloatFunction to handle any FP unary function, automatically
figuring out the suffix to use.  implement pow(2,x) -> exp2(x).

llvm-svn: 49437
2008-04-09 17:48:11 +00:00
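
A source-level view of the pow(2,x) -> exp2(x) rewrite from the commit above, as a minimal sketch. The function names are hypothetical; the float overloads stand in for the powf/exp2f pairing that the "suffix" remark suggests, which is an assumption about the commit's intent.

```cpp
#include <cassert>
#include <cmath>

// Before the rewrite: a general pow call with a constant base of 2.
double twoToThe(double x) { return std::pow(2.0, x); }
float twoToThe(float x) { return std::pow(2.0f, x); }

// After the rewrite: the dedicated base-2 exponential.
double twoToTheRewritten(double x) { return std::exp2(x); }
float twoToTheRewritten(float x) { return std::exp2(x); }

void usageExample() {
  // Exactly representable results make the comparison safe to do with ==.
  assert(twoToThe(10.0) == 1024.0);
  assert(twoToTheRewritten(10.0) == 1024.0);
  assert(twoToThe(-1.0f) == 0.5f);
  assert(twoToTheRewritten(-1.0f) == 0.5f);
}
```
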
Chris Lattner
0eb93cf8d0 use the new ConstantFP::get method to make this work with
long double and simplify the code.

llvm-svn: 49435
2008-04-09 17:17:35 +00:00
Devang Patel
47b4a18b75 Be conservative if getresult operand is neither call nor invoke.
llvm-svn: 49430
2008-04-09 15:58:24 +00:00
Owen Anderson
ca7e0e21f3 Factor a bunch of functionality related to memcpy and memset transforms out of
GVN and into its own pass.

llvm-svn: 49419
2008-04-09 08:23:16 +00:00
Owen Anderson
0d844f6205 Remove accidentally duplicated code.
llvm-svn: 49418
2008-04-09 07:55:01 +00:00
Chris Lattner
976ea8990e many cleanups to the pow optimizer. Allow it to handle powf,
add support for  pow(x, 2.0) -> x*x.

llvm-svn: 49411
2008-04-09 00:07:45 +00:00
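
The pow(x, 2.0) -> x*x rewrite from the commit above, seen at the source level. This is a minimal sketch with hypothetical function names; the checks use inputs whose squares are exactly representable.

```cpp
#include <cassert>
#include <cmath>

// Before the rewrite: a library call.
double squareViaPow(double x) { return std::pow(x, 2.0); }

// After the rewrite: a single multiply.
double squareViaMul(double x) { return x * x; }

void usageExample() {
  assert(squareViaPow(3.0) == 9.0);
  assert(squareViaMul(3.0) == 9.0);
  assert(squareViaPow(-1.5) == squareViaMul(-1.5));  // both give 2.25
}
```
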
Devang Patel
1cf9e1e337 Fix insert point handling for multiple return values.
llvm-svn: 49367
2008-04-08 02:24:08 +00:00
Owen Anderson
4ad5a5201c Add operator= implementations to SparseBitVector, allowing it to be used in GVN. This results
in both time and memory savings for GVN.  For example, one testcase went from 10.5s to 6s with
this patch.

llvm-svn: 49345
2008-04-07 17:38:23 +00:00
Duncan Sands
64f15131d8 Use Intrinsic::getDeclaration in more places.
llvm-svn: 49338
2008-04-07 13:45:04 +00:00
Duncan Sands
98ed2df5f3 The "stacksave is not nounwind problem" no longer
needs to be fixed here - a previous commit made sure
that intrinsics always get the right attributes.
So remove no-longer needed code, and while there use
Intrinsic::getDeclaration rather than getOrInsertFunction. 

llvm-svn: 49337
2008-04-07 13:43:58 +00:00
Duncan Sands
9622724bc3 Use Intrinsic::getDeclaration to get hold of
intrinsics.  Fix up the argument type (should
be i8*, was an array*).

llvm-svn: 49336
2008-04-07 13:41:19 +00:00
Owen Anderson
93ab00f1d9 Make GVN more memory efficient, particularly on code that contains a large number of
allocations, which GVN can't optimize anyways.

llvm-svn: 49329
2008-04-07 09:59:07 +00:00
Dale Johannesen
5c2c09c01b Mark calls to llvm.stacksave, llvm.stackrestore as
nounwind.  When such calls are inlined into something
else that is invoked, they were getting changed to invokes,
which is badness.

llvm-svn: 49299
2008-04-07 00:08:48 +00:00
Chris Lattner
f8fac07b94 silence a warning when assertions are disabled.
llvm-svn: 49283
2008-04-06 21:44:08 +00:00