Devang Patel
a8d63b0b30
Do not strip llvm.used values.
...
llvm-svn: 46045
2008-01-16 03:33:05 +00:00
Duncan Sands
81e35b4d47
I noticed that the trampoline straightening transformation could
...
drop attributes on varargs call arguments. Also, it could generate
invalid IR if the transformed call already had the 'nest' attribute
somewhere (this can never happen for code coming from llvm-gcc,
but it's a theoretical possibility). Fix both problems.
llvm-svn: 45973
2008-01-14 19:52:09 +00:00
Chris Lattner
efb498eddb
Fix the miscompilation of MiBench/consumer-lame that was exposed by Evan's
...
byval work. This miscompilation is due to the program indexing an array out
of range and us doing a transformation that broke this.
llvm-svn: 45949
2008-01-14 02:09:12 +00:00
Chris Lattner
5d135cc20a
The isNotSuitableForSRA property is now dead, don't compute it.
...
llvm-svn: 45948
2008-01-14 01:32:52 +00:00
Chris Lattner
760b05a331
Change SRAGlobal to not depend on isNotSuitableForSRA, which makes it very
...
difficult to understand the invariants.
llvm-svn: 45947
2008-01-14 01:31:05 +00:00
Chris Lattner
c2933482a0
Make the 'shrink global to bool' optimization more self contained, and thus
...
easier to show that its safe. No functionality change.
llvm-svn: 45946
2008-01-14 01:17:44 +00:00
Chris Lattner
d22a5f6314
Turn a memcpy from a double* into a load/store of double instead of
...
a load/store of i64. The later prevents promotion/scalarrepl of the
source and dest in many cases.
This fixes the 300% performance regression of the byval stuff on
stepanov_v1p2.
llvm-svn: 45945
2008-01-14 00:28:35 +00:00
Chris Lattner
8560bb9d98
factor memcpy/memmove simplification out to its own SimplifyMemTransfer
...
method, no functionality change.
llvm-svn: 45944
2008-01-13 23:50:23 +00:00
Chris Lattner
5fbf76aaf4
simplify some code. If we can infer alignment for source and dest that are
...
greater than memcpy alignment, and if we lower to load/store, use the best
alignment info we have.
llvm-svn: 45943
2008-01-13 22:30:28 +00:00
Chris Lattner
4f69f1a721
simplify some code by adding a InsertBitCastBefore method,
...
make memmove->memcpy conversion a bit simpler.
llvm-svn: 45942
2008-01-13 22:23:22 +00:00
Chris Lattner
32eae5daa5
Fix PR1907, a nasty miscompilation because instcombine didn't
...
realize that ne & sgt was a signed comparison (it was only
looking at whether the left compare was signed).
llvm-svn: 45937
2008-01-13 20:59:02 +00:00
Duncan Sands
560625b701
Small simplification.
...
llvm-svn: 45932
2008-01-13 08:12:17 +00:00
Duncan Sands
7414cc131b
When turning a call to a bitcast function into a direct call,
...
if this becomes a varargs call then deal correctly with any
parameter attributes on the newly vararg call arguments.
llvm-svn: 45931
2008-01-13 08:02:44 +00:00
Chris Lattner
d90840eddc
we don't have to make an explicit copy of a byval argument when
...
inlining a function if we know that the function does not write
to *any* memory. This implements test/Transforms/Inline/byval2.ll
llvm-svn: 45912
2008-01-12 18:54:29 +00:00
Chris Lattner
fb5876f0be
Allow clients to specify the inline threshold when creating
...
the inliner pass. Patch by Robert Zeh.
llvm-svn: 45903
2008-01-12 06:49:13 +00:00
Duncan Sands
6f49217a5e
When DAE drops the varargs part of a function, ensure any
...
attributes on the vararg call arguments are also dropped.
llvm-svn: 45892
2008-01-11 23:13:45 +00:00
Chris Lattner
a1246ba8ed
Teach argpromote to ruthlessly hack small byval structs when it can
...
get away with it, which exposes opportunities to eliminate the memory
objects entirely. For example, we now compile byval.ll to:
define internal void @f1(i32 %b.0, i64 %b.1) {
entry:
%tmp2 = add i32 %b.0, 1 ; <i32> [#uses=0]
ret void
}
define i32 @main() nounwind {
entry:
call void @f1( i32 1, i64 2 )
ret i32 0
}
This seems like it would trigger a lot for code that passes around small
structs (e.g. SDOperand's or _Complex)...
llvm-svn: 45886
2008-01-11 22:31:41 +00:00
Chris Lattner
8644bdca87
Use smallptrset instead of std::set for efficiency.
...
llvm-svn: 45878
2008-01-11 19:36:30 +00:00
Chris Lattner
44aaf42d14
a byval argument is guaranteed to be valid to load.
...
llvm-svn: 45877
2008-01-11 19:34:32 +00:00
Chris Lattner
129a0e4f7d
Update this code to use eraseFromParent where possible. Compute
...
whether an argument is byval and pass into isSafeToPromoteArgument.
llvm-svn: 45876
2008-01-11 19:20:39 +00:00
Chris Lattner
85a0b511cc
replace a loop with a constant time check.
...
llvm-svn: 45875
2008-01-11 18:55:10 +00:00
Chris Lattner
c9666b967f
another minor datastructure tweak.
...
llvm-svn: 45874
2008-01-11 18:47:45 +00:00
Chris Lattner
a6b0783f14
start using smallvector to avoid vector heap thrashing.
...
llvm-svn: 45873
2008-01-11 18:43:58 +00:00
Chris Lattner
bf51fecdc4
When inlining a functino with a byval argument, make an explicit
...
copy of it in case the callee modifies the struct.
llvm-svn: 45853
2008-01-11 06:09:30 +00:00
Chris Lattner
67f581b344
Implement PR1795, an instcombine hack for forming GEPs with integer pointer arithmetic.
...
llvm-svn: 45745
2008-01-08 07:23:51 +00:00
Duncan Sands
7955cf0cd7
Small cleanup for handling of type/parameter attribute
...
incompatibility.
llvm-svn: 45704
2008-01-07 17:16:06 +00:00
Gordon Henriksen
f0803127c6
Deleting an empty file. Thanks, /usr/bin/patch!
...
llvm-svn: 45675
2008-01-07 02:29:04 +00:00
Gordon Henriksen
db4f51e1b9
With this patch, the LowerGC transformation becomes the
...
ShadowStackCollector, which additionally has reduced overhead with
no sacrifice in portability.
Considering a function @fun with 8 loop-local roots,
ShadowStackCollector introduces the following overhead
(x86):
; shadowstack prologue
movl L_llvm_gc_root_chain$non_lazy_ptr, %eax
movl (%eax), %ecx
movl $___gc_fun, 20(%esp)
movl $0, 24(%esp)
movl $0, 28(%esp)
movl $0, 32(%esp)
movl $0, 36(%esp)
movl $0, 40(%esp)
movl $0, 44(%esp)
movl $0, 48(%esp)
movl $0, 52(%esp)
movl %ecx, 16(%esp)
leal 16(%esp), %ecx
movl %ecx, (%eax)
; shadowstack loop overhead
(none)
; shadowstack epilogue
movl 48(%esp), %edx
movl %edx, (%ecx)
; shadowstack metadata
.align 3
___gc_fun: # __gc_fun
.long 8
.space 4
In comparison to LowerGC:
; lowergc prologue
movl L_llvm_gc_root_chain$non_lazy_ptr, %eax
movl (%eax), %ecx
movl %ecx, 48(%esp)
movl $8, 52(%esp)
movl $0, 60(%esp)
movl $0, 56(%esp)
movl $0, 68(%esp)
movl $0, 64(%esp)
movl $0, 76(%esp)
movl $0, 72(%esp)
movl $0, 84(%esp)
movl $0, 80(%esp)
movl $0, 92(%esp)
movl $0, 88(%esp)
movl $0, 100(%esp)
movl $0, 96(%esp)
movl $0, 108(%esp)
movl $0, 104(%esp)
movl $0, 116(%esp)
movl $0, 112(%esp)
; lowergc loop overhead
leal 44(%esp), %eax
movl %eax, 56(%esp)
leal 40(%esp), %eax
movl %eax, 64(%esp)
leal 36(%esp), %eax
movl %eax, 72(%esp)
leal 32(%esp), %eax
movl %eax, 80(%esp)
leal 28(%esp), %eax
movl %eax, 88(%esp)
leal 24(%esp), %eax
movl %eax, 96(%esp)
leal 20(%esp), %eax
movl %eax, 104(%esp)
leal 16(%esp), %eax
movl %eax, 112(%esp)
; lowergc epilogue
movl 48(%esp), %edx
movl %edx, (%ecx)
; lowergc metadata
(none)
llvm-svn: 45670
2008-01-07 01:30:53 +00:00
Duncan Sands
fd975e4b3d
The transform that tries to turn calls to bitcast functions into
...
direct calls bails out unless caller and callee have essentially
equivalent parameter attributes. This is illogical - the callee's
attributes should be of no relevance here. Rework the logic, which
incidentally fixes a crash when removed arguments have attributes.
llvm-svn: 45658
2008-01-06 18:27:01 +00:00
Duncan Sands
b8489f09a2
When transforming a call to a bitcast function into
...
a direct call with cast parameters and cast return
value (if any), instcombine was prepared to cast any
non-void return value into any other, whether castable
or not. Add a new predicate for testing whether casting
is valid, and check it both for the return value and
(as a cleanup) for the parameters.
llvm-svn: 45657
2008-01-06 10:12:28 +00:00
Chris Lattner
7e1c3aa702
remove a couple more unsafe xforms in the face of overflow.
...
llvm-svn: 45613
2008-01-05 01:22:42 +00:00
Chris Lattner
983697dfac
remove the (x-y) < 0 comparison xform, it miscompiles
...
things that are not equality comparisons, for example:
(2147479553+4096)-2147479553 < 0 != (2147479553+4096) < 2147479553
llvm-svn: 45612
2008-01-05 01:18:20 +00:00
Wojciech Matyjewicz
9ec15f974f
fix typo
...
llvm-svn: 45594
2008-01-04 20:02:18 +00:00
Chris Lattner
d4c66656a1
Fix PR1896
...
llvm-svn: 45568
2008-01-04 05:04:53 +00:00
Chris Lattner
26b89fd30a
don't hoist FP additions into unconditional adds + selects. This
...
could theoretically introduce a trap, but is also a performance issue.
This speeds up ptrdist/ks by 8%.
llvm-svn: 45533
2008-01-03 07:25:26 +00:00
Chris Lattner
028f584087
add missing #include
...
llvm-svn: 45516
2008-01-02 23:41:05 +00:00
Chris Lattner
ad9a6ccb83
Remove attribution from file headers, per discussion on llvmdev.
...
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Chris Lattner
8193d4af33
remove attribution from lib Makefiles.
...
llvm-svn: 45415
2007-12-29 20:09:26 +00:00
Christopher Lamb
dfad5f19b4
Disable null pointer folding transforms for non-generic address spaces. This should probably be a target-specific predicate based on address space. That way for targets where this isn't applicable the predicate can be optimized away.
...
llvm-svn: 45403
2007-12-29 07:56:53 +00:00
Chris Lattner
2369d2f4ab
dead calls to llvm.stacksave can be deleted, even though they
...
have potential side-effects.
llvm-svn: 45392
2007-12-29 00:59:12 +00:00
Owen Anderson
ebd3e9c500
Repair a transform that Chris noticed a bug in. Thanks to Nicholas for pointing out my stupid mistakes when writing this patch. :-)
...
llvm-svn: 45384
2007-12-28 07:42:12 +00:00
Chris Lattner
2456399ce5
disable this instcombine xform, it miscompiles:
...
define i32 @main() {
entry:
%z = alloca i32 ; <i32*> [#uses=2]
store i32 0, i32* %z
%tmp = load i32* %z ; <i32> [#uses=1]
%sub = sub i32 %tmp, 1 ; <i32> [#uses=1]
%cmp = icmp ult i32 %sub, 0 ; <i1> [#uses=1]
%retval = select i1 %cmp, i32 1, i32 0 ; <i32> [#uses=1]
ret i32 %retval
}
into ret 1, instead of ret 0.
Christopher, please investigate.
llvm-svn: 45383
2007-12-28 06:24:31 +00:00
Gordon Henriksen
90d48b077d
Fixing several transforms which would drop the collector attribute
...
when copying functions.
llvm-svn: 45356
2007-12-25 22:16:06 +00:00
Chris Lattner
90df7f7424
Don't break critical edges for single-bb loops, this helps with PR1877, though
...
it is only a partial fix. This change is noise for most programs, but
speeds up Shootout-C++/matrix by 20%, Ptrdist/ks by 24%, smg2000 by 8%,
hexxagon by 9%, bzip2 by 9% (not sure I trust this), ackerman by 13%, etc.
OTOH, it slows down Shootout/fib2 by 40% (I'll update PR1877 with this info).
llvm-svn: 45354
2007-12-25 19:06:45 +00:00
Gordon Henriksen
c0a3899bbf
GC poses hazards to the inliner. Consider:
...
define void @f() {
...
call i32 @g()
...
}
define void @g() {
...
}
The hazards are:
- @f and @g have GC, but they differ GC. Inlining is invalid. This
may never occur.
- @f has no GC, but @g does. g's GC must be propagated to @f .
The other scenarios are safe:
- @f and @g have the same GC.
- @f and @g have no GC.
- @g has no GC.
This patch adds inliner checks for the former two scenarios.
llvm-svn: 45351
2007-12-25 03:10:07 +00:00
Chris Lattner
7e1e1f2933
add a -backedge-hack llc-beta option to codegenprepare.
...
When specified, don't split backedges of single-bb loops.
This helps address PR1877
llvm-svn: 45344
2007-12-24 19:32:55 +00:00
Chris Lattner
d64df490ca
implement InstCombine/shift-trunc-shift.ll. This allows
...
us to compile:
#include <math.h>
int t1(double d) { return signbit(d); }
into:
_t1:
movd %xmm0, %rax
shrq $63, %rax
ret
instead of:
_t1:
movd %xmm0, %rax
shrq $32, %rax
shrl $31, %eax
ret
on x86-64.
llvm-svn: 45311
2007-12-22 09:07:47 +00:00
Devang Patel
e035f776e9
If succ has succ itself as one of the predecessors then do
...
not merge current bb and succ even if bb's terminator is
unconditional branch to succ.
llvm-svn: 45305
2007-12-22 01:32:53 +00:00
Duncan Sands
85ca85c070
Make DAE not wipe out attributes on calls, and not drop
...
return attributes on the floor. In the case of a call
to a varargs function where the varargs arguments are
being removed, any call attributes on those arguments
need to be dropped. I didn't do this because I plan to
make it illegal to have such attributes (see next patch).
With this change, compiling the gcc filter2 eh test at -O0
and then running opt -std-compile-opts on it results in
a correctly working program (compiling at -O1 or higher
results in the test failing due to a problem with how we
output eh info into the IR).
llvm-svn: 45285
2007-12-21 19:16:16 +00:00
Christopher Lamb
7ca648a7b1
Implement review feedback, including additional transforms
...
(icmp slt (sub A B) 1) -> (icmp sle A B)
icmp sgt (sub A B) -1) -> (icmp sge A B)
and add testcase.
llvm-svn: 45256
2007-12-20 07:21:11 +00:00