Commit Graph

8810 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
e2913e1ad5 Print IV chain numbers while collecting them.
llvm-svn: 155567
2012-04-25 18:01:32 +00:00
Lang Hames
7f69fbca29 Reverting r155468. Chris and Chandler have convinced me that it's dangerous and
in poor taste.

Talking through some alternate solutions with Chandler.

llvm-svn: 155530
2012-04-25 02:16:54 +00:00
Dan Gohman
64171a0b3b Simplify the known retain count tracking; use a boolean state instead
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.

llvm-svn: 155513
2012-04-25 00:50:46 +00:00
Dan Gohman
3a24d34041 Build custom predecessor and successor lists for each basic block.
These lists exclude invoke unwind edges and loop backedges which
are being ignored. This makes it easier to ignore them
consistently.

llvm-svn: 155500
2012-04-24 22:53:18 +00:00
Lang Hames
08eb5f2340 Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes
<rdar://problem/11291436>.

llvm-svn: 155468
2012-04-24 18:58:36 +00:00
Jakob Stoklund Olesen
6c1440cf27 Reapply r155136 after fixing PR12599.
Original commit message:

Defer some shl transforms to DAGCombine.

The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155362
2012-04-23 17:39:52 +00:00
Alexander Potapenko
1f7d60fd8a Fix issue 67 by checking that the interface functions weren't redefined in the compiled source file.
llvm-svn: 155346
2012-04-23 10:47:31 +00:00
Kostya Serebryany
67e32a152a [tsan] use llvm/ADT/Statistic.h for tsan stats
llvm-svn: 155341
2012-04-23 08:44:59 +00:00
Jakob Stoklund Olesen
3d22f26e88 Revert r155136 "Defer some shl transforms to DAGCombine."
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.

I'll put the patch back in after fixing the SelectionDAG problem.

llvm-svn: 155181
2012-04-20 00:38:45 +00:00
Bill Wendling
fd7c52fe58 Put this expensive check below the less expensive ones.
llvm-svn: 155166
2012-04-19 23:31:07 +00:00
Dan Gohman
f4472e9a1f Avoid a bug in the path count computation, preventing an infinite
loop repeatedlt making the same change. This is for rdar://11256239.

llvm-svn: 155160
2012-04-19 21:50:46 +00:00
Jakob Stoklund Olesen
1507d20c57 Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155136
2012-04-19 16:46:26 +00:00
Dan Gohman
a99c119e05 Don't crash on code where the user put __attribute__((constructor)) on
a function with arguments. This fixes rdar://11265785.

llvm-svn: 155073
2012-04-18 22:24:33 +00:00
Bill Wendling
c37741ca5a Use a heavy hammer to fix PR12573.
If the loop contains invoke instructions, whose unwind edge escapes the loop,
then don't try to unswitch the loop. Doing so may cause the unwind edge to be
split, which not only is non-trivial but doesn't preserve loop simplify
information.

Fixes PR12573

llvm-svn: 154987
2012-04-18 06:00:09 +00:00
Andrew Trick
a5981a21f9 loop-reduce: Add an early bailout to catch extremely large loops.
This introduces a threshold of 200 IV Users, which is very
conservative but should be sufficient to avoid serious compile time
sink or stack overflow. The llvm test-suite with LTO never exceeds 190
users per loop.

The bug doesn't relate to a specific type of loop. Checking in an
arbitrary giant loop as a unit test would be silly.

Fixes rdar://11262507.

llvm-svn: 154983
2012-04-18 04:00:10 +00:00
Joe Groff
cc9c07aacc fix pr12559: mark unavailable win32 math libcalls
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

llvm-svn: 154960
2012-04-17 23:05:54 +00:00
Hal Finkel
5e614e7520 Fix style violation in BBVectorize (pointed out by Bill Wendling)
llvm-svn: 154810
2012-04-16 12:39:17 +00:00
Bill Wendling
66282c6d7f Add a Fixme.
llvm-svn: 154793
2012-04-16 04:23:52 +00:00
Hal Finkel
4f7adc1f50 Simplify checking for pointer types in BBVectorize (this change was suggested by Duncan).
llvm-svn: 154787
2012-04-16 03:49:42 +00:00
Hal Finkel
028d6e153e Fix an error in BBVectorize important for vectorizing pointer types.
When vectorizing pointer types it is important to realize that potential
pairs cannot be connected via the address pointer argument of a load or store.
This is because even after vectorization, the address is still a scalar because
the address of the higher half of the pair is implicit from the address of the
lower half (it need not be, and should not be, explicitly computed).

llvm-svn: 154735
2012-04-14 07:32:50 +00:00
Hal Finkel
c55edb7b35 Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs.
llvm-svn: 154734
2012-04-14 07:32:43 +00:00
Hal Finkel
12b4c41203 Add support to BBVectorize for vectorizing selects.
llvm-svn: 154700
2012-04-13 20:45:45 +00:00
Dan Gohman
0387e6b701 Add some comments, and fix a few places that missed setting Changed.
llvm-svn: 154687
2012-04-13 18:57:48 +00:00
Dan Gohman
d5743c7fd0 Consider ObjC runtime calls objc_storeWeak and others which make a copy of
their argument as "escape" points for objc_retainBlock optimization.
This fixes rdar://11229925.

llvm-svn: 154682
2012-04-13 18:28:58 +00:00
Hal Finkel
f8611de2a6 By default, use Early-CSE instead of GVN for vectorization cleanup.
As has been suggested by Duncan and others, Early-CSE and GVN should
do similar redundancy elimination, but Early-CSE is much less expensive.
Most of my autovectorization benchmarks show a performance regresion, but
all of these are < 0.1%, and so I think that it is still worth using
the less expensive pass.

llvm-svn: 154673
2012-04-13 17:15:33 +00:00
Dan Gohman
81ac0c921f Use the new Use-aware dominates method to apply the objc runtime
library return value optimization for phi uses. Even when the
phi itself is not dominated, the specific use may be dominated.

llvm-svn: 154647
2012-04-13 01:08:28 +00:00
Bill Wendling
8659a23f4a Code-gen may inject code into the IR before it emits the ASM. The linker
obviously cannot know that this code is present, let alone used. So prevent the
internalize pass from internalizing those global values which code-gen may
insert.

llvm-svn: 154645
2012-04-13 01:06:27 +00:00
Dan Gohman
6a5b02f8ee Don't move objc_autorelease calls past autorelease pool boundaries when
optimizing autorelease calls on phi nodes with null operands.
This fixes rdar://11207070.

llvm-svn: 154642
2012-04-13 00:59:57 +00:00
Chad Rosier
b41586c8e1 Typo.
llvm-svn: 154522
2012-04-11 19:21:58 +00:00
Chandler Carruth
80c3e3bbba Add two statistics to help track how we are computing the inline cost.
Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome.

llvm-svn: 154492
2012-04-11 10:15:10 +00:00
Kostya Serebryany
3047a70ed9 [tsan] two more compile-time optimizations:
- don't isntrument reads from constant globals.
Saves ~1.5% of instrumented instructions on CPU2006
(counting static instructions, not their execution).
- don't insrument reads from vtable (which is a global constant too).
Saves ~5%.

I did not measure the run-time impact of this,
but it is certainly non-negative.

llvm-svn: 154444
2012-04-10 22:29:17 +00:00
Kostya Serebryany
01d463472d [tsan] compile-time instrumentation: do not instrument a read if
a write to the same temp follows in the same BB.
Also add stats printing.

On Spec CPU2006 this optimization saves roughly 4% of instrumented reads
(which is 3% of all instrumented accesses):
Writes            : 161216
Reads             : 446458
Reads-before-write: 18295

llvm-svn: 154418
2012-04-10 18:18:56 +00:00
Andrew Trick
7230fee696 Fix 12513: Loop unrolling breaks with indirect branches.
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

llvm-svn: 154386
2012-04-10 05:14:42 +00:00
Andrew Trick
83a330c1b9 whitespace
llvm-svn: 154385
2012-04-10 05:14:37 +00:00
Chandler Carruth
b3fb4be360 Teach InstCombine to nuke a common alloca pattern -- an alloca which has
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).

llvm-svn: 154285
2012-04-08 14:36:56 +00:00
Hongbin Zheng
48758c581f Refactor: Use positive field names in VectorizeConfig.
llvm-svn: 154249
2012-04-07 03:56:23 +00:00
Chandler Carruth
020a15db9d Sink the collection of return instructions until after *all*
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.

llvm-svn: 154179
2012-04-06 17:21:31 +00:00
Duncan Sands
c7d0fdb71f Make GVN's propagateEquality non-recursive. No intended functionality change.
The modifications are a lot more trivial than they appear to be in the diff!

llvm-svn: 154174
2012-04-06 15:31:09 +00:00
Chandler Carruth
dc52b30dac Sink the return instruction collection until after we're done deleting
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus poniter to a return instruction that blows up
much further down the road.

It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.

Thanks to John Regehr's numerous test cases for teasing this out.

llvm-svn: 154157
2012-04-06 01:11:52 +00:00
Dan Gohman
a5e2200b2a Fix accidentally inverted logic from r152803, and make the
testcase slightly less trivial. This fixes rdar://11171718.

llvm-svn: 154118
2012-04-05 20:27:21 +00:00
Hongbin Zheng
4da3f9fa46 BBVectorize: Add the const modifier to the VectorizeConfig because we won't
modify it.

llvm-svn: 154098
2012-04-05 16:07:49 +00:00
Hongbin Zheng
7a4e40f87f Introduce the VectorizeConfig class, with which we can control the behavior
of the BBVectorizePass without using command line option. As pointed out
  by Hal, we can ask the TargetLoweringInfo for the architecture specific
  VectorizeConfig to perform vectorizing with architecture specific
  information.

llvm-svn: 154096
2012-04-05 15:46:55 +00:00
Hongbin Zheng
8d380b332d Add the function "vectorizeBasicBlock" which allow users vectorize a
BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the
 loop unroll pass right after the loop is unrolled.

llvm-svn: 154089
2012-04-05 08:05:16 +00:00
Jakob Stoklund Olesen
e1ae4f161c Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

llvm-svn: 154079
2012-04-05 03:10:56 +00:00
Rafael Espindola
88a1aeb123 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Hongbin Zheng
c50b4781ab LoopUnrollPass: Use variable "Threshold" instead of "CurrentThreshold" when
reducing unroll count, otherwise the reduced unroll count is not taking
  the "OptimizeForSize" attribute into account.

llvm-svn: 154007
2012-04-04 11:44:08 +00:00
Bill Wendling
1db4186413 Add an option to turn off the expensive GVN load PRE part of GVN.
llvm-svn: 153902
2012-04-02 22:16:50 +00:00
Stepan Dyatkovskiy
0ddc03ebad Fast fix for PR12343:
http://llvm.org/bugs/show_bug.cgi?id=12343

We have not trivial way for splitting edges that are goes from indirect branch. We can do it with some tricks, but it should be additionally discussed. And it is still dangerous due to difficulty of indirect branches controlling.

Fix forbids this case for unswitching.

llvm-svn: 153879
2012-04-02 17:16:45 +00:00
Chandler Carruth
8b30ff4d0f Belatedly address some code review from Chris.
As a side note, I really dislike array_pod_sort... Do we really still
care about any STL implementations that get this so wrong? Does libc++?

llvm-svn: 153834
2012-04-01 10:41:24 +00:00
Chandler Carruth
1a2234d527 Fix a pretty scary bug I introduced into the always inliner with
a single missing character. Somehow, this had gone untested. I've added
tests for returns-twice logic specifically with the always-inliner that
would have caught this, and fixed the bug.

Thanks to Matt for the careful review and spotting this!!! =D

llvm-svn: 153832
2012-04-01 10:21:05 +00:00