Commit Graph

26406 Commits

Author SHA1 Message Date
Nick Lewycky
dd2222ab27 Remove redundant test for vector-nature. Scan the vector first to see whether
our optz'n will apply to it, then build the replacement vector only if needed.

llvm-svn: 61279
2008-12-20 16:48:00 +00:00
Dan Gohman
ab5072f624 Use SmallVector's pop_back_val.
llvm-svn: 61277
2008-12-20 16:42:33 +00:00
Dan Gohman
8c5bea15ca Use the correct Preds and Succs lists in setHeightDirty()
and setDepthDirty(), respectively. This fixes PR3241.

llvm-svn: 61276
2008-12-20 16:34:57 +00:00
Dan Gohman
e75b2ce6e2 Use ~0u instead of -1u as the special value, to hopefully avoid
warnings on compilers that warn about such things.

llvm-svn: 61263
2008-12-19 22:23:43 +00:00
Evan Cheng
da55c4ffb7 Fix PR3149. If an early clobber def is a physical register and it is tied to an input operand, it effectively extends the live range of the physical register. Currently we do not have a good way to represent this.
172     %ECX<def> = MOV32rr %reg1039<kill>
180     INLINEASM <es:subl $5,$1
        sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>,
36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
188     %EAX<def> = MOV32rr %EAX<kill>
196     %ECX<def> = MOV32rr %ECX<kill>
204     %ECX<def> = MOV32rr %ECX<kill>
212     %EAX<def> = MOV32rr %EAX<kill>
220     %EAX<def> = MOV32rr %EAX
228     %reg1039<def> = MOV32rr %ECX<kill>

The early clobber operand ties ECX input to the ECX def.

The live interval of ECX is represented as this:
%reg20,inf = [46,47:1)[174,230:0)  0@174-(230) 1@46-(47)

The right way to represent this is something like
%reg20,inf = [46,47:2)[174,182:1)[181:230:0)  0@174-(182) 1@181-230 @2@46-(47)

Of course that won't work since that means overlapping live ranges defined by two val#.

The workaround for now is to add a bit to val# which says the val# is redefined by a early clobber def somewhere. This prevents the move at 228 from being optimized away by SimpleRegisterCoalescing::AdjustCopiesBackFrom.

llvm-svn: 61259
2008-12-19 20:58:01 +00:00
John Criswell
b7e13addf7 The fields for the stoppoint debug intrinsic have not changed, so update the
version number assertions.

llvm-svn: 61257
2008-12-19 19:56:36 +00:00
Gordon Henriksen
1f4a555efc C bindings for dyn_cast_or_null.
This operation can be used to build dyn_cast, isa, and cast.

llvm-svn: 61252
2008-12-19 18:39:45 +00:00
Chris Lattner
7819d9c8be Add support for writing LLVM IR to a specified BitstreamWriter.
Patch by Lukasz Janyst!

llvm-svn: 61251
2008-12-19 18:37:59 +00:00
Dan Gohman
22b7b328a4 Move the patterns which have i8 immediates before the patterns
that have i32 immediates so that they get selected first. This
currently only matters in the JIT, as assemblers will
automatically use the smallest encoding.

llvm-svn: 61250
2008-12-19 18:25:21 +00:00
Evan Cheng
17b53ef5b0 - CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges.
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.

llvm-svn: 61248
2008-12-19 18:03:11 +00:00
Chris Lattner
27c3b1df00 Fix some release-assert warnings
llvm-svn: 61244
2008-12-19 17:03:38 +00:00
Rafael Espindola
7593f0004f Fix bug 3202.
The EH_frame and .eh symbols are now private, except for darwin9 and earlier.
The patch also fixes the definition of PrivateGlobalPrefix on pcc linux.

llvm-svn: 61242
2008-12-19 10:55:56 +00:00
Nick Lewycky
4f2d81176d Update the .cvs files for nocapture.
llvm-svn: 61241
2008-12-19 09:41:54 +00:00
Nick Lewycky
b8719a653f Commit missed files from nocapture change.
llvm-svn: 61240
2008-12-19 09:38:31 +00:00
Nick Lewycky
8f96b51785 Resubmit support for the 'nocapture' attribute.
The problematic part of this patch is that we were out of attribute bits,
requiring some fancy bit hacking to make it fit (by shrinking alignment)
without breaking existing users or the file format.

This change will require users to rebuild llvm-gcc to match llvm.

llvm-svn: 61239
2008-12-19 06:39:12 +00:00
Bill Wendling
d4a3c71eb1 Perform this loop only when the -debug flag is specified.
llvm-svn: 61238
2008-12-19 02:09:57 +00:00
Dan Gohman
3991753a76 Initialize the ImplicitDefed member, to avoid getting stale
data from a previous block.

llvm-svn: 61237
2008-12-19 00:46:20 +00:00
Bill Wendling
4ca9e94f91 Didn't mean to commit this.
llvm-svn: 61222
2008-12-18 22:19:50 +00:00
Dan Gohman
42b2f38113 Teach LowerSubregs to preserve kill/dead information when lowering
subreg instructions.

llvm-svn: 61220
2008-12-18 22:14:08 +00:00
Bill Wendling
5ec9cb2217 Re-XFAIL this test until debug stuff settles down.
llvm-svn: 61219
2008-12-18 22:13:31 +00:00
Dan Gohman
ca2ab1f2c8 Make LowerSubregs' debug output for EXTRACT_SUBREG consistent with
that of INSERT_SUBREG and SUBREG_TO_REG.

llvm-svn: 61218
2008-12-18 22:11:34 +00:00
Dan Gohman
7000e62d3a Fix a copy+pasto in an assertion message.
llvm-svn: 61217
2008-12-18 22:07:25 +00:00
Dan Gohman
34e47d552b Fix indentation level.
llvm-svn: 61216
2008-12-18 22:06:01 +00:00
Dan Gohman
1c74326cea When emitting instructions that define EFLAGS and the EFLAGS value isn't
used, mark the defs as dead.

llvm-svn: 61215
2008-12-18 22:03:42 +00:00
Dan Gohman
54790143b2 When setting up the frame pointer, add it as a live-in register to all
non-entry blocks, so that it doesn't appear use-before-def anywhere.

llvm-svn: 61214
2008-12-18 22:01:52 +00:00
Dan Gohman
47de8c174c Print subreg information in MachineInstr::dump.
llvm-svn: 61213
2008-12-18 21:51:27 +00:00
Mon P Wang
9f8945c5b9 Fixed x86 code generation of multiple for v2i64. It was incorrect for SSE4.1.
llvm-svn: 61211
2008-12-18 21:42:19 +00:00
Mon P Wang
84ad2a383d Added support for vector widening.
llvm-svn: 61209
2008-12-18 20:03:17 +00:00
Evan Cheng
d3d1efc584 Remove dead comments.
llvm-svn: 61201
2008-12-18 09:01:18 +00:00
Nick Lewycky
c6e4019d57 Oops! Left out a line.
Simplifying the sdiv might allow further simplifications for our users.

llvm-svn: 61196
2008-12-18 06:42:28 +00:00
Nick Lewycky
ab50d88e6a Make all the vector elements positive in an srem of constant vector.
llvm-svn: 61195
2008-12-18 06:31:11 +00:00
Chris Lattner
6ecf1b2bb1 Fix PR2929 by making bugpoint/code extract propagate the nothrow
bit from the original function to the cloned one.

llvm-svn: 61194
2008-12-18 05:52:56 +00:00
Dan Gohman
fae8a30dce Give MachineLICM a name, for -time-passes etc.
llvm-svn: 61184
2008-12-18 01:37:56 +00:00
Dan Gohman
6b4f972c9f Move post-RA scheduling before branch folding for now, because branch
folding's tail merging doesn't currently preserve liveness information
which post-RA scheduling requires.

llvm-svn: 61183
2008-12-18 01:36:42 +00:00
Owen Anderson
9a489bf18a Re-apply r61158 in a form that no longer breaks tests.
llvm-svn: 61182
2008-12-18 01:27:19 +00:00
Dale Johannesen
4209bca535 Revert previous patch, appears to break bootstrap.
llvm-svn: 61181
2008-12-18 01:23:41 +00:00
Dan Gohman
fb30c38893 Mark the x86 fp stack registers as "reserved". This tells LiveVariables
and the RegisterScavenger not to expect traditional liveness 
techniques are applicable to these registers, since we don't fully
modify the effects of push and pop after stackification.

llvm-svn: 61179
2008-12-18 01:05:09 +00:00
Dale Johannesen
3e0c1f771b Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  (This patch does not handle 
all the cases where this can happen.)  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Everything above is exercised in
CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
the same IR).

llvm-svn: 61178
2008-12-18 00:57:22 +00:00
Chris Lattner
d159077cb9 reapply this hunk from Bill's reversion in r61169, it is conservative
and safe and orthogonal from turning off load pre.

llvm-svn: 61177
2008-12-18 00:51:32 +00:00
Chris Lattner
005d68a2a9 make instnamer name unnamed blocks as well as instructions and args.
llvm-svn: 61175
2008-12-18 00:33:11 +00:00
Bill Wendling
3eb7c0254b Temporarily revert r61027. It was causing a bootstrap failure in "release" mode
with everyone's favorite error messages:

Comparing stages 2 and 3
warning: ./cc1-checksum.o differs
warning: ./cc1plus-checksum.o differs
Bootstrap comparison failure!
./c-decl.o differs
./cp/decl.o differs
./df-core.o differs
./gcc.o differs
./i386.o differs
./stor-layout.o differs
./tree-pretty-print.o differs
./tree.o differs
make[2]: *** [compare] Error 1
make[1]: *** [stage3-bubble] Error 2

See PR3227.

llvm-svn: 61169
2008-12-17 23:31:20 +00:00
Devang Patel
ceeecba890 Today the front-ends (llvm-gcc and clang) generate multiple llvm.dbg.compile_units to identify source file for various debug entities. Each llvm.dbg.compile_unit matches one file on the disk. However, the backend only supports one DW_TAG_compile_unit per .o file. The backend selects first compile_unit from the vector to construct DW_TAG_compile_unit entry, which is not correct in all cases.
First step to resolve this is, record file name and directory directly in debug info for various debug entities. 

llvm-svn: 61164
2008-12-17 22:39:29 +00:00
Owen Anderson
5f1bc95673 Revert r61158 for now, as it caused some test failures.
llvm-svn: 61159
2008-12-17 22:17:27 +00:00
Owen Anderson
446162d848 Fix miscompilations caused by renumbering, and enable it as part of prealloc splitting.
llvm-svn: 61158
2008-12-17 22:06:59 +00:00
Chris Lattner
a2aa680882 This adds some missing functions to the C binding:
- ability to insert previously created instructions using a builder
- creation of aliases
- creation of inline asm constants

Patch by Zoltan Varga!

llvm-svn: 61153
2008-12-17 21:39:50 +00:00
Bill Wendling
d364440e53 Forgot to revert r61031 when I reverted r61019, r61030, and r61040.
llvm-svn: 61150
2008-12-17 20:59:57 +00:00
Mon P Wang
bc3622287b Fix expansion of vsetcc to set the high bit for true instead of 1.
llvm-svn: 61129
2008-12-17 08:49:47 +00:00
Chris Lattner
c6134bffaf insert some sequence points and preincrement an iterator to avoid
iterator invalidation problems.

llvm-svn: 61124
2008-12-17 05:42:08 +00:00
Chris Lattner
196c166a06 Enhance heap sra to be substantially more aggressive w.r.t PHI
nodes.  This allows it to do fairly general phi insertion if a 
load from a pointer global wants to be SRAd but the load is used
by (recursive) phi nodes.  This fixes a pessimization on ppc
introduced by Load PRE.

llvm-svn: 61123
2008-12-17 05:28:49 +00:00
Dan Gohman
a8796f4908 Double the amount of memory reserved for SUnits. This is a
temporary workaround for an obscure bug. When node cloning is
used, it is possible that more SUnits will be created, and
if the SUnits std::vector has to reallocate, it will
invalidate all the graph edges.

llvm-svn: 61122
2008-12-17 04:30:46 +00:00
Dan Gohman
6ee60e3ac3 Use getDepth() and getHeight() instead of accessing the
Depth and Height members directly, as they may not be
current.

llvm-svn: 61121
2008-12-17 04:25:52 +00:00
Eli Friedman
4aae828bf8 Fix for PR3225: disable a broken optimization in
DAGTypeLegalizer::ExpandShiftWithKnownAmountBit.

In terms of restoring the optimization, the best fix here isn't 
obvious... any ideas?

llvm-svn: 61119
2008-12-17 03:35:17 +00:00
Dale Johannesen
7a81d1b0ab Clarify that the scale factor from CheckForIVReuse
can be negative.  Keep track of whether all uses of
an IV are outside the loop.  Some cosmetics; no
functional change.

llvm-svn: 61109
2008-12-16 22:16:28 +00:00
Dale Johannesen
e348900657 A new dag combine; several permutations of this
are there under ADD, this one was missing.

llvm-svn: 61107
2008-12-16 22:13:49 +00:00
Owen Anderson
36aba82416 Add code to renumber split intervals into new vregs. This is disabled for now until I finish working out some iterator invalidation issues.
llvm-svn: 61104
2008-12-16 21:35:08 +00:00
Chris Lattner
c4cc4a328f Fix another crash found by inspection. If we have a PHI node merging
the load multiple times, make sure the check the uses of the PHI to 
ensure they are transformable.

llvm-svn: 61102
2008-12-16 21:24:51 +00:00
Chris Lattner
8b1f2f76d7 fix a crash found by inspection.
llvm-svn: 61101
2008-12-16 21:04:51 +00:00
Eli Friedman
de614f9842 Add a helper to remove a branch and DCE the condition, and use it
consistently for deleting branches.  In addition to being slightly 
more readable, this makes SimplifyCFG a bit better 
about cleaning up after itself when it makes conditions unused.

llvm-svn: 61100
2008-12-16 20:54:32 +00:00
Dan Gohman
38316f49bd Eliminate the loop that walks the critical path. Instead, just track the
position in the critical path during the main instruction walk.  This
eliminates the need for the CritialAntiDep DenseMap.

llvm-svn: 61096
2008-12-16 19:27:52 +00:00
Bill Wendling
f807a68f2e Temporarily revert r61019, r61030, and r61040. These were breaking LLVM Release
builds.

llvm-svn: 61094
2008-12-16 19:06:48 +00:00
Dan Gohman
f8f5625748 Preserve SourceValue information when lowering produces multiple loads from
different offsets within the same stack slot.

llvm-svn: 61093
2008-12-16 18:25:36 +00:00
Evan Cheng
96d87db03b We have decided not to support inline asm where an output operand with a matching input operand with incompatible type (i.e. either one is a floating point and the other is an integer or the sizes of the types differ). SelectionDAGBuild will catch these and exit with an error.
llvm-svn: 61092
2008-12-16 18:21:39 +00:00
Oscar Fuentes
9a300fe336 CMake: Added DbgInfoPrinter.cpp to lib/Analysis/CMakeFiles.txt.
llvm-svn: 61087
2008-12-16 12:25:04 +00:00
Torok Edwin
9562aa1742 Add -print-dbginfo pass that prints LLVM IR with comments inserted to show
which source/line a certain BB/instruction comes from, original variable names,
and original (unmangled) C++ name of functions.

llvm-svn: 61085
2008-12-16 09:09:19 +00:00
Torok Edwin
fe974a7ca9 Add utility functions to search for DbgStopPointInst corresponding to an
instruction or BasicBlock, and to search for DbgDeclareInst corresponding to a
variable.

llvm-svn: 61084
2008-12-16 09:07:36 +00:00
Torok Edwin
2e2c464771 use different name for parameter to make it clear that we set DIDescriptor::GV
llvm-svn: 61083
2008-12-16 09:06:01 +00:00
Nick Lewycky
1b0fc83809 Generalize support for analyzing loops to include SLE/SGE loop exit conditions
and support for non-unit strides with signed exit conditions.

llvm-svn: 61082
2008-12-16 08:30:01 +00:00
Chris Lattner
e35c79577f switch some std::set/std::map to SmallPtrSet/DenseMap.
llvm-svn: 61081
2008-12-16 07:34:30 +00:00
Chris Lattner
b3becc5776 fix PR3217: fully cached queries need to be verified against the
visited set before they are used.  If used, their blocks need to be
added to the visited set so that subsequent queries don't use conflicting
pointer values in the cache result blocks.

llvm-svn: 61080
2008-12-16 07:10:09 +00:00
Dan Gohman
10eb3ccaeb Enable anti-dependence breaking by default when post-RA scheduling is enabled.
llvm-svn: 61078
2008-12-16 06:21:45 +00:00
Dan Gohman
9f37a0296b When breaking an anti-dependency, don't use a register which has seen
one of its aliases defined. This is conservative, but tricky subreg
corner cases are outside the primary aim of this pass.

llvm-svn: 61077
2008-12-16 06:20:58 +00:00
Dan Gohman
c3e24d559b Add initial support for back-scheduling address computations,
especially in the case of addresses computed from loop induction
variables.

llvm-svn: 61075
2008-12-16 03:35:01 +00:00
Dan Gohman
e2cf452271 Remove some special-case logic in ScheduleDAGSDNodes's
latency computation code that is no longer needed with the
new method for handling latencies.

llvm-svn: 61074
2008-12-16 03:31:11 +00:00
Dan Gohman
40a40dd7c1 Fix some register-alias-related bugs in the post-RA scheduler liveness
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.

Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.

Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.

llvm-svn: 61073
2008-12-16 03:25:46 +00:00
Dan Gohman
67e694b0ea Add a simple target-independent heuristic to allow targets with no
instruction itinerary data to back-schedule loads.

llvm-svn: 61070
2008-12-16 02:38:22 +00:00
Dan Gohman
8ddcdef08a Move addPred and removePred out-of-line.
llvm-svn: 61067
2008-12-16 01:05:52 +00:00
Dan Gohman
23aae3bba9 Make addPred and removePred return void, since the return value is not
currently used by anything.

llvm-svn: 61066
2008-12-16 01:00:55 +00:00
Dan Gohman
d6ad3f6178 This getEdgeAttributes doesn't need a template argument.
llvm-svn: 61065
2008-12-16 00:55:00 +00:00
Chris Lattner
9255745f90 enhance heap-sra to apply to fixed sized array allocations, not just
variable sized array allocations.

llvm-svn: 61051
2008-12-15 21:44:34 +00:00
Mon P Wang
bb3c2994f0 Added support for splitting and scalarizing vector shifts.
llvm-svn: 61050
2008-12-15 21:44:00 +00:00
Chris Lattner
2356082b5e Use stripPointerCasts.
llvm-svn: 61047
2008-12-15 21:20:32 +00:00
Chris Lattner
15ac84e027 minor tweaks for formatting, allow bitcast in ValueIsOnlyUsedLocallyOrStoredToOneGlobal.
llvm-svn: 61046
2008-12-15 21:08:54 +00:00
Chris Lattner
592852605f refactor some code into a new TryToOptimizeStoreOfMallocToGlobal function.
Use GetElementPtrInst::hasAllZeroIndices where possible.

llvm-svn: 61045
2008-12-15 21:02:25 +00:00
Chris Lattner
0e79aa6595 Teach basicaa to use the nocapture attribute when possible. When the
intrinsics are properly marked nocapture, the fixme should be addressed.

llvm-svn: 61040
2008-12-15 18:59:22 +00:00
Dan Gohman
f3c46b3496 Fix printing of PseudoSourceValues in SDNode graphs.
llvm-svn: 61036
2008-12-15 17:28:10 +00:00
Chris Lattner
f678691da6 add some more notes.
llvm-svn: 61033
2008-12-15 08:32:28 +00:00
Chris Lattner
8119a1f70d Add a testcase for GCC PR 23455, which lpre handles now. Add some
comments about why we're not getting other cases.

llvm-svn: 61032
2008-12-15 07:49:24 +00:00
Nick Lewycky
212b42c4c0 Update generated files after nocapture syntax change.
llvm-svn: 61031
2008-12-15 07:31:07 +00:00
Nick Lewycky
504288e7af It turns out that "align 1" and unaligned are different. Add a bias to the
alignment attribute such that 0 means unaligned.

This will probably require a rebuild of llvm-gcc because of the change to
Attributes.h. If you see many test failures on "make check", please rebuild
your llvm-gcc.

llvm-svn: 61030
2008-12-15 07:29:55 +00:00
Mon P Wang
2f96113348 Added support to LegalizeType for expanding the operands of scalar to vector
and insert vector element.  Modified extract vector element to extend the
result to match the expected promoted type.

llvm-svn: 61029
2008-12-15 06:57:02 +00:00
Chris Lattner
30c1871282 gvn now hoists this load out of the hot non-call path.
llvm-svn: 61028
2008-12-15 06:34:48 +00:00
Chris Lattner
b467a5b4a5 Enable Load PRE. This teaches GVN to push partially redundant loads up the
CFG when there is exactly one predecessor where the load is not available.
This is designed to not increase code size but still eliminate partially
redundant loads.  This fires 1765 times on 403.gcc even though it doesn't
do critical edge splitting yet (the most common reason for it to fail).

llvm-svn: 61027
2008-12-15 05:28:29 +00:00
Chris Lattner
be89ad1615 if we have a phi translation failure of the start block,
return *just* a clobber of the start block, not other 
random stuff as well.

llvm-svn: 61026
2008-12-15 04:58:29 +00:00
Owen Anderson
90af4c9640 Ifdef out some code that I didn't mean to enable by default yet.
llvm-svn: 61024
2008-12-15 03:52:17 +00:00
Chris Lattner
22cfa14eed make GVN try to rename inputs to the resultant replaced values, which
cleans up the generated code a bit.  This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)

llvm-svn: 61023
2008-12-15 03:46:38 +00:00
Chris Lattner
c92b131639 Implement initial support for PHI translation in memdep. This means that
memdep keeps track of how PHIs affect the pointer in dep queries, which 
allows it to eliminate the load in cases like rle-phi-translate.ll, which
basically end up being:

BB1:
   X = load P
   br BB3
BB2:
   Y = load Q
   br BB3
BB3:
   R = phi [P] [Q]
   load R

turning "load R" into a phi of X/Y.  In addition to additional exposed
opportunities, this makes memdep safe in many cases that it wasn't before
(which is required for load PRE) and also makes it substantially more 
efficient.  For example, consider:


bb1:  // has many predecessors.
   P = some_operator()
   load P

In this example, previously memdep would scan all the predecessors of BB1
to see if they had something that would mustalias P.  In some cases (e.g.
test/Transforms/GVN/rle-must-alias.ll) it would actually find them and end
up eliminating something.  In many other cases though, it would scan and not
find anything useful.  MemDep now stops at a block if the pointer is defined
in that block and cannot be phi translated to predecessors.  This causes it
to miss the (rare) cases like rle-must-alias.ll, but makes it faster by not
scanning tons of stuff that is unlikely to be useful.  For example, this
speeds up GVN as a whole from 3.928s to 2.448s (60%)!.  IMO, scalar GVN 
should be enhanced to simplify the rle-must-alias pointer base anyway, which
would allow the loads to be eliminated.

In the future, this should be enhanced to phi translate through geps and 
bitcasts as well (as indicated by FIXMEs) making memdep even more powerful.

llvm-svn: 61022
2008-12-15 03:35:32 +00:00
Owen Anderson
c2d2c0bdf3 Add support for slow-path GVN with full phi construction for scalars. This is disabled for now, as it actually pessimizes code in the abscence
of phi translation for load elimination.  This slow down GVN a bit, by about 2% on 403.gcc.

llvm-svn: 61021
2008-12-15 02:03:00 +00:00
Nick Lewycky
120e01b631 Fix whitespace in comment.
Remove TODO; icmp isn't a binary operator, so this function will never deal
with them.

llvm-svn: 61020
2008-12-15 01:35:36 +00:00
Nick Lewycky
8bdae4db80 Introducing nocapture, a parameter attribute for pointers to indicate that the
callee will not introduce any new aliases of that pointer.

The attributes had all bits allocated already, so I decided to collapse
alignment. Alignment was previously stored as a 16-bit integer from bits 16 to
32 of the attribute, but it was required to be a power of 2. Now it's stored in
log2 encoded form in five bits from 16 to 21. That gives us 11 more bits of
space.

You may have already noticed that you only need four bits to encode a 16-bit
power of two, so why five bits? Because the AsmParser accepted 32-bit
alignments, even though we couldn't store them (they were silently discarded).
Now we can store them in memory, but not in the bitcode.

The bitcode format was already storing these as 64-bit VBR integers. So, the
bitcode format stays the same, keeping the alignment values stored as 16 bit
raw values. There's some hideous code in the reader and writer that deals with
this, waiting to be ripped out the moment we run out of bits again and have to
replace the parameter attributes table encoding.

llvm-svn: 61019
2008-12-15 01:34:58 +00:00
Chris Lattner
10a0fb1e83 silence warning when asserts disabled.
llvm-svn: 61014
2008-12-14 21:38:24 +00:00
Chris Lattner
05dda70cd4 silence warning when asserts disabled.
llvm-svn: 61013
2008-12-14 21:37:33 +00:00
Chris Lattner
9458712db4 eliminate warning when asserts disabled.
llvm-svn: 61012
2008-12-14 21:36:23 +00:00
Owen Anderson
47efff5b14 Generalize GVN's phi construciton routine to work for things other than loads.
llvm-svn: 61009
2008-12-14 19:10:35 +00:00
Duncan Sands
ef671b5627 Reapply r60997, this time without forgetting that
target constants are allowed to have an illegal
type.

llvm-svn: 61006
2008-12-14 09:43:15 +00:00
Bill Wendling
380fbdc9f8 Temporarily revert r60997. It was causing this failure:
Running /Users/void/llvm/llvm.src/test/CodeGen/Generic/dg.exp ...
FAIL: /Users/void/llvm/llvm.src/test/CodeGen/Generic/asm-large-immediate.ll
Failed with exit(1) at line 1
while running:  llvm-as < /Users/void/llvm/llvm.src/test/CodeGen/Generic/asm-large-immediate.ll |  llc | /usr/bin/grep 68719476738
Assertion failed: ((TypesNeedLegalizing || getTypeAction(VT) == Legal) && "Illegal type introduced after type legalization?"), function HandleOp, file /Users/void/llvm/llvm.src/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp, line 493.
0   llc               0x0085392e char const* std::find<char const*, char>(char const*, char const*, char const&) + 98
1   llc               0x00853e63 llvm::sys::PrintStackTraceOnErrorSignal() + 593
2   libSystem.B.dylib 0x96cac09b _sigtramp + 43
3   libSystem.B.dylib 0xffffffff _sigtramp + 1765097359
4   libSystem.B.dylib 0x96d24ec2 raise + 26
5   libSystem.B.dylib 0x96d3447f abort + 73
6   libSystem.B.dylib 0x96d26063 __assert_rtn + 101
7   llc               0x004f9018 llvm::cast_retty<llvm::SubprogramDesc, llvm::DebugInfoDesc*>::ret_type llvm::cast<llvm::Sub
...

llvm-svn: 61001
2008-12-13 23:53:00 +00:00
Duncan Sands
7cddec2a2f LegalizeDAG is not supposed to introduce illegal
types into the DAG if they were not already there.
Check this with an assertion.

llvm-svn: 60997
2008-12-13 22:33:38 +00:00
Chris Lattner
0be74c4208 These messages should always be emitted when NDEBUG is unset, not when
NDEBUG is unset and -debug is passed.

llvm-svn: 60986
2008-12-13 18:37:58 +00:00
Bill Wendling
34182ae3ae Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM:
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
  "llvm::APFloat::IEEEsingle", referenced from:
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
  "llvm::APFloat::IEEEdouble", referenced from:
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found

This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.

llvm-svn: 60977
2008-12-13 09:28:44 +00:00
Torok Edwin
8898288749 Fix getFieldAs() to use the parameter instead of 6.
Add missing DIType constructor, needed by DIVariable::getType().

llvm-svn: 60976
2008-12-13 08:25:29 +00:00
Mon P Wang
2880dc8e8c Remove assertion to allow promotion of a truncating store operand
llvm-svn: 60975
2008-12-13 08:16:43 +00:00
Mon P Wang
da91e0e191 Added basic support for expanding VSETCC
llvm-svn: 60974
2008-12-13 08:15:14 +00:00
Chris Lattner
8753175cd6 make RLE preserve the name of the load that it replaces. This is just
a pretification of the IR.

llvm-svn: 60973
2008-12-13 07:22:47 +00:00
Duncan Sands
1faa6258eb On big-endian machines it is wrong to do a full
width register load followed by a truncating
store for the copy, since the load will not place
the value in the lower bits.  Probably partial
loads/stores can never happen here, but fix it
anyway.

llvm-svn: 60972
2008-12-13 07:18:38 +00:00
Misha Brukman
5e6eec9337 Fix spelling.
llvm-svn: 60971
2008-12-13 05:21:37 +00:00
Devang Patel
5b7938b1cc Do not print empty DW_AT_comp_dir.
llvm-svn: 60965
2008-12-12 21:57:54 +00:00
Duncan Sands
ddce2cb415 When expanding unaligned loads and stores do not make
use of illegal integer types: instead, use a stack slot
and copying via integer registers.  The existing code
is still used if the bitconvert is to a legal integer
type.

This fires on the PPC testcases 2007-09-08-unaligned.ll
and vec_misaligned.ll.  It looks like equivalent code
is generated with these changes, just permuted, but
it's hard to tell.

With these changes, nothing in LegalizeDAG produces
illegal integer types anymore.  This is a prerequisite
for removing the LegalizeDAG type legalization code.

While there I noticed that the existing code doesn't
handle trunc store of f64 to f32: it turns this into
an i64 store, which represents a 4 byte stack smash.
I added a FIXME about this.  Hopefully someone more
motivated than I am will take care of it.

llvm-svn: 60964
2008-12-12 21:47:02 +00:00
Bill Wendling
13e4a3d0b0 - Use patterns instead of creating completely new instruction matching patterns,
which are identical to the original patterns.

- Change the multiply with overflow so that we distinguish between signed and
  unsigned multiplication. Currently, unsigned multiplication with overflow
  isn't working!

llvm-svn: 60963
2008-12-12 21:15:41 +00:00
Evan Cheng
56d9fc70bd Fix add/sub expansion: don't create ADD / SUB with two results (seems like everyone is doing this these days :-). Patch by Daniel M Gessel!
llvm-svn: 60958
2008-12-12 18:49:09 +00:00
Nick Lewycky
51228d6707 Revert my re-instated reverted commit, fixes the bootstrap build on x86-64 linux.
llvm-svn: 60951
2008-12-12 17:09:07 +00:00
Duncan Sands
06ecf57a87 When using a 4 byte jump table on a 64 bit machine,
do an extending load of the 4 bytes rather than a
potentially illegal (type) i32 load followed by a
sign extend.

llvm-svn: 60945
2008-12-12 08:13:38 +00:00
Duncan Sands
9f8a7550b6 Don't make use of an illegal type (i64) when
lowering f64 function arguments.

llvm-svn: 60944
2008-12-12 08:05:40 +00:00
Mon P Wang
53d0c96c6f Added support for SELECT v8i8 v4i16 for X86 (MMX)
Added support for TRUNC v8i16 to v8i8 for X86 (MMX)

llvm-svn: 60916
2008-12-12 01:25:51 +00:00
Bill Wendling
5d026e47c1 Redo the arithmetic with overflow architecture. I was changing the semantics of
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.

Similar for SUB and MUL instructions.

llvm-svn: 60915
2008-12-12 00:56:36 +00:00
Evan Cheng
dfa19a4009 Fix a 80 col. violation.
llvm-svn: 60901
2008-12-11 22:02:02 +00:00
Nick Lewycky
312d95be37 Sneaky, sneaky: move the -1 to the outside of the SMax. Reinstate the
optimization of SGE/SLE with unit stride, now that it works properly.

llvm-svn: 60881
2008-12-11 17:40:14 +00:00
Torok Edwin
9d454874f3 fix grammar, thanks Duncan!
llvm-svn: 60875
2008-12-11 11:44:49 +00:00
Torok Edwin
34056e3cc9 introduce BasicBlock::getUniquePredecessor()
llvm-svn: 60872
2008-12-11 10:36:07 +00:00
Mon P Wang
f578029326 Avoid generating a convert_rndsat node when the src and dest type are the same.
llvm-svn: 60869
2008-12-11 03:30:13 +00:00
Bill Wendling
060f17c854 Clarify FIXME.
llvm-svn: 60867
2008-12-11 01:26:44 +00:00
Mon P Wang
80cfaeecfe Whitespace clean up (tabs with spaces)
llvm-svn: 60866
2008-12-11 00:44:22 +00:00
Mon P Wang
4448877ed7 Make fix for r60829 less conservative to allow the proper optimization for
vec_extract-sse4.ll.

llvm-svn: 60865
2008-12-11 00:26:16 +00:00
Bill Wendling
02555039a0 Add a newline after this debug output.
llvm-svn: 60861
2008-12-10 23:24:43 +00:00
Bill Wendling
292263313b If ADD, SUB, or MUL have an overflow bit that's used, don't do transformation on
them. The DAG combiner expects that nodes that are transformed have one value
result.

llvm-svn: 60857
2008-12-10 22:36:00 +00:00
Evan Cheng
fc73640f83 Preliminary ARM debug support based on patch by Mikael of FlexyCore.
llvm-svn: 60851
2008-12-10 21:54:21 +00:00
Evan Cheng
487c9ff802 Some code clean up.
llvm-svn: 60850
2008-12-10 21:49:05 +00:00
Bill Wendling
417d88be16 Only perform SETO/SETC to JO/JC conversion if extractvalue is coming from an arithmetic with overflow instruction.
llvm-svn: 60844
2008-12-10 19:44:24 +00:00
Duncan Sands
81499a8e1c For amusement, implement SADDO, SSUBO, UADDO, USUBO
for promoted integer types, eg: i16 on ppc-32, or
i24 on any platform.  Complete support for arbitrary
precision integers would require handling expanded
integer types, eg: i128, but I couldn't be bothered.

llvm-svn: 60834
2008-12-10 12:30:42 +00:00
Duncan Sands
ecb1273c5b Don't dereference the end() iterator. This was
causing a bunch of failures when running
"make ENABLE_EXPENSIVE_CHECKS=1 check".

llvm-svn: 60832
2008-12-10 09:38:36 +00:00
Mon P Wang
308879dcfc Fixed a bug when trying to optimize a extract vector element of a
bit convert that changes the number of elements of a shuffle.

llvm-svn: 60829
2008-12-10 03:59:02 +00:00
Evan Cheng
caa31a82fc Fix MachineCodeEmitter to use uintptr_t instead of intptr_t. This avoids some overflow issues. Patch by Thomas Jablin.
llvm-svn: 60828
2008-12-10 02:32:19 +00:00
Bill Wendling
d33b6dfd4f Whitespace changes.
llvm-svn: 60826
2008-12-10 02:01:32 +00:00
Evan Cheng
1264f4bc9c Fix a bug introduced by r59265. If lazy compilation is disabled, return actual function ptr instead of ptr to stub if function is already compiled.
llvm-svn: 60822
2008-12-10 01:33:59 +00:00
Chris Lattner
3987712b2d move an entry, add some notes, remove a completed item (IMPLICIT_DEF)
llvm-svn: 60821
2008-12-10 01:30:48 +00:00
Chris Lattner
e2b5854e41 Allow basicaa to walk through geps with identical indices in
parallel, allowing it to decide that P/Q must alias if A/B
must alias in things like:
 P = gep A, 0, i, 1
 Q = gep B, 0, i, 1

This allows GVN to delete 62 more instructions out of 403.gcc.

llvm-svn: 60820
2008-12-10 01:04:47 +00:00
Bill Wendling
a3b718a3c9 Whitespace fixes.
llvm-svn: 60818
2008-12-10 00:28:22 +00:00
Dan Gohman
1967880025 Update CalcLatency to work in terms of edge latencies, rather than
node latencies. Use CalcLatency instead of manual code in
CalculatePriorities to keep it consistent. Previously it
computed slightly different results.

llvm-svn: 60817
2008-12-10 00:24:36 +00:00
Evan Cheng
9419dfe08a Fix a couple of Dwarf bugs.
- Emit DW_AT_byte_size for struct and union of size zero.
- Emit DW_AT_declaration for forward type declaration.

llvm-svn: 60812
2008-12-10 00:15:44 +00:00
Scott Michel
0b5c67e1e0 CellSPU:
- Fix bug 3185, with misc other cleanups.
- Needed to implement SPUInstrInfo::InsertBranch(). CAUTION: Not sure what
  gets or needs to get passed to InsertBranch() to insert a conditional
  branch. This will abort for now until a good test case shows up.

llvm-svn: 60811
2008-12-10 00:15:19 +00:00
Bill Wendling
1c1dacdd42 Implement fast-isel conversion of a branch instruction that's branching on an
overflow/carry from the "arithmetic with overflow" intrinsics. It searches the
machine basic block from bottom to top to find the SETO/SETC instruction that is
its conditional. If an instruction modifies EFLAGS before it reaches the
SETO/SETC instruction, then it defaults to the normal instruction emission.

llvm-svn: 60807
2008-12-09 23:19:12 +00:00
Dan Gohman
036cc300ad Rewrite the SDep class, and simplify some of the related code.
The Cost field is removed. It was only being used in a very limited way,
to indicate when the scheduler should attempt to protect a live register,
and it isn't really needed to do that. If we ever want the scheduler to
start inserting copies in non-prohibitive situations, we'll have to
rethink some things anyway.

A Latency field is added. Instead of giving each node a single
fixed latency, each edge can have its own latency. This will eventually
be used to model various micro-architecture properties more accurately.

The PointerIntPair class and an internal union are now used, which
reduce the overall size.

llvm-svn: 60806
2008-12-09 22:54:47 +00:00