230 Commits

Author SHA1 Message Date
Bill Wendling
9d46fcd59c More removal of std::cerr and DEBUG, replacing with DOUT instead.
llvm-svn: 31806
2006-11-17 02:09:07 +00:00
Evan Cheng
51733ed4a3 Fixed some spiller bugs exposed by the recent two-address code changes. Now
there may be other def(s) apart from the use&def two-address operand. We need
to check if the register reuse for a use&def operand may conflicts with another
def. Provide a mean to recover from the conflict if it is detected when the
defs are processed later.

llvm-svn: 31439
2006-11-04 00:21:55 +00:00
Evan Cheng
93cdd149f7 Rename
llvm-svn: 31364
2006-11-01 23:18:32 +00:00
Evan Cheng
d8697deca3 Two-address instructions no longer have to be A := A op C. Now any pair of dest / src operands can be tied together.
llvm-svn: 31363
2006-11-01 23:06:55 +00:00
Chris Lattner
c040e53372 restore my previous patch, now that the X86 backend bug has been fixed:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20061009/038518.html

llvm-svn: 30906
2006-10-12 17:45:38 +00:00
Evan Cheng
c935741b1d Backing out Chris' last commit. It's breaking llvm-gcc bootstrapping.
It's turning:
        movl -24(%ebp), %esp
        subl $16, %esp
        movl -24(%ebp), %ecx
into
        movl -24(%ebp), %esp
        subl $16, %esp
        movl %esp, (%esp)

llvm-svn: 30902
2006-10-12 08:00:47 +00:00
Chris Lattner
86a012ab61 If we see a load from a stack slot into a physreg, consider it as providing
the stack slot.  This fixes PR943.

llvm-svn: 30898
2006-10-12 02:34:07 +00:00
Chris Lattner
13a5dcddce Fix a long-standing wart in the code generator: two-address instruction lowering
actually *removes* one of the operands, instead of just assigning both operands
the same register.  This make reasoning about instructions unnecessarily complex,
because you need to know if you are before or after register allocation to match
up operand #'s with the target description file.

Changing this also gets rid of a bunch of hacky code in various places.

This patch also includes changes to fold loads into cmp/test instructions in
the X86 backend, along with a significant simplification to the X86 spill
folding code.

llvm-svn: 30108
2006-09-05 02:12:02 +00:00
Chris Lattner
3d27be1333 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner
bdf121060c Take advantage of the recent improvements to the liveintervals set (tracking
instructions which define each value#) to simplify and improve the coallescer.
In particular, this patch:

1. Implements iterative coallescing.
2. Reverts an unsafe hack from handlePhysRegDef, superceeding it with a
   better solution.
3. Implements PR865, "coallescing" away the second copy in code like:

   A = B
   ...
   B = A

This also includes changes to symbolically print registers in intervals
when possible.

llvm-svn: 29862
2006-08-24 22:43:55 +00:00
Bill Wendling
04f2246400 Added a check so that if we have two machine instructions in this form
MOV R0, R1
    MOV R1, R0

the second machine instruction is removed. Added a regression test.

llvm-svn: 29792
2006-08-21 07:33:33 +00:00
Jim Laskey
4b49c23571 Eliminate data relocations by using NULL instead of global empty list.
llvm-svn: 29250
2006-07-21 21:15:20 +00:00
Andrew Lenharth
c496b418b5 Reduce number of exported symbols
llvm-svn: 29220
2006-07-20 17:28:38 +00:00
Chris Lattner
e097e6f7c7 Shave another 27K off libllvmgcc.dylib with visibility hidden
llvm-svn: 28973
2006-06-28 22:17:39 +00:00
Chris Lattner
10d6341618 Move some methods out of MachineInstr into MachineOperand
llvm-svn: 28102
2006-05-04 17:52:23 +00:00
Chris Lattner
fd0a5478a1 Fix a latent bug that my spiller patch last week exposed: we were leaving
instructions in the virtregfolded map that were deleted.  Because they
were deleted, newly allocated instructions could end up at the same address,
magically finding themselves in the map.  The solution is to remove entries
from the map when we delete the instructions.

llvm-svn: 28041
2006-05-01 22:03:24 +00:00
Chris Lattner
ab7dbe0cc9 When promoting a load to a reg-reg copy, where the load was a previous
instruction folded with spill code, make sure the remove the load from
the virt reg folded map.

llvm-svn: 28040
2006-05-01 21:17:10 +00:00
Chris Lattner
4dee67c2cd Remove previous patch, which wasn't quite right.
llvm-svn: 28039
2006-05-01 21:16:03 +00:00
Evan Cheng
a656242690 Remove temp. option -spiller-check-liveout, it didn't cause any failure nor performance regressions.
llvm-svn: 28029
2006-05-01 08:54:57 +00:00
Evan Cheng
f71f0f2e0b Local spiller kills a store if the folded restore is turned into a copy.
But this is incorrect if the spilled value live range extends beyond the
current BB.
It is currently controlled by a temporary option -spiller-check-liveout.

llvm-svn: 28024
2006-04-30 08:41:47 +00:00
Chris Lattner
79c50d96c9 Mapping of physregs can make it so that the designated and input physregs are
the same.  In this case, don't emit a noop copy.

llvm-svn: 28008
2006-04-28 04:43:18 +00:00
Chris Lattner
84e95d00b5 When we have a two-address instruction where the input cannot be clobbered
and is already available, instead of falling back to emitting a load, fall
back to emitting a reg-reg copy.  This generates significantly better code
for some SSE testcases, as SSE has lots of two-address instructions and
none of them are read/modify/write.  As one example, this change does:

        pshufd %XMM5, XMMWORD PTR [%ESP + 84], 255
        xorps %XMM2, %XMM5
        cmpltps %XMM1, %XMM0
-       movaps XMMWORD PTR [%ESP + 52], %XMM0
-       movapd %XMM6, XMMWORD PTR [%ESP + 52]
+       movaps %XMM6, %XMM0
        cmpltps %XMM6, XMMWORD PTR [%ESP + 68]
        movapd XMMWORD PTR [%ESP + 52], %XMM6
        movaps %XMM6, %XMM0
        cmpltps %XMM6, XMMWORD PTR [%ESP + 36]
        cmpltps %XMM3, %XMM0
-       movaps XMMWORD PTR [%ESP + 20], %XMM0
-       movapd %XMM7, XMMWORD PTR [%ESP + 20]
+       movaps %XMM7, %XMM0
        cmpltps %XMM7, XMMWORD PTR [%ESP + 4]
        movapd XMMWORD PTR [%ESP + 20], %XMM7
        cmpltps %XMM4, %XMM0

... which is far better than a store followed by a load!

llvm-svn: 28001
2006-04-28 01:46:50 +00:00
Chris Lattner
7d01f95a57 Fix a bug that Evan exposed with some changes he's making, and that was
exposed with a fastcc problem (breaking pcompress2 on x86 with -enable-x86-fastcc).

When reloading a reused reg, make sure to invalidate the reloaded reg, and
check to see if there are any other pending uses of the same register.

llvm-svn: 26369
2006-02-25 02:17:31 +00:00
Chris Lattner
28a0b8bec7 Remove debugging printout :)
Add a minor compile time win, no codegen change.

llvm-svn: 26368
2006-02-25 02:03:40 +00:00
Chris Lattner
525522e429 Refactor some code from being inline to being out in a new class with methods.
This gets rid of two gotos, which is always nice, and also adds some comments.

No functionality change, this is just a refactor.

llvm-svn: 26367
2006-02-25 01:51:33 +00:00
Jeff Cohen
57a004abfe Fix VC++ warning.
llvm-svn: 25957
2006-02-04 03:27:39 +00:00
Chris Lattner
c93403a7fb Handle another case exposed on X86.
llvm-svn: 25949
2006-02-03 23:50:46 +00:00
Chris Lattner
71d20c4e18 Fix a nasty problem on two-address machines in the following situation:
store EAX -> [ss#0]
[ss#0] += 1
...
use(EAX)

In this case, it is not valid to rewrite this as:


store EAX -> [ss#0]
EAX += 1
store EAX -> [ss#0]  ;;; this would also delete the store above
...
use(EAX)

... because EAX is not a dead at that point.  Keep track of which registers
we are allowed to clobber, and which ones we aren't, and don't clobber the
ones we're not supposed to.  :)

This should resolve the issues on X86 last night.

llvm-svn: 25948
2006-02-03 23:28:46 +00:00
Chris Lattner
507a3a7bd1 significantly simplify the VirtRegMap code by pulling the SpillSlotsAvailable
and PhysRegsAvailable maps out into a new AvailableSpills struct.  No
functionality change.

This paves the way for a bugfix, coming up next.

llvm-svn: 25947
2006-02-03 23:13:58 +00:00
Jeff Cohen
3276ff7ac6 Fix VC++ compilation error caused by using a std::map iterator variable to receive
a std::multimap iterator value.  For some reason, GCC doesn't have a problem with this.

llvm-svn: 25927
2006-02-03 03:48:54 +00:00
Chris Lattner
e18ef0d4a6 Remove move copies and dead stuff by not clobbering the result reg of a noop copy.
llvm-svn: 25926
2006-02-03 03:16:14 +00:00
Chris Lattner
774d4a190b Simplify some code
llvm-svn: 25924
2006-02-03 03:06:49 +00:00
Chris Lattner
1ef239afb4 Add code that checks for noop copies, which triggers when either:
1. a target doesn't know how to fold load/stores into copies, or
2. the spiller rewrites the input to a copy to the same register as the dest
   instead of to the reloaded reg.

This will be moved/improved in the near future, but allows elimination of
some ancient x86 hacks.  This eliminates 92 copies from SMG2000 on X86 and
163 copies from 252.eon.

llvm-svn: 25922
2006-02-03 02:02:59 +00:00
Chris Lattner
b7f24de4c8 Physregs may hold multiple stack slot values at the same time. Keep track
of this, and use it to our advantage (bwahahah).  This allows us to eliminate another
60 instructions from smg2000 on PPC (probably significantly more on X86).  A common
old-new diff looks like this:

        stw r2, 3304(r1)
-       lwz r2, 3192(r1)
        stw r2, 3300(r1)
-       lwz r2, 3192(r1)
        stw r2, 3296(r1)
-       lwz r2, 3192(r1)
        stw r2, 3200(r1)
-       lwz r2, 3192(r1)
        stw r2, 3196(r1)
-       lwz r2, 3192(r1)
+       or r2, r2, r2
        stw r2, 3188(r1)

and

-       lwz r31, 604(r1)
-       lwz r13, 604(r1)
-       lwz r14, 604(r1)
-       lwz r15, 604(r1)
-       lwz r16, 604(r1)
-       lwz r30, 604(r1)
+       or r31, r30, r30
+       or r13, r30, r30
+       or r14, r30, r30
+       or r15, r30, r30
+       or r16, r30, r30
+       or r30, r30, r30

Removal of the R = R copies is coming next...

llvm-svn: 25919
2006-02-03 00:36:31 +00:00
Chris Lattner
f3aef1b004 Fix a deficiency in the spiller that Evan noticed. In particular, consider
this code:

  store [stack slot #0],  R10
    = add R14, [stack slot #0]

The spiller didn't know that the store made the value of [stackslot#0] available
in R10 *IF* the store came from a copy instruction with the store folded into it.

This patch teaches VirtRegMap to look at these stores and recognize the values
they make available.  In one case Evan provided, this code:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
1)      movsd QWORD PTR [%ESP + 48], %XMM1
2)      movsd %XMM1, QWORD PTR [%ESP + 48]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

turns into:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

In this case, instruction #2 was removed because of the value made
available by #1, and inst #1 was later deleted because it is now
never used before the stack slot is redefined by #3.

This occurs here and there in a lot of code with high spilling, on PPC
most of the removed loads/stores are LSU-reject-causing loads, which is
nice.

On X86, things are much better (because it spills more), where we nuke
about 1% of the instructions from SMG2000 and several hundred from eon.

More improvements to come...

llvm-svn: 25917
2006-02-02 23:29:36 +00:00
Chris Lattner
bb53acd03c Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo,a far more logical place. Other methods should also be moved if anyoneis interested. :)
llvm-svn: 25913
2006-02-02 20:12:32 +00:00
Chris Lattner
de02d7727f Add explicit #includes of <iostream>
llvm-svn: 25515
2006-01-22 23:41:00 +00:00
Chris Lattner
0511055276 Add an assertion, update DefInst even though no one uses it (dangling pointers
don't help anyone)

llvm-svn: 25081
2006-01-04 06:47:48 +00:00
Chris Lattner
fabe55f155 Fix the LLC regressions on X86 last night. In particular, when undoing
previous copy elisions and we discover we need to reload a register, make
sure to use the regclass of the original register for the reload, not the
class of the current register.  This avoid using 16-bit loads to reload 32-bit
values.

llvm-svn: 23645
2005-10-06 17:19:06 +00:00
Chris Lattner
55149d7835 Fix a bug in the local spiller, where we could take code like this:
store r12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = load [ss#2]
  R4 = load [ss#1]

and turn it into this code:

  store R12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = R12
  R4 = R3    <- oops!

The problem was that promoting R3 = load[ss#2] to a copy missed the fact that
the instruction invalidated R3 at that point.

llvm-svn: 23638
2005-10-05 18:30:19 +00:00
Chris Lattner
5a6199f387 Change this code ot pass register classes into the stack slot spiller/reloader
code.  PrologEpilogInserter hasn't been updated yet though, so targets cannot
use this info.

llvm-svn: 23536
2005-09-30 01:29:00 +00:00
Chris Lattner
2f838f2192 Teach the local spiller to turn stack slot loads into register-register copies
when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388
2005-09-19 06:56:21 +00:00
Chris Lattner
1410003751 Use continue in the use-processing loop to make it clear what the early exits
are, simplify logic, and cause things to not be nested as deeply.  This also
uses MRI->areAliases instead of an explicit loop.

No functionality change, just code cleanup.

llvm-svn: 23296
2005-09-09 20:29:51 +00:00
Misha Brukman
835702a094 Remove trailing whitespace
llvm-svn: 21420
2005-04-21 22:36:52 +00:00
Chris Lattner
6a6056e93d Make sure to notice that explicit physregs are used in the function
llvm-svn: 21084
2005-04-04 21:35:34 +00:00
Chris Lattner
ae09d93b35 Update these register allocators to set the PhysRegUsed info in MachineFunction.
llvm-svn: 19791
2005-01-23 22:45:13 +00:00
Chris Lattner
0ad02bdd3d Improve compatibility with acc
llvm-svn: 19549
2005-01-14 15:54:24 +00:00
Chris Lattner
c8b07dd339 Clean up the MachineBasicBlock.h file, percolating #includes into this file.
Patch contributed by Morten Ofstad

llvm-svn: 17251
2004-10-26 15:35:58 +00:00
Chris Lattner
2152236351 This patch fixes the nasty bug that caused 175.vpr to fail for X86 last night.
The problem occurred when trying to reload this instruction:

MOV32mr %reg2326, 8, %reg2297, 4, %reg2295

The value of reg2326 was available in EBX, so it was reused from there, instead
of reloading it into EDX.

The value of reg2297 was available in EDX, so it was reused from there, instead
of reloading it into EDI.

The value of reg2295 was not available, so we tried reloading it into EBX, its
assigned register.  However, we checked and saw that we already reloaded
something into EBX, so we chose what reg2326 was assigned to (EDX) and reloaded
into that register instead.

Unfortunately EDX had already been used by reg2297, so reloading into EDX
clobbered the value used by the reg2326 operand, breaking the program.

The fix for this is to check that the newly picked register is ok.  In this
case we now find that EDX is already used and try using EDI, which succeeds.

llvm-svn: 17006
2004-10-15 03:19:31 +00:00
Chris Lattner
9af0572a37 This patch adds and improves debugging output. No functionality changes.
llvm-svn: 17005
2004-10-15 03:16:29 +00:00