4674 Commits

Author SHA1 Message Date
Evan Cheng
f8b1257d2e Add a quick and dirty "loop aligner pass". x86 uses it to align its loops to 16-byte boundaries.
llvm-svn: 47703
2008-02-28 00:43:03 +00:00
Dale Johannesen
8039d40b76 Handle load/store of misaligned vectors that are the
same size as an int type by doing a bitconvert of
load/store of the int type (same algorithm as floating point).
This makes them work for ppc Altivec.  There was some
code that purported to handle loads of (some) vectors
by splitting them into two smaller vectors, but getExtLoad
rejects subvector loads, so this could never have worked;
the patch removes it.

llvm-svn: 47696
2008-02-27 22:36:00 +00:00
Evan Cheng
da92e34fe3 Fix a bug in dead spill slot elimination.
llvm-svn: 47687
2008-02-27 19:57:11 +00:00
Dan Gohman
16ba74da61 Remove the `else', at Evan's insistence.
llvm-svn: 47686
2008-02-27 19:44:57 +00:00
Duncan Sands
0139087442 Add a FIXME about the VECTOR_SHUFFLE evil hack.
llvm-svn: 47676
2008-02-27 17:39:13 +00:00
Duncan Sands
77ff6715ad LegalizeTypes support for EXTRACT_VECTOR_ELT. The
approach taken is different to that in LegalizeDAG
when it is a question of expanding or promoting the
result type: for example, if extracting an i64 from
a <2 x i64>, when i64 needs expanding, it bitcasts
the vector to <4 x i32>, extracts the appropriate
two i32's, and uses those for the Lo and Hi parts.
Likewise, when extracting an i16 from a <4 x i16>,
and i16 needs promoting, it bitcasts the vector to
<2 x i32>, extracts the appropriate i32, twiddles
the bits if necessary, and uses that as the promoted
value.  This puts more pressure on bitcast legalization,
and I've added the appropriate cases.  They needed to
be added anyway since users can generate such bitcasts
too if they want to.  Also, when considering various
cases (Legal, Promote, Expand, Scalarize, Split) it is
a pain that expand can correspond to Expand, Scalarize
or Split, so I've changed the LegalizeTypes enum so it
lists those different cases - now Expand only means
splitting a scalar in two.
The code produced is the same as by LegalizeDAG for
all relevant testcases, except for
2007-10-31-extractelement-i64.ll, where the code seems
to have improved (see below; can an expert please tell
me if it is better or not).
Before < vs after >.

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
<       movaps  %xmm0, (%esp)
<       movl    4(%esp), %eax
<       movl    %eax, 28(%esp)
<       movl    (%esp), %eax
<       movl    %eax, 24(%esp)
<       movq    24(%esp), %mm0
<       movq    %mm0, 56(%esp)
---
>       subl    $44, %esp
>       movaps  %xmm0, 16(%esp)
>       pshufd  $1, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movd    %xmm0, (%esp)
>       movq    (%esp), %mm0
>       movq    %mm0, 8(%esp)

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
<       movaps  %xmm0, (%esp)
<       movl    12(%esp), %eax
<       movl    %eax, 28(%esp)
<       movl    8(%esp), %eax
<       movl    %eax, 24(%esp)
<       movq    24(%esp), %mm0
<       movq    %mm0, 56(%esp)
---
>       subl    $44, %esp
>       movaps  %xmm0, 16(%esp)
>       pshufd  $3, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movhlps %xmm0, %xmm0
>       movd    %xmm0, (%esp)
>       movq    (%esp), %mm0
>       movq    %mm0, 8(%esp)

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
---
>       subl    $44, %esp

<       movl    16(%esp), %eax
<       movl    %eax, 48(%esp)
<       movl    20(%esp), %eax
<       movl    %eax, 52(%esp)
<       movaps  %xmm0, (%esp)
<       movl    4(%esp), %eax
<       movl    %eax, 60(%esp)
<       movl    (%esp), %eax
<       movl    %eax, 56(%esp)
---
>       pshufd  $1, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movd    %xmm0, (%esp)
>       movd    %xmm1, 12(%esp)
>       movd    %xmm0, 8(%esp)

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
---
>       subl    $44, %esp

<       movl    24(%esp), %eax
<       movl    %eax, 48(%esp)
<       movl    28(%esp), %eax
<       movl    %eax, 52(%esp)
<       movaps  %xmm0, (%esp)
<       movl    12(%esp), %eax
<       movl    %eax, 60(%esp)
<       movl    8(%esp), %eax
<       movl    %eax, 56(%esp)
---
>       pshufd  $3, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movhlps %xmm0, %xmm0
>       movd    %xmm0, (%esp)
>       movd    %xmm1, 12(%esp)
>       movd    %xmm0, 8(%esp)

llvm-svn: 47672
2008-02-27 13:34:40 +00:00
Duncan Sands
4c1f3c0c5b LegalizeTypes support for legalizing the mask
operand of a VECTOR_SHUFFLE.  The mask is a
vector of constant integers.  The code in
LegalizeDAG doesn't bother to legalize the
mask, since it's basically just storage for
a bunch of constants, however LegalizeTypes
is more picky.  The problem is that there may
not exist any legal vector-of-integers type
with a legal element type, so it is impossible
to create a legal mask!  Unless of course you
cheat by creating a BUILD_VECTOR where the
operands have a different type to the element
type of the vector being built...  This is
pretty ugly but works - all relevant tests in
the testsuite pass, and produce the same
assembler with and without LegalizeTypes.

llvm-svn: 47670
2008-02-27 13:03:44 +00:00
Duncan Sands
cadfe810f3 LegalizeTypes support for INSERT_VECTOR_ELT.
llvm-svn: 47669
2008-02-27 10:18:23 +00:00
Evan Cheng
295ae42ede Don't track max alignment during stack object allocations since they can be deleted later. Let PEI compute it.
llvm-svn: 47668
2008-02-27 10:04:56 +00:00
Duncan Sands
f8ac836240 Support for legalizing MEMBARRIER.
llvm-svn: 47667
2008-02-27 08:53:44 +00:00
Bill Wendling
2cae66e28b Final de-tabification.
llvm-svn: 47663
2008-02-27 06:33:05 +00:00
Evan Cheng
7553230e3a Spiller now remove unused spill slots.
llvm-svn: 47657
2008-02-27 03:04:06 +00:00
Dan Gohman
2042802d32 Teach Legalize how to expand an EXTRACT_ELEMENT.
llvm-svn: 47656
2008-02-27 01:52:30 +00:00
Dan Gohman
938e74654b Convert the last remaining users of the non-APInt form of
ComputeMaskedBits to use the APInt form, and remove the
non-APInt form.

llvm-svn: 47654
2008-02-27 01:23:58 +00:00
Dan Gohman
689d8cac04 Convert SimplifyDemandedMask and ShrinkDemandedConstant to use APInt.
Change several cases in SimplifyDemandedMask that don't ever do any
simplifying to reuse the logic in ComputeMaskedBits instead of
duplicating it.

llvm-svn: 47648
2008-02-27 00:25:32 +00:00
Chris Lattner
6318b4aee9 Use a smallvector for inactiveCounts and initialize it lazily
instead of init'ing it maximally to zeros on entry.  getFreePhysReg
is pretty hot and only a few elements are typically used.  This speeds
up linscan by 5% on 176.gcc.

llvm-svn: 47631
2008-02-26 22:08:41 +00:00
Bill Wendling
8fb166bf6c Rename PrintableName to Name.
llvm-svn: 47629
2008-02-26 21:47:57 +00:00
Bill Wendling
50f5c4be14 Change "Name" to "AsmName" in the target register info. Gee, a refactoring tool
would have been a Godsend here!

llvm-svn: 47625
2008-02-26 21:11:01 +00:00
Evan Cheng
701b6a1dc3 Enable -coalescer-commute-instrs by default.
llvm-svn: 47623
2008-02-26 20:40:22 +00:00
Dan Gohman
8a8f3fe7e0 Avoid aborting on invalid shift counts.
llvm-svn: 47612
2008-02-26 18:50:50 +00:00
Chris Lattner
1a461075ef Fix PR2096, a regression introduced with my patch last night. This
also fixes cfrac, flops, and 175.vpr

llvm-svn: 47605
2008-02-26 17:09:59 +00:00
Duncan Sands
c63bc1577a Fix a nasty bug in LegalizeTypes (spotted in
CodeGen/PowerPC/illegal-element-type.ll): suppose
a node X is processed, and processing maps it to
a node Y.  Then X continues to exist in the DAG,
but with no users.  While processing some other
node, a new node may be created that happens to
be equal to X, and thus X will be reused rather
than a truly new node.  This can cause X to
"magically reappear", and since it is in the
Processed state in will not be reprocessed, so
at the end of type legalization the illegal node
X can still be present.  The solution is to replace
X with Y whenever X gets resurrected like this.

llvm-svn: 47601
2008-02-26 11:21:42 +00:00
Bill Wendling
af80fae2a7 De-tabify.
llvm-svn: 47598
2008-02-26 10:51:52 +00:00
Evan Cheng
8e99554e84 This is possible:
vr1 = extract_subreg vr2, 3
...
vr3 = extract_subreg vr1, 2
The end result is vr3 is equal to vr2 with subidx 2.

llvm-svn: 47592
2008-02-26 08:03:41 +00:00
Chris Lattner
5b4101cf68 Fix isNegatibleForFree to not return true for ConstantFP nodes
after legalize.  Just because a constant is legal (e.g. 0.0 in SSE) 
doesn't mean that its negated value is legal (-0.0).  We could make
this stronger by checking to see if the negated constant is actually
legal post negation, but it doesn't seem like a big deal.

llvm-svn: 47591
2008-02-26 07:04:54 +00:00
Evan Cheng
40c26c71c0 Refactor inline asm constraint matching code out of SDIsel into TargetLowering.
llvm-svn: 47587
2008-02-26 02:33:44 +00:00
Dan Gohman
afd0e4bad3 Make some static variables const.
llvm-svn: 47566
2008-02-25 21:39:34 +00:00
Dan Gohman
012abf0109 Convert MaskedValueIsZero and all its users to use APInt. Also add
a SignBitIsZero function to simplify a common use case.

llvm-svn: 47561
2008-02-25 21:11:39 +00:00
Evan Cheng
17c9a98b59 All remat'ed loads cannot be folded into two-address code. Not just argument loads. This change doesn't really have any impact on codegen.
llvm-svn: 47557
2008-02-25 19:24:01 +00:00
Duncan Sands
440b8ac3a0 In debug builds check that the key property holds: all
result and operand types are legal.

llvm-svn: 47546
2008-02-25 16:21:21 +00:00
Evan Cheng
fc540545f1 Correctly determine whether a argument load can be folded into its uses.
llvm-svn: 47545
2008-02-25 08:50:41 +00:00
Duncan Sands
2cdf47bdf7 Add support to LegalizeTypes for building legal vectors
out of illegal elements (BUILD_VECTOR).  Uses and beefs
up BUILD_PAIR, though it didn't really have to.  Like
most of LegalizeTypes, does not support soft-float.
This cures all "make check" vector building failures.

llvm-svn: 47537
2008-02-24 07:36:03 +00:00
Bill Wendling
a369a6add8 Some platforms use the same name for 32-bit and 64-bit registers (like
%r3 on PPC) in their ASM files. However, it's hard for humans to read
during debugging. Adding a new field to the register data that lets you
specify a different name to be printed than the one that goes into the
ASM file -- %x3 instead of %r3, for instance.

llvm-svn: 47534
2008-02-24 00:56:13 +00:00
Evan Cheng
d299f09bc5 Rematerialization logic was overly conservative when it comes to loads from fixed stack slots.
llvm-svn: 47529
2008-02-23 03:38:34 +00:00
Evan Cheng
2de70b3ff8 If remating a machine instr with virtual register operand, make sure the vr is avaliable at all uses regardless of whether it would be folded.
llvm-svn: 47526
2008-02-23 02:14:42 +00:00
Evan Cheng
95d3cb841d Recognize loads of arguments as re-materializable first. Therefore if isReallyTriviallyReMaterializable() returns true it doesn't confuse it as a "normal" re-materializable instruction.
llvm-svn: 47520
2008-02-23 01:44:27 +00:00
Evan Cheng
166cb23f62 Fix spill weight updating bug.
llvm-svn: 47507
2008-02-23 00:33:04 +00:00
Evan Cheng
bb645b395c Same isPhysRegAvailable bug as local register allocator.
llvm-svn: 47500
2008-02-22 20:31:32 +00:00
Evan Cheng
b8e7eb2b1b Really really bad local register allocator bug. On X86, it was never using ESI, EDI, and EBP because of a bug in RALocal::isPhysRegAvailable(). For example, when
it checks if ESI is available, it then looks at registers aliases to ESI. SIL is marked -2 (not allocatable) but isPhysRegAvailable() incorrectly assumes it is in use and returns false for ESI.

llvm-svn: 47499
2008-02-22 20:30:53 +00:00
Evan Cheng
e24db258fe Add debugging printfs.
llvm-svn: 47496
2008-02-22 19:57:06 +00:00
Evan Cheng
fadafa2109 Make sure reload of implicit uses are issued before remat's.
llvm-svn: 47492
2008-02-22 19:22:06 +00:00
Dale Johannesen
a96eb3a1d8 Pass alignment on ByVal parameters, from FE, all
the way through.  It is now used for codegen.

llvm-svn: 47484
2008-02-22 17:49:45 +00:00
Evan Cheng
fa73e0c64e Enable re-materialization of instructions which have virtual register operands if
the definition of the operand also reaches its uses.

llvm-svn: 47475
2008-02-22 09:24:50 +00:00
Evan Cheng
e16e349623 Fix compiler warning.
llvm-svn: 47468
2008-02-22 01:48:00 +00:00
Dan Gohman
de80982418 Fix a regression in 403.gcc and 186.crafty introduced in 47383. To test
that a value is >= 32, check that all of the high bits are zero, not
just one or more.

llvm-svn: 47467
2008-02-22 01:12:31 +00:00
Chris Lattner
b3c8d120dc Make the clobber analysis a bit more smart: we only are careful about
early clobbers if the clobber list contains a *register* not some thing
like {memory}, {dirflag} etc.

llvm-svn: 47457
2008-02-21 20:54:31 +00:00
Chris Lattner
4f87f1c087 Treat clobber operands like early clobbers: if we have
any, we force sdisel to do all regalloc for an asm.  This
leads to gross but correct codegen.

This fixes the rest of PR2078.

llvm-svn: 47454
2008-02-21 19:43:13 +00:00
Bill Wendling
27dcf967b0 Clear PhysRegPartUse for the sub register as well.
llvm-svn: 47453
2008-02-21 19:35:27 +00:00
Bill Wendling
82f9e2d468 Adjust the MaxAlignment for the special register scavenging spill slot.
llvm-svn: 47452
2008-02-21 19:33:53 +00:00
Evan Cheng
8072166220 Help testing.
llvm-svn: 47448
2008-02-21 19:20:21 +00:00