Commit Graph

431 Commits

Author SHA1 Message Date
Rafael Espindola
4509ec42b8 AnalyzeBranch doesn't change which successors a bb has, just the order
we try to branch to them.

Before we were creating successor lists with duplicated entries. Fixing that
found a bug in isBlockOnlyReachableByFallthrough that would causes it to
return the wrong answer for

-----------
...
jne foo
jmp bar

foo:
----------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132882 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-12 03:20:32 +00:00
Cameron Zwarich
aaa5f14d7c Fix an issue where the two-address conversion pass incorrectly rewrites untied
operands to an early clobber register. This fixes <rdar://problem/9566076>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132738 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-07 23:54:00 +00:00
Jakob Stoklund Olesen
5f2316a3b5 Switch AllocationOrder to using RegisterClassInfo instead of a BitVector
of reserved registers.

Use RegisterClassInfo in RABasic as well. This slightly changes som
allocation orders because RegisterClassInfo puts CSR aliases last.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132581 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-03 20:34:53 +00:00
Stuart Hastings
88882247d2 Since I can't reproduce the failures from 131261, re-trying with a
simplified version.  <rdar://problem/9298790>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131274 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-13 00:51:54 +00:00
Stuart Hastings
8ad145d729 Revert 131266 and 131261 due to buildbot complaints.
rdar://problem/9298790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131269 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-13 00:15:17 +00:00
Stuart Hastings
4c576ca9db Tweak 131261 (thumb2-cbnz.ll) to generate the intended cbnz.
rdar://problem/9298790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131266 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-13 00:10:03 +00:00
Stuart Hastings
5adc646380 Non-fast-isel followup to 129634; correctly handle branches controlled
by non-CMP expressions.  The executable test case (129821) would test
this as well, if we had an "-O0 -disable-arm-fast-isel" LLVM-GCC
tester.  Alas, the ARM assembly would be very difficult to check with
FileCheck.

The thumb2-cbnz.ll test is affected; it generates larger code (tst.w
vs. cmp #0), but I believe the new version is correct.
rdar://problem/9298790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131261 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-12 23:36:41 +00:00
Eli Friedman
5e926ac651 Re-revert r130877; it's apparently causing a regression on 197.parser,
possibly related to cbnz formation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130977 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-06 05:23:07 +00:00
Eli Friedman
baf717a08a Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases.
Original message:

Teach MachineCSE how to do simple cross-block CSE involving physregs.  This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130877 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 22:10:36 +00:00
Eli Friedman
24d4c9911e Back out r130862; it appears to be breaking bootstrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130867 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 20:48:42 +00:00
Eli Friedman
49cec1d818 Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130862 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 19:54:24 +00:00
Jakob Stoklund Olesen
96169b189c Fix more register and coalescing dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130859 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 19:02:11 +00:00
Jakob Stoklund Olesen
28e104bcb0 Explicitly request physreg coalesing for a bunch of Thumb2 unit tests.
These tests all follow the same pattern:

	mov	r2, r0
	movs	r0, #0
	$CMP	r2, r1
	it	eq
	moveq	r0, #1
	bx	lr

The first 'mov' can be eliminated by rematerializing 'movs r0, #0' below the
test instruction:

	$CMP	r0, r1
	mov.w	r0, #0
	it	eq
	moveq	r0, #1
	bx	lr

So far, only physreg coalescing can do that. The register allocators won't yet
split live ranges just to eliminate copies. They can learn, but this particular
problem is not likely to show up in real code. It only appears because r0 is
used for both the function argument and return value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130858 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 19:02:07 +00:00
Jakob Stoklund Olesen
d5b679c8ce Weekly fix of register allocation dependent unit tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130567 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-30 01:37:52 +00:00
Andrew Trick
d49ffe8284 Teach Thumb2 isel to fold and->rotr ==> ROR.
Generalization of Nate Begeman's patch!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130502 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-29 14:18:15 +00:00
Andrew Trick
e02a1501e7 Combine thumb2-ror tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-29 14:02:41 +00:00
Evan Cheng
554daa67bd Be careful about scheduling nodes above previous calls. It increase usages of
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessen register pressure.

Call operands also has added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.

rdar://9329627


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130245 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-26 21:31:35 +00:00
Benjamin Kramer
a42a757176 Make tests more useful.
lit needs a linter ...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130126 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-25 10:12:01 +00:00
Andrew Trick
83eb906906 Accidental function name mangling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130050 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 04:08:15 +00:00
Andrew Trick
1c3af779fc Thumb2 and ARM add/subtract with carry fixes.
Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the
assembly printer correctly prints the 's' suffix.

Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.

Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
Fixes ARM SBC lowering to check for live carry (potential bug).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 03:55:32 +00:00
Andrew Trick
5adfba283d whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130046 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 03:24:11 +00:00
Evan Cheng
db6cbe1ff1 In Thumb2 mode, lower frame indix references to:
add <rd>, sp, #<imm8>
ldr <rd>, [sp, #<imm8>]
When the offset from sp is multiple of 4 and in range of 0-1020.
This saves code size by utilizing 16-bit instructions.

rdar://9321541


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129971 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-22 01:42:52 +00:00
Andrew Trick
87896d9368 Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.

Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129421 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 00:38:32 +00:00
Chris Lattner
b99e000d79 fix two completely broken tests, which were matching due to PR9629.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129195 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 06:34:38 +00:00
Jakob Stoklund Olesen
c3178f85b5 Fix Thumb and Thumb2 tests to be register allocator independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128690 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 23:31:50 +00:00
Eric Christopher
29aeed1bf8 Fix the bfi handling for or (and a mask) (and b mask). We need the two
masks to match inversely for the code as is to work. For the example given
we actually want:

bfi r0, r2, #1, #1

not #0, however, given the way the pattern is written it's not possible
at the moment.

Fixes rdar://9177502


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-26 01:21:03 +00:00
Cameron Zwarich
899eaa3569 Roll r127459 back in:
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 21:52:04 +00:00
Daniel Dunbar
950d3db5f4 Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get
created from the", it broke some GCC test suite tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127477 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 19:30:30 +00:00
Cameron Zwarich
592ca3fda9 Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127459 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 04:54:27 +00:00
Bob Wilson
782b576749 Move a test that ended up in the wrong place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124933 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-05 04:15:50 +00:00
Evan Cheng
53519f015e Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123991 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21 18:55:51 +00:00
Andrew Trick
d1dace8aea Enable support for precise scheduling of the instruction selection
DAG. Disable using "-disable-sched-cycles".

For ARM, this enables a framework for modeling the cpu pipeline and
counting stalls. It also activates several heuristics to drive
scheduling based on the model. Scheduling is inherently imprecise at
this stage, and until spilling is improved it may defeat attempts to
schedule. However, this framework provides greater control over
tuning codegen.

Although the flag is not target-specific, it should have very little
affect on the default scheduler used by x86. The only two changes that
affect x86 are:
- scheduling a high-latency operation bumps the current cycle so independent
  operations can have their latency covered. i.e. two independent 4
  cycle operations can produce results in 4 cycles, not 8 cycles.
- Two operations with equal register pressure impact and no
  latency-based stalls on their uses will be prioritized by depth before height
  (height is irrelevant if no stalls occur in the schedule below this point).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123971 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21 06:19:05 +00:00
Bob Wilson
5e8b833707 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07 04:59:04 +00:00
Bob Wilson
4711d5cda3 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121730 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 23:02:37 +00:00
Evan Cheng
a9688c4b57 (or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121606 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-11 04:11:38 +00:00
Jim Grosbach
c6f9261711 ARM stm/ldm instructions require more than one register in the register list.
Otherwise, a plain str/ldr should be used instead. Make sure we account for
that in prologue/epilogue code generation.
rdar://8745460

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121391 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-09 18:31:13 +00:00
Bob Wilson
c24130bade The Thumb tADDrSPi instruction is not valid when the destination is SP.
Check for that and try narrowing it to tADDspi instead.  Radar 8724703.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120892 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-04 04:40:19 +00:00
Jim Grosbach
41ad0c4c73 When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the
32-bit wide version by adding the .w suffix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120838 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-03 20:33:01 +00:00
Owen Anderson
9d63d90de5 Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120589 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 19:18:46 +00:00
Evan Cheng
ab5c703fdb Fix epilogue codegen to avoid leaving the stack pointer in an invalid
state. Previously Thumb2 would restore sp from fp like this:
mov sp, r7
sub, sp, #4
If an interrupt is taken after the 'mov' but before the 'sub', callee-saved
registers might be clobbered by the interrupt handler. Instead, try
restoring directly from sp:
add sp, #4
Or, if necessary (with VLA, etc.) use a scratch register to compute sp and
then restore it:
sub.w r4, r7, #8
mov sp, r7
rdar://8465407


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119977 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-22 18:12:04 +00:00
Eric Christopher
8b3ca6216d Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 19:40:05 +00:00
Dale Johannesen
8abe08d7f9 These tests are looking for library function names that
appear to differ on Linux.  Try to make them pass on Linux.
Would be good for a Linux person to review this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119572 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 21:57:32 +00:00
Evan Cheng
c4af4638df Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 20:13:28 +00:00
Evan Cheng
8239daf7c8 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-03 00:45:17 +00:00
Jim Grosbach
ab3d00e535 Revert r114340 (improvements in Darwin function prologue/epilogue), as it broke
assumptions about stack layout. Specifically, LR must be saved next to FP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118026 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-02 17:35:25 +00:00
Bob Wilson
f74a429816 Overhaul memory barriers in the ARM backend. Radar 8601999.
There were a number of issues to fix up here:
* The "device" argument of the llvm.memory.barrier intrinsic should be
used to distinguish the "Full System" domain from the "Inner Shareable"
domain.  It has nothing to do with using DMB vs. DSB instructions.
* The compiler should never need to emit DSB instructions.  Remove the
ARMISD::SYNCBARRIER node and also remove the instruction patterns for DSB.
* Merge the separate DMB/DSB instructions for options only used for the
disassembler with the default DMB/DSB instructions.  Add the default
"full system" option ARM_MB::SY to the ARM_MB::MemBOpt enum.
* Add a separate ARMISD::MEMBARRIER_MCR node for subtargets that implement
a data memory barrier using the MCR instruction.
* Fix up encodings for these instructions (except MCR).
I also updated the tests and added a few new ones to check for DMB options
that were not currently being exercised.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117756 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-30 00:54:37 +00:00
Evan Cheng
089751535d Avoiding overly aggressive latency scheduling. If the two nodes share an
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117675 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-29 18:09:28 +00:00
Evan Cheng
134982daa9 More accurate estimate / tracking of register pressure.
- Initial register pressure in the loop should be all the live defs into the
  loop. Not just those from loop preheader which is often empty.
- When an instruction is hoisted, update register pressure from loop preheader
  to the original BB.
- Treat only use of a virtual register as kill since the code is still SSA.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116956 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-20 22:03:58 +00:00
Dale Johannesen
e4d31593c5 Fix crash introduced in 116852. 8573915.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116955 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-20 22:03:37 +00:00
Dale Johannesen
575cd148ce Enable using vdup for vector constants which are splat of
integers by default, and remove the controlling flag, now
that LICM will hoist such vdup's.  8003375.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116852 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 20:00:17 +00:00
Evan Cheng
2312842de0 Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116845 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 18:58:51 +00:00
Daniel Dunbar
9869413802 Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116816 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 17:14:24 +00:00
Evan Cheng
11e8b74a7a - Add a hook for target to determine whether an instruction def is
"long latency" enough to hoist even if it may increase spilling. Reloading
  a value from spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 00:55:07 +00:00
Bob Wilson
7d24705f65 Change register allocation order for ARM VFP and NEON registers to put the
callee-saved registers at the end of the lists.  Also prefer to avoid using
the low registers that are in register subclasses required by certain
instructions, so that those registers will more likely be available when needed.
This change makes a huge improvement in spilling in some cases.  Thanks to
Jakob for helping me realize the problem.

Most of this patch is fixing the testsuite.  There are quite a few places
where we're checking for specific registers.  I changed those to wildcards
in places where that doesn't weaken the tests.  The spill-q.ll and
thumb2-spill-q.ll tests stopped spilling with this change, so I added a bunch
of live values to force spills on those tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116055 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-08 06:15:13 +00:00
Owen Anderson
8614167572 Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that this makes
irrelevant, but add a new test for the new, improved functionality.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114494 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 22:51:46 +00:00
Jim Grosbach
1dc335a79f Simplify ARM callee-saved register handling by removing the distinction
between the high and low registers for prologue/epilogue code. This was
a Darwin-only thing that wasn't providing a realistic benefit anymore.
Combining the save areas simplifies the compiler code and results in better
ARM/Thumb2 codegen.

For example, previously we would generate code like:
        push    {r4, r5, r6, r7, lr}
        add     r7, sp, #12
        stmdb   sp!, {r8, r10, r11}
With this change, we combine the register saves and generate:
        push    {r4, r5, r6, r7, r8, r10, r11, lr}
        add     r7, sp, #12

rdar://8445635



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114340 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-20 19:32:20 +00:00
Jim Grosbach
e6be85e9ff Teach the (non-MC) instruction printer to use the cannonical names for push/pop,
and shift instructions on ARM. Update the tests to match.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114230 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-17 22:36:38 +00:00
Jim Grosbach
1aaf4cb393 Move thumb2 tests to the thumb2 directory
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114206 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-17 20:34:09 +00:00
Evan Cheng
3ef1c8759a Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 01:29:16 +00:00
Bob Wilson
0f1e9457a5 Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from
the VST pseudos.  The VLD/VST scheduling still needs work (see pr6722), but
at least we shouldn't confuse the loads with the stores.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113473 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-09 05:40:26 +00:00
Jim Grosbach
65482b1bb8 Re-apply r112883:
"For ARM stack frames that utilize variable sized objects and have either
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs."

r112986 fixed a latent bug exposed by the above.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112989 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-03 18:37:12 +00:00
Daniel Dunbar
6a8700301c Revert "For ARM stack frames that utilize variable sized objects and have either", it is breaking oggenc with Clang for ARMv6.
This reverts commit 8d6e29cfda270be483abf638850311670829ee65.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112962 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-03 15:26:42 +00:00
Jim Grosbach
1755b3964f For ARM stack frames that utilize variable sized objects and have either
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs.

rdar://7352504
rdar://8374540
rdar://8355680



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112883 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-02 22:29:01 +00:00
Jim Grosbach
e7c1416263 Now that register allocation properly considers reserved regs, simplify the
ARM register class allocation order functions to take advantage of that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112841 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-02 18:14:29 +00:00
Chris Lattner
5bcb8a6112 temporarily revert r112664, it is causing a decoding conflict, and
the testcases should be merged.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112711 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-01 16:00:50 +00:00
Bill Wendling
43a6c5e2fc We have a chance for an optimization. Consider this code:
int x(int t) {
  if (t & 256)
    return -26;
  return 0;
}

We generate this:

     tst.w   r0, #256
     mvn     r0, #25
     it      eq
     moveq   r0, #0

while gcc generates this:

     ands    r0, r0, #256
     it      ne
     mvnne   r0, #25
     bx      lr

Scandalous really!

During ISel time, we can look for this particular pattern. One where we have a
"MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND
instruction to 0. Something like this (greatly simplified):

  %r0 = ISD::AND ...
  ARMISD::CMPZ %r0, 0         @ sets [CPSR]
  %r0 = ARMISD::MOVCC 0, -26  @ reads [CPSR]

All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR]
when it's zero. The zero value will all ready be in the %r0 register and we only
need to change it if the AND wasn't zero. Easy!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112664 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-31 22:41:22 +00:00
Bob Wilson
7a9ef44b3b Add alignment arguments to all the NEON load/store intrinsics.
Update all the tests using those intrinsics and add support for
auto-upgrading bitcode files with the old versions of the intrinsics.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112271 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-27 17:13:24 +00:00
Daniel Dunbar
3cc3283fcb ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed
comparison that would overflow.
 - The other under/overflow cases can't actually happen because the immediates
   which would trigger them are legal (so we don't enter this code), but
   adjusted the style to make it clear the transform is always valid.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112053 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-25 16:58:05 +00:00
Bob Wilson
f955f290c9 Change ARM PKHTB and PKHBT instructions to use a shift_imm operand to avoid
printing "lsl #0".  This fixes the remaining parts of pr7792.  Make
corresponding changes for encoding/decoding these instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111251 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-17 17:23:19 +00:00
Bob Wilson
dc66edaced Generalize a pattern for PKHTB: an SRL of 16-31 bits will guarantee
that the high halfword is zero.  The shift need not be exactly 16 bits.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111196 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-16 22:26:55 +00:00
Bob Wilson
b05b80160a Convert test to FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111195 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-16 22:21:13 +00:00
Bob Wilson
703af3ab12 Temporarily disable tail calls on ARM to work around some linker problems.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111050 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-13 22:43:33 +00:00
Jim Grosbach
b5aa11f2d6 fix silly typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110831 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 17:32:46 +00:00
Jim Grosbach
7166e622d7 Add a target triple, as the runtime library invocation varies a bit by
platform. It's apparently "bl __muldf3" on linux, for example. Since that's
not what we're checking here, it's more robust to just force a triple. We
just wwant to check that the inline FP instructions are only generated
on cpus that have them."



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110830 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 17:31:12 +00:00
Dan Gohman
3cc5d13f58 Temporarily disable some failing tests, until they can be
properly investigated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110825 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 16:36:07 +00:00
Jim Grosbach
fcba5e6b64 cortex m4 has floating point support, but only single precision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110810 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 15:44:15 +00:00
Evan Cheng
7b4d31176e Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110798 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 07:17:46 +00:00
Evan Cheng
11db068721 - Add subtarget feature -mattr=+db which determine whether an ARM cpu has the
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
  instructions).
- Added tests for memory barrier codegen.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 06:22:01 +00:00
Evan Cheng
ac096808a3 Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object.
Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110707 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-10 19:30:19 +00:00
Jim Grosbach
6ccfc507dc Many Thumb2 instructions can reference the full ARM register set (i.e.,
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0,r9,r0,sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109842 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-30 02:41:01 +00:00
Dale Johannesen
f630c712b1 Implement vector constants which are splat of
integers with mov + vdup.  8003375.  This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109799 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-29 20:10:08 +00:00
Jim Grosbach
f27ca42552 update tests for smarter BIC usage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108846 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-20 16:16:48 +00:00
Jim Grosbach
5423856e44 Add combiner patterns to more effectively utilize the BFI (bitfield insert)
instruction for non-constant operands. This includes the case referenced
in the README.txt regarding a bitfield copy.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108608 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-17 03:30:54 +00:00
Jim Grosbach
469bbdb597 Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction
and a combine pattern to use it for setting a bit-field to a constant
value. More to come for non-constant stores.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-16 23:05:05 +00:00
Jim Grosbach
502e0aa628 Improve 64-subtraction of immediates when parts of the immediate can fit
in the literal field of an instruction. E.g.,
long long foo(long long a) {
  return a - 734439407618LL;
}

rdar://7038284



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108339 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-14 17:45:16 +00:00
Bob Wilson
7a52e65ca6 Fix test to appease the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108334 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-14 16:43:47 +00:00
Bob Wilson
a0148c360e Print "dregpair" NEON operands with a space between them, for readability and
consistency with other instructions that have lists of register operands.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107944 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-09 00:47:20 +00:00
Dale Johannesen
7835f1fcdb Changes to ARM tail calls, mostly cosmetic.
Add explicit testcases for tail calls within the same module.
Duplicate some code to humor those who think .w doesn't apply on ARM.
Leave this disabled on Thumb1, and add some comments explaining why it's hard
and won't gain much.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107851 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-08 01:18:23 +00:00
Evan Cheng
c36b5b9730 PR7503: uxtb16 is not available for ARMv7-M. Patch by Brian G. Lucas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107122 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-29 05:38:36 +00:00
Bob Wilson
8eab75f390 Reapply my if-conversion cleanup from svn r106939 with fixes.
There are 2 changes relative to the previous version of the patch:

1) For the "simple" if-conversion case, there's no need to worry about
RemoveExtraEdges not handling an unanalyzable branch.  Predicated terminators
are ignored in this context, so RemoveExtraEdges does the right thing.
This might break someday if we ever treat indirect branches (BRIND) as
predicable, but for now, I just removed this part of the patch, because
in the case where we do not add an unconditional branch, we rely on keeping
the fall-through edge to CvtBBI (which is empty after this transformation).

The change relative to the previous patch is:

@@ -1036,10 +1036,6 @@
     IterIfcvt = false;
   }
 
-  // RemoveExtraEdges won't work if the block has an unanalyzable branch,
-  // which is typically the case for IfConvertSimple, so explicitly remove
-  // CvtBBI as a successor.
-  BBI.BB->removeSuccessor(CvtBBI->BB);
   RemoveExtraEdges(BBI);
 
   // Update block info. BB can be iteratively if-converted.


2) My patch exposed a bug in the code for merging the tail of a "diamond",
which had previously never been exercised.  The code was simply checking that
the tail had a single predecessor, but there was a case in
MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was
neither edge of the diamond.  I added the following change to check for
that:

@@ -1276,7 +1276,18 @@
   // tail, add a unconditional branch to it.
   if (TailBB) {
     BBInfo TailBBI = BBAnalysis[TailBB->getNumber()];
-    if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) {
+    bool CanMergeTail = !TailBBI.HasFallThrough;
+    // There may still be a fall-through edge from BBI1 or BBI2 to TailBB;
+    // check if there are any other predecessors besides those.
+    unsigned NumPreds = TailBB->pred_size();
+    if (NumPreds > 1)
+      CanMergeTail = false;
+    else if (NumPreds == 1 && CanMergeTail) {
+      MachineBasicBlock::pred_iterator PI = TailBB->pred_begin();
+      if (*PI != BBI1->BB && *PI != BBI2->BB)
+        CanMergeTail = false;
+    }
+    if (CanMergeTail) {
       MergeBlocks(BBI, TailBBI);
       TailBBI.IsDone = true;
     } else {

With these fixes, I was able to run all the SingleSource and MultiSource
tests successfully.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107110 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-29 00:55:23 +00:00
Bob Wilson
de4fe23139 Revert my if-conversion cleanup since it caused a bunch of nightly test
regressions.

--- Reverse-merging r106939 into '.':
U    test/CodeGen/Thumb2/thumb2-ifcvt3.ll
U    lib/CodeGen/IfConversion.cpp


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106951 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-26 17:47:06 +00:00
Eli Friedman
56df4a079b Remove bogus test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106941 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-26 04:59:56 +00:00
Bob Wilson
ccd9bcca14 Clean up some problems with extra CFG edges being introduced during
if-conversion.  The RemoveExtraEdges function doesn't work for blocks that
end with unanalyzable branches, so in those cases, the "extra" edges must
be explicitly removed.  The CopyAndPredicateBlock and MergeBlocks methods
can also avoid copying successor edges due to branches that have already
been removed.  The latter case is especially helpful when MergeBlocks is
called for handling "diamond" if-conversions, where otherwise you can end
up with some weird intermediate states in the CFG.  Unfortunately I've
been unable to find cases where this cleanup actually makes a significant
difference in the code.  There is one test where we manage to remove an
empty block at the end of a function.  Radar 6911268.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106939 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-26 04:27:33 +00:00
Bob Wilson
789fef987f PR7458: Try commuting Thumb2 instruction operands to put them into 2-address
form so they can be narrowed to 16-bit instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106762 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-24 16:50:20 +00:00
Dan Gohman
102f3851bb Eliminate the first have of the optimization which eliminates BRCOND
when the condition is constant. This optimization shouldn't be
necessary, because codegen shouldn't be able to find dead control
paths that the IR-level optimizer can't find. And it's undesirable,
because it encourages bugpoint to leave "br i1 false" branches
in its output. And it wasn't updating the CFG.

I updated all the tests I could, but some tests are too reduced
and I wasn't able to meaningfully preserve them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106748 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-24 15:04:11 +00:00
Evan Cheng
4d54e5b2dd Tail merging pass shall not break up IT blocks. rdar://8115404
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106517 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-22 01:18:16 +00:00
Evan Cheng
859df5e9f9 Fix a crash caused by dereference of MBB.end(). rdar://8110842
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106399 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-20 00:54:38 +00:00
Evan Cheng
0110ac66eb Disable sibcall optimization for Thumb1 for now since Thumb1RegisterInfo::emitEpilogue is not expecting them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106368 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-19 01:01:32 +00:00
Evan Cheng
96c3da6436 Move ARM if-conversion before post-ra scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106355 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 23:32:07 +00:00
Evan Cheng
86050dc8cc Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
  scheduler. If-converter now runs branch folding / tail merging first to
  maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since IT mask has been
  finalized). It also ensures no other instructions can be scheduled between
  instructions in the IT block.

This is not yet enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106344 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 23:09:54 +00:00
Jakob Stoklund Olesen
0465bcffbb TwoAddressInstructionPass::CoalesceExtSubRegs can insert INSERT_SUBREG
instructions, but it doesn't really understand live ranges, so the first
INSERT_SUBREG uses an implicitly defined register.

Fix it in LiveVariableAnalysis by adding the <undef> flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106333 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 22:29:44 +00:00
Evan Cheng
6a5e2832d0 Fix an inverted condition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106330 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 22:17:13 +00:00
Dale Johannesen
c66cdf74a9 Enable tail calls on ARM by default, with some
basic tests.

This has been well tested on Darwin but not elsewhere.
It should work provided the linker correctly resolves
  B.W  <label in other function>
which it has not seen before, at least from llvm-based
compilers.  I'm leaving the arm-tail-calls switch in
until I see if there's any problems because of that;
it might need to be disabled for some environments.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106299 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 19:00:18 +00:00
Rafael Espindola
1e81966626 Remove arm_apcscc from the test files. It is the default and doing this
matches what llvm-gcc and clang now produce.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106221 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-17 15:18:27 +00:00
Jakob Stoklund Olesen
c66d0f2a93 Allow a register to be redefined multiple times in a basic block.
LiveVariableAnalysis was a bit picky about a register only being redefined once,
but that really isn't necessary.

Here is an example of chained INSERT_SUBREGs that we can handle now:

68      %reg1040<def> = INSERT_SUBREG %reg1040, %reg1028<kill>, 14
                register: %reg1040 +[70,134:0)
76      %reg1040<def> = INSERT_SUBREG %reg1040, %reg1029<kill>, 13
                register: %reg1040 replace range with [70,78:1) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,134:0)  0@78-(134) 1@70-(78)
84      %reg1040<def> = INSERT_SUBREG %reg1040, %reg1030<kill>, 12
                register: %reg1040 replace range with [78,86:2) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,134:0)  0@86-(134) 1@70-(78) 2@78-(86)
92      %reg1040<def> = INSERT_SUBREG %reg1040, %reg1031<kill>, 11
                register: %reg1040 replace range with [86,94:3) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,94:3)[94,134:0)  0@94-(134) 1@70-(78) 2@78-(86) 3@86-(94)

rdar://problem/8096390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106152 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-16 21:29:40 +00:00
Evan Cheng
46df4eb46e Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106091 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-16 07:35:02 +00:00
Rafael Espindola
2ebb4f81f7 Remove the arm_aapcscc marker from the tests. It is the default
for the linux targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106029 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-15 19:04:29 +00:00
Jakob Stoklund Olesen
40d07bbebb Add CoalescerPair helper class.
Given a copy instruction, CoalescerPair can determine which registers to
coalesce in order to eliminate the copy. It deals with all the subreg fun to
determine a tuple (DstReg, SrcReg, SubIdx) such that:

- SrcReg is a virtual register that will disappear after coalescing.
- DstReg is a virtual or physical register whose live range will be extended.
- SubIdx is 0 when DstReg is a physical register.
- SrcReg can be joined with DstReg:SubIdx.

CoalescerPair::isCoalescable() determines if another copy instruction is
compatible with the same tuple. This fixes some NEON miscompilations where
shuffles are getting coalesced as if they were copies.

The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy
later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105997 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-15 16:04:21 +00:00
Dale Johannesen
73c943fb43 More tail call removal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105485 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-04 21:14:24 +00:00
Dale Johannesen
da8dd92c9f Remove tail call. A tail call version will follow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105438 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-04 00:03:37 +00:00
Dale Johannesen
127e5244a0 Remove tail call to preserve this test. A tail
call version will follow.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105422 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-03 21:57:48 +00:00
Dale Johannesen
9452ce30bc Make this test not use tail calls. A tail call
version will follow.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105419 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-03 21:53:01 +00:00
Bob Wilson
bb7ecb2bf5 Thumb2 RSBS instructions were being printed without the 'S' suffix.
Fix it by changing the T2I_rbin_s_is multiclass to handle the CPSR
output and 'S' suffix in the same way as T2I_bin_s_irs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104531 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-24 18:44:06 +00:00
Bob Wilson
be751cfe9c Recognize more BUILD_VECTORs and VECTOR_SHUFFLEs that can be implemented by
copying VFP subregs.  This exposed a bunch of dead code in the *spill-q.ll
tests, so I tweaked those tests to keep that code from being optimized away.
Radar 7872877.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104415 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-22 00:23:12 +00:00
Evan Cheng
9085f98b32 t2LEApcrel and tLEApcrel are re-materializable. This makes it possible to hoist more loads during machine LICM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104115 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-19 07:28:01 +00:00
Jim Grosbach
4b77f6a85a Clean up the conditional for handling of sign_extend_inreg based on
whether the extract instructions are available.

rdar://7956878



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103277 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-07 18:34:55 +00:00
Jim Grosbach
29402132f3 Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack
instructions to subtarget features and update tests to reflect.
PR5717.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103136 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-05 23:44:43 +00:00
Jim Grosbach
bc1c98d538 fix copy/paste oops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103122 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-05 21:07:46 +00:00
Jim Grosbach
3a548e717f Add tests for ARMV7M divide instruction use
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103120 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-05 20:47:15 +00:00
Dan Gohman
30fc5bbfd1 Fix a bug which prevented tail merging of return instructions in
beneficial cases. See the changes in test/CodeGen/X86/tail-opts.ll and
test/CodeGen/ARM/ifcvt2.ll for details.

The fix is to change HashEndOfMBB to hash at most one instruction,
instead of trying to apply heuristics about when it will be profitable to
consider more than one instruction. The regular tail-merging heuristics
are already prepared to handle the same cases, and they're more precise.

Also, make test/CodeGen/ARM/ifcvt5.ll and
test/CodeGen/Thumb2/thumb2-branch.ll slightly more complex so that they
continue to test what they're intended to test.

And, this eliminates the problem in
test/CodeGen/Thumb2/2009-10-15-ITBlockBranch.ll, the testcase from
PR5204. Update it accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102907 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-03 14:35:47 +00:00
Bob Wilson
5dfa87ecc6 Handle register-to-register copies within the tGPR class.
Radar 7896289


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102396 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-26 23:20:08 +00:00
Jim Grosbach
3a1287b470 Update ARM DAGtoDAG for matching UBFX instruction for unsigned bitfield
extraction. This fixes PR5998.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102144 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-22 23:24:18 +00:00
Evan Cheng
30fdb5c2ac - Clean up some crappy code which deals with coalescing of copies which look at
extract_subreg / insert_subreg, etc.
- Add support for more aggressive insert_subreg coalescing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101971 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-21 00:44:22 +00:00
Dan Gohman
9f23dee08c Start function numbering at 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101638 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-17 16:29:15 +00:00
Evan Cheng
3a1588a2e3 Use default lowering of DYNAMIC_STACKALLOC. As far as I can tell, ARM isle is doing the right thing and codegen looks correct for both Thumb and Thumb2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101410 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-15 22:20:34 +00:00
Evan Cheng
0ea7d219ec ARM SelectDYN_ALLOC should emit a copy from SP rather than referencing SP directly. In cases where there are two dyn_alloc in the same BB it would have caused the old SP value to be reused and badness ensues. rdar://7493908
llvm is generating poor code for dynamic alloca, I'll fix that later.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101383 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-15 18:42:28 +00:00
Jim Grosbach
7ec7a0e96b switch the flag for using NEON for SP floating point to a subtarget 'feature'.
Re-commit. This time complete with testsuite updates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-25 23:47:34 +00:00
Johnny Chen
9e08876a2a Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm
instructions to help disassembly.

We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>.  See, for example, A8.6.57/58/60.

And modified test cases to not expect '+' in +reg or #+num.  For example,

; CHECK:       ldr.w	r9, [r7, #28]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98745 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-17 17:52:21 +00:00
Bob Wilson
49d9dc4dd2 --- Reverse-merging r98637 into '.':
U    test/CodeGen/ARM/tls2.ll
U    test/CodeGen/ARM/arm-negative-stride.ll
U    test/CodeGen/ARM/2009-10-30.ll
U    test/CodeGen/ARM/globals.ll
U    test/CodeGen/ARM/str_pre-2.ll
U    test/CodeGen/ARM/ldrd.ll
U    test/CodeGen/ARM/2009-10-27-double-align.ll
U    test/CodeGen/Thumb2/thumb2-strb.ll
U    test/CodeGen/Thumb2/ldr-str-imm12.ll
U    test/CodeGen/Thumb2/thumb2-strh.ll
U    test/CodeGen/Thumb2/thumb2-ldr.ll
U    test/CodeGen/Thumb2/thumb2-str_pre.ll
U    test/CodeGen/Thumb2/thumb2-str.ll
U    test/CodeGen/Thumb2/thumb2-ldrh.ll
U    utils/TableGen/TableGen.cpp
U    utils/TableGen/DisassemblerEmitter.cpp
D    utils/TableGen/RISCDisassemblerEmitter.h
D    utils/TableGen/RISCDisassemblerEmitter.cpp
U    Makefile.rules
U    lib/Target/ARM/ARMInstrNEON.td
U    lib/Target/ARM/Makefile
U    lib/Target/ARM/AsmPrinter/ARMInstPrinter.cpp
U    lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp
U    lib/Target/ARM/AsmPrinter/ARMInstPrinter.h
D    lib/Target/ARM/Disassembler
U    lib/Target/ARM/ARMInstrFormats.td
U    lib/Target/ARM/ARMAddressingModes.h
U    lib/Target/ARM/Thumb2ITBlockPass.cpp


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98640 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-16 16:59:47 +00:00
Johnny Chen
d30a98e43a Initial ARM/Thumb disassembler check-in. It consists of a tablgen backend
(RISCDisassemblerEmitter) which emits the decoder functions for ARM and Thumb,
and the disassembler core which invokes the decoder function and builds up the
MCInst based on the decoded Opcode.

Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm
instructions to help disassembly.

We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>.  See, for example, A8.6.57/58/60.

And modified test cases to not expect '+' in +reg or #+num.  For example,

; CHECK:       ldr.w	r9, [r7, #28]


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98637 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-16 16:36:54 +00:00
Bob Wilson
ea7f22c31d Stop using the old pre-UAL syntax for LDM/STM instruction suffixes.
This does not move entirely to UAL syntax, since the default "increment after"
suffix is empty but we still use "IA" for that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98635 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-16 16:19:07 +00:00
Bob Wilson
7cfb6d373a Add a testcase for the change in r98586.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98610 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-16 05:33:29 +00:00
Evan Cheng
fac4f1f181 Enable machine cse pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98132 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-10 03:07:41 +00:00
Bob Wilson
f5fd499791 Fix a crash compiling 254.gap for Thumb2. The Thumb2 add/sub with 12-bit
immediate instructions cannot set the condition codes, so they do not have
the extra cc_out operand.  We hit an assertion during tail duplication
because the instruction being duplicated had more operands that expected.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98001 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-08 22:56:15 +00:00
Johnny Chen
267124cff2 Drop the ".w" qualifier for t2UXTB16* instructions as there is no 16-bit version
of either sxtb16 or uxtb16, and the unified syntax does not specify ".w".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97760 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-04 22:24:41 +00:00
Jakob Stoklund Olesen
657baec0af Create a stack frame on ARM when
- Function uses all scratch registers AND
- Function does not use any callee saved registers AND
- Stack size is too big to address with immediate offsets.

In this case a register must be scavenged to calculate the address of a stack
object, and the scavenger needs a spare register or emergency spill slot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97071 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-24 22:43:17 +00:00
Jim Grosbach
f9a4b7653d LowerCall() should always do getCopyFromReg() to reference the stack pointer.
Machine instruction selection is much happier when operands are in virtual
registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97012 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-24 01:43:03 +00:00
Bob Wilson
f1b0a345d3 Last week we were generating code with duplicate induction variables in this
test, but the problem seems to have gone away today.  Add a check to make sure
it doesn't come back.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96277 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-15 21:56:40 +00:00
Bob Wilson
bf9b221c00 Besides removing phi cycles that reduce to a single value, also remove dead
phi cycles.  Adjust a few tests to keep dead instructions from being optimized
away.  This (together with my previous change for phi cycles) fixes Apple
radar 7627077.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96057 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-13 00:31:44 +00:00
Dan Gohman
572645cf84 Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95975 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-12 10:34:29 +00:00
Bob Wilson
fe61fb1e10 Add a new pass on machine instructions to optimize away PHI cycles that
reduce down to a single value.  InstCombine already does this transformation
but DAG legalization may introduce new opportunities.  This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized.  I measured the compile time 
impact of this (running llc on 176.gcc) and it was not significant.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95951 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-12 01:30:21 +00:00
Jakob Stoklund Olesen
4a540f0593 Reapply coalescer fix for better cross-class coalescing.
This time with fixed test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95938 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-11 23:55:29 +00:00
Bob Wilson
e6373eb826 Handle AddrMode6 (for NEON load/stores) in Thumb2's rewriteT2FrameIndex.
Radar 7614112.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95456 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-06 00:24:38 +00:00
Dan Gohman
7979b72feb Revert LoopStrengthReduce.cpp to pre-r94061 for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94123 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-22 00:46:49 +00:00
Dan Gohman
a10756ee65 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94061 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-21 02:09:26 +00:00
Jakob Stoklund Olesen
35f0febcb6 Remove predicates when changing an add into an unpredicable mov.
Since the mov is executed unconditionally, make sure that the add didn't have
any predicate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93909 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-19 21:08:28 +00:00
Bob Wilson
516ab96de3 Run the pre-register allocation tail duplication pass by default. Remove
the -pre-regalloc-taildup command-line option, and add a new
-disable-early-taildup option.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93597 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-16 00:29:50 +00:00
Jakob Stoklund Olesen
d1862037f0 Add comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92883 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-07 00:51:04 +00:00
Jakob Stoklund Olesen
30ac0467ce Add Target hook to duplicate machine instructions.
Some instructions refer to unique labels, and so cannot be trivially cloned
with CloneMachineInstr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92873 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-06 23:47:07 +00:00
Dan Gohman
aceba31b7a Delete useless trailing semicolons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92740 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-05 17:55:26 +00:00