15931 Commits

Author SHA1 Message Date
Chandler Carruth
2817fc1e53 Fix ValueTracking to conclude that debug intrinsics are safe to
speculate. Without this, loop rotate (among many other places) would
suddenly stop working in the presence of debug info. I found this
looking at loop rotate, and have augmented its tests with a reduction
out of a very hot loop in yacr2 where failing to do this rotation costs
sometimes more than 10% in runtime performance, perturbing numerous
downstream optimizations.

This should have no impact on performance without debug info, but the
change in performance when debug info is enabled can be extreme. As
a consequence (and this how I got to this yak) any profiling of
performance problems should be treated with deep suspicion -- they may
have been wildly innacurate of debug info was enabled for profiling. =/
Just a heads up.

llvm-svn: 154263
2012-04-07 19:22:18 +00:00
Benjamin Kramer
a690750db9 SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW.
Found by inspection.

llvm-svn: 154262
2012-04-07 17:19:26 +00:00
Alexis Hunt
03ad83efbd Make the test for r154235 more platform-independent with a shorter
string.

llvm-svn: 154243
2012-04-07 01:33:14 +00:00
Alexis Hunt
5c14769849 Output UTF-8-encoded characters as identifier characters into assembly
by default.

This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching to default seems
reasonable.

I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.

llvm-svn: 154235
2012-04-07 00:37:53 +00:00
Akira Hatanaka
5cce394620 Add lines in global-address.ll to test N32 and N64 code generation.
llvm-svn: 154202
2012-04-06 20:23:36 +00:00
Jakob Stoklund Olesen
bb7b631def Allow negative immediates in ARM and Thumb2 compares.
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.

llvm-svn: 154183
2012-04-06 17:45:04 +00:00
Chandler Carruth
020a15db9d Sink the collection of return instructions until after *all*
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.

llvm-svn: 154179
2012-04-06 17:21:31 +00:00
Chandler Carruth
352f98dd1e Tweak this test to ensure the inliner did indeed fire. Thanks to Richard
Smith for pointing this out in review.

llvm-svn: 154178
2012-04-06 17:21:28 +00:00
Craig Topper
40ac46c3d7 Test case for PR12413
llvm-svn: 154172
2012-04-06 14:38:25 +00:00
Craig Topper
ffae2f8986 Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
llvm-svn: 154166
2012-04-06 07:45:23 +00:00
Craig Topper
a8657716ac Add the tests that were supposed to go with r153935 that I forgot svn add
llvm-svn: 154165
2012-04-06 07:09:59 +00:00
Chandler Carruth
bd8f18f828 Actually finish this sentence in the comment the way I intended. Thanks
Matt for pointing this out.

llvm-svn: 154158
2012-04-06 01:19:38 +00:00
Chandler Carruth
dc52b30dac Sink the return instruction collection until after we're done deleting
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus poniter to a return instruction that blows up
much further down the road.

It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.

Thanks to John Regehr's numerous test cases for teasing this out.

llvm-svn: 154157
2012-04-06 01:11:52 +00:00
Jim Grosbach
2169e1d55c ARM assembly aliases for add negative immediates using sub.
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.

rdar://11192734

llvm-svn: 154123
2012-04-05 20:57:13 +00:00
Akira Hatanaka
f3ec345016 Reapply test case in 154038, this time with triple to prevent the backend
from emitting gp_rel relocation.

llvm-svn: 154122
2012-04-05 20:44:35 +00:00
Eric Christopher
2e17b32e69 Patch to set is_stmt a little better for prologue lines in a function.
This enables debuggers to see what are interesting lines for a
breakpoint rather than any line that starts a function.

rdar://9852092

llvm-svn: 154120
2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen
28edb011c4 Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

llvm-svn: 154119
2012-04-05 20:30:20 +00:00
Dan Gohman
a5e2200b2a Fix accidentally inverted logic from r152803, and make the
testcase slightly less trivial. This fixes rdar://11171718.

llvm-svn: 154118
2012-04-05 20:27:21 +00:00
Silviu Baranga
f376e00699 Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions.
llvm-svn: 154101
2012-04-05 16:19:29 +00:00
Silviu Baranga
1c2668f700 Added support for handling unpredictable arithmetic instructions on ARM.
llvm-svn: 154100
2012-04-05 16:13:15 +00:00
James Molloy
3604b95957 An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases.
Noticed while investigating PR12274!

llvm-svn: 154090
2012-04-05 10:01:12 +00:00
Jim Grosbach
5d11d38750 ARM assembly aliases for two-operand V[R]SHR instructions.
rdar://11189467

llvm-svn: 154087
2012-04-05 07:23:53 +00:00
Jim Grosbach
64f4e8d5b3 ARM assembly parsing for 'msr' plain 'cpsr' operand.
Plain 'cpsr' is an alias for 'cpsr_fc'.

rdar://11153753

llvm-svn: 154080
2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen
e1ae4f161c Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

llvm-svn: 154079
2012-04-05 03:10:56 +00:00
Akira Hatanaka
e5ea70212f Reapply 154038 without the failing test.
llvm-svn: 154062
2012-04-04 22:16:36 +00:00
Owen Anderson
f6f930a990 Revert r154038. It was causing make check failures.
llvm-svn: 154054
2012-04-04 21:18:58 +00:00
Akira Hatanaka
4df2267566 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.

llvm-svn: 154038
2012-04-04 19:02:38 +00:00
Akira Hatanaka
c8028e2551 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154034
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen
0419ed395c Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

llvm-svn: 154033
2012-04-04 18:23:42 +00:00
Akira Hatanaka
913d78a99c Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154031
2012-04-04 18:22:53 +00:00
Hongbin Zheng
2a8f0cf400 Add testcase for r154007, when a function has the optsize attribute,
the loop should be unrolled according the value of OptSizeUnrollThreshold.

llvm-svn: 154014
2012-04-04 13:24:40 +00:00
Rafael Espindola
88a1aeb123 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Michael J. Spencer
2f9beb2374 Add YAML parser to Support.
llvm-svn: 153977
2012-04-03 23:09:22 +00:00
Pete Cooper
4164f86b8a Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
llvm-svn: 153976
2012-04-03 22:57:55 +00:00
Eric Christopher
53ef0cf4a5 Fix thinko check for number of operands to be the one that actually
might have more than 19 operands. Add a testcase to make sure I
never screw that up again.

Part of rdar://11026482

llvm-svn: 153961
2012-04-03 17:55:42 +00:00
Nadav Rotem
d72bf636aa Add an additional testcase which checks ops with multiple users.
llvm-svn: 153939
2012-04-03 07:39:36 +00:00
Craig Topper
ce6c05e0df Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Akira Hatanaka
c5bbe0b434 Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler.
llvm-svn: 153926
2012-04-03 03:01:13 +00:00
Akira Hatanaka
cecb440c11 Revert r153924. There were buildbot failures.
llvm-svn: 153925
2012-04-03 02:51:09 +00:00
Akira Hatanaka
058b0cfb55 MIPS disassembler support.
Patch by Vladimir Medic.

llvm-svn: 153924
2012-04-03 02:20:58 +00:00
Jakob Stoklund Olesen
97f47c37b6 Allocate virtual registers in ascending order.
This is just the fallback tie-breaker ordering, the main allocation
order is still descending size.

Patch by Shamil Kurmangaleev!

llvm-svn: 153904
2012-04-02 22:30:39 +00:00
Lang Hames
dbc3175c89 During two-address lowering, rescheduling an instruction does not untie
operands. Make TryInstructionTransform return false to reflect this.
Fixes PR11861.

llvm-svn: 153892
2012-04-02 19:58:43 +00:00
Rafael Espindola
40e34629cb No need to run llvm-as.
llvm-svn: 153890
2012-04-02 19:44:20 +00:00
Akira Hatanaka
f37a1c4323 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.

llvm-svn: 153889
2012-04-02 19:25:22 +00:00
Stepan Dyatkovskiy
0ddc03ebad Fast fix for PR12343:
http://llvm.org/bugs/show_bug.cgi?id=12343

We have not trivial way for splitting edges that are goes from indirect branch. We can do it with some tricks, but it should be additionally discussed. And it is still dangerous due to difficulty of indirect branches controlling.

Fix forbids this case for unswitching.

llvm-svn: 153879
2012-04-02 17:16:45 +00:00
Silviu Baranga
77d372b45e Added fix in TableGen instruction decoder generation. The decoder now breaks for every leaf node.
llvm-svn: 153874
2012-04-02 15:20:39 +00:00
Nadav Rotem
a9ec0e024f Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Hal Finkel
1c045f6845 Enable prefetch generation on PPC64.
llvm-svn: 153851
2012-04-01 20:08:17 +00:00
Nadav Rotem
2729f54295 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Hal Finkel
fd26145bc6 Add instruction itinerary for the PPC64 A2 core.
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.

llvm-svn: 153842
2012-04-01 19:22:40 +00:00