Commit Graph

90190 Commits

Author SHA1 Message Date
Arnold Schwaighofer
77e4a47e9b ARM cost model: Fix cost of fptrunc and fpext instructions
A vector fptrunc and fpext simply gets split into scalar instructions.

radar://13192358

llvm-svn: 177159
2013-03-15 15:10:47 +00:00
Hal Finkel
2ecb85412e Protect PPC Altivec patterns with a predicate
In preparation for the addition of other SIMD ISA extensions (such as QPX) we
need to make sure that all Altivec patterns are properly predicated on having
Altivec support.

No functionality change intended (one test case needed to be updated b/c it
assumed that Altivec intrinsics would be supported without enabling Altivec
support).

llvm-svn: 177152
2013-03-15 13:21:21 +00:00
Alexey Samsonov
2b83cbf9ab Fixup for r176933: more careful setup of path to llvm-symbolizer
llvm-svn: 177144
2013-03-15 07:27:49 +00:00
Craig Topper
c3c590b1bb Use NumBaseBits in a few more places in SmallBitVector instead of recalculating it. No functional change.
llvm-svn: 177142
2013-03-15 06:01:42 +00:00
Rafael Espindola
5529d68693 Fix the FDE encoding to be relative on ELF.
This is a very late complement to r130637 which fixed this on x86_64. Fixes
pr15448.

Since it looks like that every elf architecture uses this encoding when using
cfi, make it the default for elf. Just exclude mips64el. It has a lovely
.ll -> .o test (ef_frame.ll) that tests that nothing changes in the binary
content of the .eh_frame produced by llc. Oblige it.

llvm-svn: 177141
2013-03-15 05:51:57 +00:00
Hal Finkel
503a3723d1 Allocate the RS spill slot for any PPC function with spills and a large stack frame
For spills into a large stack frame, the FI-elimination code uses the register
scavenger to obtain a free GPR for use with an r+r-addressed load or store.
When there are no available GPRs, the scavenger gets one by using its spill
slot. Previously, we were not always allocating that spill slot and the RS
would assert when the spill slot was needed.

I don't currently have a small test that triggered the assert, but I've
created a small regression test that verifies that the spill slot is now
added when the stack frame is sufficiently large.

llvm-svn: 177140
2013-03-15 05:06:04 +00:00
Eric Christopher
a925e8b410 Turn anonymous type in anonymous union warning back on after cleaning up
issues.

llvm-svn: 177136
2013-03-15 00:43:00 +00:00
Eric Christopher
867ab49c5e Silence anonymous type in anonymous union warnings.
llvm-svn: 177135
2013-03-15 00:42:55 +00:00
Nadav Rotem
6500dc0dd2 Add a triple to the test.
llvm-svn: 177131
2013-03-15 00:10:23 +00:00
Nadav Rotem
03b60b8657 Unaligned loads should use the VMOVUPS opcode.
llvm-svn: 177130
2013-03-14 23:49:44 +00:00
David Blaikie
45229b3f0a Remove some unused variables to clean the Clang -Werror build
(these were added in r177089)

llvm-svn: 177129
2013-03-14 23:11:07 +00:00
Akira Hatanaka
70e291e997 [mips] Set isAllocatable bit of unallocatable register classes to 0.
llvm-svn: 177128
2013-03-14 23:09:19 +00:00
Andrew Trick
aab9978611 Fix r177112: Add ProcResGroup.
This is the other half of r177122 that I meant to commit at the same time.

llvm-svn: 177123
2013-03-14 22:47:01 +00:00
Jakob Stoklund Olesen
4d174fdcd0 Prepare for adding InstrSchedModel annotations to X86 instructions.
The new InstrSchedModel is easier to use than the instruction
itineraries. It will be used to model instruction latency and throughput
in modern Intel microarchitectures like Sandy Bridge.

InstrSchedModel should be able to coexist with instruction itinerary
classes, but for cleanliness we should switch the Atom processor model
to the new InstrSchedModel as well.

llvm-svn: 177122
2013-03-14 22:42:17 +00:00
Reed Kotler
6959b26da1 Add a new method which enables one to change register classes.
See the Mips16ISetLowering.cpp patch to see a use of this.
For now now the extra code in Mips16ISetLowering.cpp is a nop but is
used for test purposes. Mips32 registers are setup and then removed and
then the Mips16 registers are setup. 

Normally you need to add register classes and then call
computeRegisterProperties.

llvm-svn: 177120
2013-03-14 22:02:09 +00:00
Arnold Schwaighofer
6cbcb5af81 LoopVectorizer: Insert some white space to make test case more readable
Also remove some unneeded function attributes.

llvm-svn: 177114
2013-03-14 21:31:09 +00:00
Chad Rosier
aca0d2a5a0 [fast-isel] The X86FastISel::FastLowerArguments function doesn't properly handle
the win64 calling convention.
rdar://13423768

llvm-svn: 177113
2013-03-14 21:25:04 +00:00
Andrew Trick
d133b52150 MachineModel: Add a ProcResGroup class.
This allows abitrary groups of processor resources. Using something in
a subset automatically counts againts the superset. Currently, this
only works if the superset is also a ProcResGroup as opposed to a
SuperUnit.

This allows SandyBridge to be expressed naturally, which will be
checked in shortly.

def SBPort01 : ProcResGroup<[SBPort0, SBPort1]>;
def SBPort15 : ProcResGroup<[SBPort1, SBPort5]>;
def SBPort23  : ProcResGroup<[SBPort2, SBPort3]>;
def SBPort015 : ProcResGroup<[SBPort0, SBPort1, SBPort5]>;

llvm-svn: 177112
2013-03-14 21:21:50 +00:00
Hal Finkel
c0eead2c60 Move estimateStackSize from ARM into MachineFrameInfo
This is a generic function (derived from PEI); moving it into
MachineFrameInfo eliminates a current redundancy between the ARM and AArch64
backends, and will allow it to be used by the PowerPC target code.

No functionality change intended.

llvm-svn: 177111
2013-03-14 21:15:20 +00:00
Hal Finkel
a2a564fb2e Provide the register scavenger to processFunctionBeforeFrameFinalized
Add the current PEI register scavenger as a parameter to the
processFunctionBeforeFrameFinalized callback.

This change is necessary in order to allow the PowerPC target code to
set the register scavenger frame index after the save-area offset
adjustments performed by processFunctionBeforeFrameFinalized. Only
after these adjustments have been made is it possible to estimate
the size of the stack frame.

llvm-svn: 177108
2013-03-14 20:33:40 +00:00
Hal Finkel
e71925b570 Use frame-index scavenging for PPC register spilling
Make requiresFrameIndexScavenging return true, and create virtual registers in
the spilling code instead of using the register scavenger directly. This makes
the target-level code simpler, and importantly, delays the scavenging until
after callee-saved register processing (which will be important for later
changes).

Also cleans up trackLivenessAfterRegAlloc (makes it inline in the header with
the other related functions). This makes it clear that it always returns true.

No functionality change intended.

llvm-svn: 177107
2013-03-14 20:21:47 +00:00
Hal Finkel
37a5522734 Not all PPC functions with a frame pointer need a RS spill slot
We used to add a spill slot for the register scavenger whenever the function
has a frame pointer. This is unnecessarily conservative: We may need the spill
slot for dynamic stack allocations, and functions with dynamic stack
allocations always have a FP, but we might also have a FP for other reasons
(such as the user explicitly disabling frame-pointer elimination), and we don't
necessarily need a spill slot for those functions.

The structsinregs test needed adjustment because it disables FP elimination.

llvm-svn: 177106
2013-03-14 19:34:32 +00:00
Arnold Schwaighofer
63a59d3be8 ARM cost model: Increase cost of some vector selects we do terrible on
By terrible I mean we store/load from the stack.

This matters on PAQp8 in _Z5trainPsS_ii (which is inlined into Mixer::update)
where we decide to vectorize a loop with a VF of 8 resulting in a 25%
degradation on a cortex-a8.

LV: Found an estimated cost of 2 for VF 8 For instruction:   icmp slt i32
LV: Found an estimated cost of 2 for VF 8 For instruction:   select i1, i32, i32

The bug that tracks the CodeGen part is PR14868.

radar://13403975

llvm-svn: 177105
2013-03-14 19:17:02 +00:00
Akira Hatanaka
d091d9d1db [mips] Fix filename in comment and delete unnecessary lines of code.
No functionality changes.

llvm-svn: 177104
2013-03-14 19:09:52 +00:00
Jyotsna Verma
f2d3c71cf4 Hexagon: Removed asserts regarding alignment and offset.
We are warning the user about the alignment, so we should not assert.

llvm-svn: 177103
2013-03-14 19:08:03 +00:00
Arnold Schwaighofer
4c916c0a2a Add missing asserts flag to test - it uses debug flags
llvm-svn: 177102
2013-03-14 19:01:58 +00:00
Akira Hatanaka
d0dcf1faff Android uses cacheflush(long start, long end, long flags) for MIPS.
Patch by Stephen Hines.

llvm-svn: 177101
2013-03-14 19:01:00 +00:00
Arnold Schwaighofer
3e3105f2f8 LoopVectorize: Invert case when we use a vector cmp value to query select cost
We generate a select with a vectorized condition argument when the condition is
NOT loop invariant. Not the other way around.

llvm-svn: 177098
2013-03-14 18:54:36 +00:00
Akira Hatanaka
0de8223831 Add back lines which were accidentally deleted in CMakeLists.txt.
llvm-svn: 177096
2013-03-14 18:46:46 +00:00
Akira Hatanaka
6867334681 [mips] Define function MipsSEDAGToDAGISel::selectAddESubE.
No intended functionality changes.

llvm-svn: 177095
2013-03-14 18:39:25 +00:00
Hal Finkel
b37c6bb3c1 Add a comment about overlapping PPC frame offsets
I don't think that it is otherwise clear how the overlapping offsets
are processed into distinct spill slots. Comment that this is done
in processFunctionBeforeFrameFinalized.

llvm-svn: 177094
2013-03-14 18:38:31 +00:00
Akira Hatanaka
3127f7da20 [mips] Rename functions and variables to start with proper case.
llvm-svn: 177092
2013-03-14 18:33:23 +00:00
Akira Hatanaka
fdebcc91c4 Add header file MipsISelDAGToDAG.h.
llvm-svn: 177090
2013-03-14 18:28:19 +00:00
Akira Hatanaka
10f5675f72 [mips] Define two subclasses of MipsDAGToDAGISel. Mips16DAGToDAGISel is for
mips16 and MipsSEDAGToDAGISel is for mips32/64. 

No functionality changes.

llvm-svn: 177089
2013-03-14 18:27:31 +00:00
Shuxin Yang
55038cc0b2 Perform factorization as a last resort of unsafe fadd/fsub simplification.
Rules include:
  1)1 x*y +/- x*z => x*(y +/- z) 
    (the order of operands dosen't matter)

  2) y/x +/- z/x => (y +/- z)/x 

 The transformation is disabled if the new add/sub expr "y +/- z" is a 
denormal/naz/inifinity.

rdar://12911472

llvm-svn: 177088
2013-03-14 18:08:26 +00:00
Adrian Prantl
e3d19c207e Test that we emit a DW_AT_location for self captured by a block.
This is the backend part of a CFE test with the same name.

llvm-svn: 177087
2013-03-14 17:54:13 +00:00
Vincent Lejeune
cd12dadb5c R600: Factorize code handling Const Read Port limitation
llvm-svn: 177078
2013-03-14 15:50:45 +00:00
Alexey Samsonov
984e7940a4 [ASan] emit instrumentation for initialization order checking by default
llvm-svn: 177063
2013-03-14 12:38:58 +00:00
Chandler Carruth
3d9eacc90b PR14972: SROA vs. GVN exposed a really bad bug in SROA.
The fundamental problem is that SROA didn't allow for overly wide loads
where the bits past the end of the alloca were masked away and the load
was sufficiently aligned to ensure there is no risk of page fault, or
other trapping behavior. With such widened loads, SROA would delete the
load entirely rather than clamping it to the size of the alloca in order
to allow mem2reg to fire. This was exposed by a test case that neatly
arranged for GVN to run first, widening certain loads, followed by an
inline step, and then SROA which miscompiles the code. However, I see no
reason why this hasn't been plaguing us in other contexts. It seems
deeply broken.

Diagnosing all of the above took all of 10 minutes of debugging. The
really annoying aspect is that fixing this completely breaks the pass.
;] There was an implicit reliance on the fact that no loads or stores
extended past the alloca once we decided to rewrite them in the final
stage of SROA. This was used to encode information about whether the
loads and stores had been split across multiple partitions of the
original alloca. That required threading explicit tracking of whether
a *use* of a partition is split across multiple partitions.

Once that was done, another problem arose: we allowed splitting of
integer loads and stores iff they were loads and stores to the entire
alloca. This is a really arbitrary limitation, and splitting at least
some integer loads and stores is crucial to maximize promotion
opportunities. My first attempt was to start removing the restriction
entirely, but currently that does Very Bad Things by causing *many*
common alloca patterns to be fully decomposed into i8 operations and
lots of or-ing together to produce larger integers on demand. The code
bloat is terrifying. That is still the right end-goal, but substantial
work must be done to either merge partitions or ensure that small i8
values are eagerly merged in some other pass. Sadly, figuring all this
out took essentially all the time and effort here.

So the end result is that we allow splitting only when the load or store
at least covers the alloca. That ensures widened loads and stores don't
hurt SROA, and that we don't rampantly decompose operations more than we
have previously.

All of this was already fairly well tested, and so I've just updated the
tests to cover the wide load behavior. I can add a test that crafts the
pass ordering magic which caused the original PR, but that seems really
brittle and to provide little benefit. The fundamental problem is that
widened loads should Just Work.

llvm-svn: 177055
2013-03-14 11:32:24 +00:00
Joerg Sonnenberger
9e9e742da6 Add two of the float related ARM-specific entries for e_flags needed for
linkers to interact with GNU ld.

llvm-svn: 177016
2013-03-14 08:01:36 +00:00
Craig Topper
ca6b32ebe1 Fix the name of a variable to match its declaration. Fixes build failure from r177014.
llvm-svn: 177015
2013-03-14 07:47:43 +00:00
Craig Topper
29d0a365f1 Fix a bug in the calculation of the VEX.B bit for FMA4 rr with the VEX.W bit set. The VEX.B was being calculated from the wrong operand. Fixes at least some portion of PR14185.
llvm-svn: 177014
2013-03-14 07:40:52 +00:00
Craig Topper
a7c401d655 Teach X86 MC instruction lowering that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not. This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior.
llvm-svn: 177011
2013-03-14 07:09:57 +00:00
Michael Liao
89d165e673 Fix PR15309
- Fix the typo on type checking

llvm-svn: 177010
2013-03-14 06:57:42 +00:00
Jiong Wang
f4d5a4cd79 test commit: remove blank line.
llvm-svn: 177009
2013-03-14 05:43:59 +00:00
Nick Lewycky
251845e6af Remove a change to the debug info in this test, that I made while testing
something else and forgot to remove.

llvm-svn: 177007
2013-03-14 05:28:10 +00:00
Nick Lewycky
d09c6bee59 Try using %S to find the emitted .gcno file.
llvm-svn: 177006
2013-03-14 05:23:30 +00:00
Nick Lewycky
a61dbc58bb Remove accidentally committed debug line.
llvm-svn: 177005
2013-03-14 05:19:12 +00:00
Nick Lewycky
d2ee2e0cd8 Refactor GCOV's six constructor arguments into a struct with a getter that
constructs default arguments. It can now take default arguments from
cl::opt'ions. Add a new -default-gcov-version=... option, and actually test it!

Sink the reverse-order of the version into GCOVProfiling, hiding it from our
users.

llvm-svn: 177002
2013-03-14 05:13:26 +00:00
Nick Lewycky
110760eb4d Fix typo in comment.
llvm-svn: 176997
2013-03-14 01:26:17 +00:00