Commit Graph

66928 Commits

Author SHA1 Message Date
Nick Lewycky
03b9ed1b7b A memcpy out of an fresh alloca is a no-op, delete it. Patch by Patrick Walton!
llvm-svn: 200907
2014-02-06 06:29:19 +00:00
Chandler Carruth
a10bbb470e [PM] Fix horrible typos that somehow didn't cause a failure in a C++11
build but spectacularly changed behavior of the C++98 build. =]

This shows my one problem with not having unittests -- basic API
expectations aren't well exercised by the integration tests because they
*happen* to not come up, even though they might later. I'll probably add
a basic unittest to complement the integration testing later, but
I wanted to revive the bots.

llvm-svn: 200905
2014-02-06 05:17:02 +00:00
Chandler Carruth
039285cbbf [PM] Add a new "lazy" call graph analysis pass for the new pass manager.
The primary motivation for this pass is to separate the call graph
analysis used by the new pass manager's CGSCC pass management from the
existing call graph analysis pass. That analysis pass is (somewhat
unfortunately) over-constrained by the existing CallGraphSCCPassManager
requirements. Those requirements make it *really* hard to cleanly layer
the needed functionality for the new pass manager on top of the existing
analysis.

However, there are also a bunch of things that the pass manager would
specifically benefit from doing differently from the existing call graph
analysis, and this new implementation tries to address several of them:

- Be lazy about scanning function definitions. The existing pass eagerly
  scans the entire module to build the initial graph. This new pass is
  significantly more lazy, and I plan to push this even further to
  maximize locality during CGSCC walks.
- Don't use a single synthetic node to partition functions with an
  indirect call from functions whose address is taken. This node creates
  a huge choke-point which would preclude good parallelization across
  the fanout of the SCC graph when we got to the point of looking at
  such changes to LLVM.
- Use a memory dense and lightweight representation of the call graph
  rather than value handles and tracking call instructions. This will
  require explicit update calls instead of some updates working
  transparently, but should end up being significantly more efficient.
  The explicit update calls ended up being needed in many cases for the
  existing call graph so we don't really lose anything.
- Doesn't explicitly model SCCs and thus doesn't provide an "identity"
  for an SCC which is stable across updates. This is essential for the
  new pass manager to work correctly.
- Only form the graph necessary for traversing all of the functions in
  an SCC friendly order. This is a much simpler graph structure and
  should be more memory dense. It does limit the ways in which it is
  appropriate to use this analysis. I wish I had a better name than
  "call graph". I've commented extensively this aspect.

This is still very much a WIP, in fact it is really just the initial
bits. But it is about the fourth version of the initial bits that I've
implemented with each of the others running into really frustrating
problms. This looks like it will actually work and I'd like to split the
actual complexity across commits for the sake of my reviewers. =] The
rest of the implementation along with lots of wiring will follow
somewhat more rapidly now that there is a good path forward.

Naturally, this doesn't impact any of the existing optimizer. This code
is specific to the new pass manager.

A bunch of thanks are deserved for the various folks that have helped
with the design of this, especially Nick Lewycky who actually sat with
me to go through the fundamentals of the final version here.

llvm-svn: 200903
2014-02-06 04:37:03 +00:00
Juergen Ributzka
a5769c5abc [DAG] Don't pull the binary operation though the shift if the operands have opaque constants.
During DAGCombine visitShiftByConstant assumes that certain binary operations
with only constant operands can always be folded successfully. This is no longer
true when the constant is opaque. This commit fixes visitShiftByConstant by not
performing the optimization for opaque constants. Otherwise we would end up in
an infinite DAGCombine loop.

llvm-svn: 200900
2014-02-06 04:09:06 +00:00
Manman Ren
91c0933df0 Set default of inlinecold-threshold to 225.
225 is the default value of inline-threshold. This change will make sure
we have the same inlining behavior as prior to r200886.

As Chandler points out, even though we don't have code in our testing
suite that uses cold attribute, there are larger applications that do
use cold attribute.

r200886 + this commit intend to keep the same behavior as prior to r200886.
We can later on tune the inlinecold-threshold.

The main purpose of r200886 is to help performance of instrumentation based
PGO before we actually hook up inliner with analysis passes such as BPI and BFI.
For instrumentation based PGO, we try to increase inlining of hot functions and
reduce inlining of cold functions by setting inlinecold-threshold.

Another option suggested by Chandler is to use a boolean flag that controls
if we should use OptSizeThreshold for cold functions. The default value
of the boolean flag should not change the current behavior. But it gives us
less freedom in controlling inlining of cold functions.

llvm-svn: 200898
2014-02-06 01:59:22 +00:00
Kevin Enderby
de9b7be909 Update the X86 assembler for .intel_syntax to accept
the << and >> bitwise operators.

rdar://15975725

llvm-svn: 200896
2014-02-06 01:21:15 +00:00
Rafael Espindola
cd26294465 don't set HasReliableSymbolDifference for ELF.
It is only used in MachObjectWriter.cpp. Another leftover from early days
of ELF in MC.

llvm-svn: 200895
2014-02-06 01:06:31 +00:00
Rafael Espindola
b12f4f168f doesSectionRequireSymbols is meaningless on ELF, remove.
This is a nop. doesSectionRequireSymbols is only used from
isSymbolLinkerVisible. isSymbolLinkerVisible only use from ELF was in

if (!Asm.isSymbolLinkerVisible(Symbol) && !Symbol.isUndefined())
  return false;

if (Symbol.isTemporary())
  return false;

If the symbol is a temporary this code returns false and it is irrelevant if
we take the first if or not. If the symbol is not a temporary,
Asm.isSymbolLinkerVisible returns true without ever calling
doesSectionRequireSymbols.

This was an horrible leftover from when support for ELF was first added.

llvm-svn: 200894
2014-02-06 00:54:53 +00:00
Paul Robinson
189e175394 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Rafael Espindola
c27ddf11ba Just returning false is the default.
llvm-svn: 200890
2014-02-06 00:03:15 +00:00
Matt Arsenault
1e0e825161 Pass address space to allowsUnalignedMemoryAccesses
llvm-svn: 200888
2014-02-05 23:16:05 +00:00
Matt Arsenault
7b69102edb Add address space argument to allowsUnalignedMemoryAccess.
On R600, some address spaces have more strict alignment
requirements than others.

llvm-svn: 200887
2014-02-05 23:15:53 +00:00
Manman Ren
b78e9a1411 Inliner uses a smaller inline threshold for callees with cold attribute.
Added command line option inlinecold-threshold to set threshold for inlining
functions with cold attribute. Listen to the cold attribute when it would
decrease the inline threshold.

llvm-svn: 200886
2014-02-05 22:53:44 +00:00
Quentin Colombet
b3e9fcb39b [RegAlloc] Add a last chance recoloring mechanism when everything else failed to
find a register.

The idea is to choose a color for the variable that cannot be allocated and
recolor its interferences around. Unlike the current register allocation scheme,
it is allowed to change the color of an already assigned (but maybe not
splittable or spillable) live interval while propagating this change to its
neighbors.
In other word, there are two things that may help finding an available color:
- Already assigned variables (RS_Done) can be recolored to different color.
- The recoloring allows to catch solutions that needs to touch more that just
  the neighbors of the current allocated variable.

E.g.,
vA can use {R1, R2    }
vB can use {    R2, R3}
vC can use {R1        }
Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and
they all interfere.

vA is assigned R1
vB is assigned R2
vC tries to evict vA but vA is already done.
=> Regular register allocation heuristic fails.

Last chance recoloring kicks in:
vC does as if vA was evicted => vC uses R1.
vC is marked as fixed.
vA needs to find a color.
None are available.
vA cannot evict vC: vC is a fixed virtual register now.
vA does as if vB was evicted => vA uses R2.
vB needs to find a color.
R3 is available.
Recoloring => vC = R1, vA = R2, vB = R3.

<rdar://problem/15947839>

llvm-svn: 200883
2014-02-05 22:13:59 +00:00
Chandler Carruth
ebb8519cff [PM] Don't require analysis results to be const in the new pass manager.
I think this was just over-eagerness on my part. The analysis results
need to often be non-const because they need to (in some cases at least)
be updated by the transformation pass in order to remain correct. It
also makes lazy analyses (a common case) needlessly annoying to write in
order to make their entire state mutable.

llvm-svn: 200881
2014-02-05 21:41:42 +00:00
Rafael Espindola
98165a6a91 Remove support for not using .loc directives.
Clang itself was not using this. The only way to access it was via llc.

llvm-svn: 200862
2014-02-05 18:00:21 +00:00
Rafael Espindola
b30b4e71fe Revert "Fix an invalid check for duplicate option categories."
This reverts commit r200853.

It was causing clang/Analysis/checker-plugins.c to crash.

llvm-svn: 200858
2014-02-05 17:49:31 +00:00
Petar Jovanovic
c17768616f [mips] Add NaCl target and forbid indexed loads and stores for it
This patch adds NaCl target for Mips. It also forbids indexed loads and
stores if the target is NaCl.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2690

llvm-svn: 200855
2014-02-05 17:19:30 +00:00
Alexander Kornienko
437146bfcb Fix an invalid check for duplicate option categories.
Summary:
The check performed in the comparator is invalid, as some STL
implementations enforce strict weak ordering by calling the comparator with the
same value. This check was also in a wrong place: the assertion would only fire
when -help was used. The new check is performed each time the category is
registered (we are not going to have thousands of them, so it's fine to do it in
O(N^2)).

Reviewers: jordan_rose

Reviewed By: jordan_rose

CC: cfe-commits, alexmc

Differential Revision: http://llvm-reviews.chandlerc.com/D2699

llvm-svn: 200853
2014-02-05 16:56:37 +00:00
Elena Demikhovsky
f8035c205a AVX-512: optimized icmp -> sext -> icmp pattern
llvm-svn: 200849
2014-02-05 16:17:36 +00:00
Alon Mishne
bef0ca7d46 Test commit
llvm-svn: 200843
2014-02-05 14:23:18 +00:00
Logan Chien
0d1789c768 ARM: Resolve thumb_bl fixup in same MCFragment.
In Thumb1 mode, bl instruction might be selected for branches between
basic blocks in the function if the offset is greater than 2KB.
However, this might cause SEGV because the destination symbol
is not marked as thumb function and the execution mode will be reset
to ARM mode.

Since we are sure that these symbols are in the same data fragment, we
can simply resolve these local symbols, and don't emit any relocation
information for this bl instruction.

llvm-svn: 200842
2014-02-05 14:15:16 +00:00
Elena Demikhovsky
20652531de AVX-512: fixed a bug in EVEX encoding (the bug appeared after r200624)
llvm-svn: 200837
2014-02-05 13:03:01 +00:00
Michel Danzer
ed3052cc54 R600/SI: Add pattern for zero-extending i1 to i32
Fixes opencl-example if_* tests with radeonsi.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200830
2014-02-05 09:48:05 +00:00
Kai Nacke
6d1536fa76 ARM: Enable use of relocation type tlsldo in debug info for tls data.
This fixes PR18554.

Reviewers: Renato Golin, Keith Walker
llvm-svn: 200826
2014-02-05 07:23:09 +00:00
Craig Topper
4c6c325efa Move matching for x86 BMI BLSI/BLSMSK/BLSR instructions to isel patterns instead of DAG combine. This weakens the ability to fold loads with them because we aren't able to match patterns that load the same thing twice. But maybe we should fix that if we care. The peephole optimizer will be able to fold some loads in its absense.
llvm-svn: 200824
2014-02-05 07:09:40 +00:00
Elena Demikhovsky
2e0202b75e AVX-512: Added intrinsic for cvtph2ps.
Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).

llvm-svn: 200823
2014-02-05 07:05:03 +00:00
Craig Topper
792771e814 Add CheckChildInteger to ISelMatcher operations. Removes nearly 2000 bytes from X86 matcher table.
llvm-svn: 200821
2014-02-05 05:44:28 +00:00
Rafael Espindola
e3a1818871 Use the information provided by getFlags to unify some code in llvm-nm.
It is not clear how much we should try to expose in getFlags. For example,
should there be a SF_Object and a SF_Text?

But for information that is already being exposed, we may as well use it in
llvm-nm.

llvm-svn: 200820
2014-02-05 05:19:19 +00:00
Todd Fiala
f2dea859ee Fix configure to find arc4random via header files.
ISSUE:

On Ubuntu 12.04 LTS, arc4random is provided by libbsd.so, which is a
transitive dependency of libedit. If a system had libedit on it that
was implemented in terms of libbsd.so, then the arc4random test,
previously implemented as a linker test, would succeed with -ledit.
However, on Ubuntu this would also require a #include <bsd/stdlib.h>.
This caused a build breakage on configure-based Ubuntu 12.04 with
libedit installed.

FIX:

This fix changes configure to test for arc4random by searching for it
in the standard header files. On Ubuntu 12.04, this test now properly
fails to find arc4random as it is not defined in the default header
locations. It also tweaks the #define names to match the output of the
header check command, which is slightly different than the linker
function check #defines.

I tested the following scenarios:

(1) Ubuntu 12.04 without the libedit package [did not find arc4random,
as expected]

(2) Ubuntu 12.04 with libedit package [properly did not find
arc4random, as expected]

(3) Ubuntu 12.04 with most recent libedit, custom built, and not
dependent on libbsd.so [properly did not find arc4random, as
expected].

(4) FreeBSD 10.0B1 [properly found arc4random, as expected]

llvm-svn: 200819
2014-02-05 05:04:36 +00:00
Manman Ren
1fbb9d3674 Fix wording of warning message about invalid debug info.
llvm-svn: 200806
2014-02-04 23:49:02 +00:00
Rafael Espindola
c79be3484e Remove unused SF_ThreadLocal.
llvm-svn: 200800
2014-02-04 22:50:47 +00:00
Justin Bogner
ea650dd10e llvm-cov: Fix include order in GCOV.cpp
llvm-svn: 200796
2014-02-04 21:03:17 +00:00
Benjamin Kramer
700474a946 SimplifyLibCalls: Push TLI through the exp2->ldexp transform.
For the odd case of platforms with exp2 available but not ldexp.

llvm-svn: 200795
2014-02-04 20:27:23 +00:00
Peter Collingbourne
9a2318e757 Avoid using EL_GETFP.
This should fix the build against old versions of libedit.

llvm-svn: 200794
2014-02-04 20:04:46 +00:00
Lang Hames
4397ed9ac7 [X86] Only 213 FMA3 variants should be marked commutable.
Commuting the 231 and 132 variants would swap addends and
multiplicands/multipliers, which isn't valid.

I'm still trying to reduce a decent test case for this.

llvm-svn: 200792
2014-02-04 19:42:47 +00:00
Duncan P. N. Exon Smith
7024ad6965 cleanup: scc_iterator consumers should use isAtEnd
No functional change.  Updated loops from:

    for (I = scc_begin(), E = scc_end(); I != E; ++I)

to:

    for (I = scc_begin(); !I.isAtEnd(); ++I)

for teh win.

llvm-svn: 200789
2014-02-04 19:19:07 +00:00
Petar Jovanovic
615b3679ca [mips] Implement %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions
Patch implements %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions for MIPS
by creating target expression class MipsMCExpr.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2592

llvm-svn: 200783
2014-02-04 18:41:57 +00:00
Rafael Espindola
34d0e998bd Every target uses .align. Simplify.
llvm-svn: 200782
2014-02-04 18:39:51 +00:00
Rafael Espindola
32330211d3 Use the default values.
llvm-svn: 200781
2014-02-04 18:34:04 +00:00
David Peixotto
a6dfad28da Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly
This patch fixes the ldr-pseudo implementation to work when used in
inline assembly.  The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.

Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.

An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.

Differential Revision: http://llvm-reviews.chandlerc.com/D2638

llvm-svn: 200777
2014-02-04 17:22:40 +00:00
Tom Stellard
d35a25ea14 R600/SI: Expand i1 BR_CC
This fixes a crashes in the OpenCV test suite and also the scrypt
kernel in bfgminer.

I was unable to come up with a reduced test case for this.

https://bugs.freedesktop.org/show_bug.cgi?id=72785

llvm-svn: 200776
2014-02-04 17:18:43 +00:00
Tom Stellard
f57fbab65b R600/SI: Don't assume copies will be coalesced in SIFixSGPRCopies
There is no lit test for this, because it would be too big and
complicated, but it does fix a crash in the Arithm/Absdiff.* OpenCV test.

llvm-svn: 200775
2014-02-04 17:18:42 +00:00
Tom Stellard
279daf2506 R600/SI: Custom lower i64 ISD::SELECT
llvm-svn: 200774
2014-02-04 17:18:40 +00:00
Tom Stellard
f4a180e50b R600: Enable vector fpow.
The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."

Patch by: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 200773
2014-02-04 17:18:37 +00:00
Tim Northover
d6fb863f04 OS X: the correct function is __sincospif_stret, not __sincospi_stretf
rdar://problem/13729466

llvm-svn: 200771
2014-02-04 16:28:20 +00:00
Tim Northover
cf3928b4f7 ARM & AArch64: merge NEON absolute compare intrinsics
There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768
2014-02-04 14:55:42 +00:00
Aaron Ballman
bbf781de71 Implemented support for Process::GetRandomNumber on Windows.
Patch thanks to Stephan Tolksdorf!

llvm-svn: 200767
2014-02-04 14:49:21 +00:00
Justin Bogner
8f223c891e llvm-cov: Implement the preserve-paths flag
Until now, when a path in a gcno file included a directory, we would
emit our .gcov file in that directory, whereas gcov always emits the
file in the current directory. In doing so, this implements gcov's
strange name-mangling -p flag, which is needed to avoid clobbering
files when two with the same name exist in different directories.

The path mangling is a bit ugly and only handles unix-like paths, but
it's simple, and it doesn't make any guesses as to how it should
behave outside of what gcov documents. If we decide this should be
cross platform later, we can consider the compatibility implications
then.

llvm-svn: 200754
2014-02-04 10:45:02 +00:00
Tim Northover
4c3de0c83d ARM: fix fast-isel assertion failure
Missing braces on if meant we inserted both ARM and Thumb load for a litpool
entry. This didn't end well.

rdar://problem/15959157

llvm-svn: 200752
2014-02-04 10:38:46 +00:00