7596 Commits

Author SHA1 Message Date
Hans Wennborg
1809c66b66 Revert r313771 "[SLP] Vectorize jumbled memory loads."
This broke the buildbots, e.g.
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in a non-consecutive or jumbled way. An earlier attempt was made with patch D26905,
> which was reverted due to a basic issue with representing the 'use mask' of
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Subscribers: mzolotukhin
>
> Reviewed By: ayal
>
> Differential Revision: https://reviews.llvm.org/D36130
>
> Review comments updated accordingly
>
> Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0
>
> Added a TODO for sortLoadAccesses API
>
> Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58
>
> Modified the TODO for sortLoadAccesses API
>
> Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565
>
> Review comment update for using OpdNum to insert the mask in respective location
>
> Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce
>
> Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase
>
> Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313781 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 18:00:03 +00:00
Mohammad Shahid
46e0b67b99 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in a non-consecutive or jumbled way. An earlier attempt was made with patch D26905,
which was reverted due to a basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
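
The idea of loading consecutive memory in a jumbled order can be sketched as follows. This is only an illustration, not the SLP vectorizer's code: `jumbledLoadMask` is a hypothetical helper, and the offsets are assumed to be element indices relative to a common base pointer. If the sorted offsets form a consecutive run, one wide load plus a shuffle with the returned mask reproduces the original order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): given the element offsets read by a
// group of scalar loads, fill Mask with the lane of a single wide consecutive
// load that each scalar load corresponds to. Returns false if the offsets do
// not form a consecutive run.
bool jumbledLoadMask(const std::vector<int> &Offsets, std::vector<int> &Mask) {
  assert(!Offsets.empty() && "need at least one load");
  Mask.clear();

  std::vector<int> Sorted(Offsets);
  std::sort(Sorted.begin(), Sorted.end());
  for (std::size_t I = 1; I < Sorted.size(); ++I)
    if (Sorted[I] != Sorted[I - 1] + 1)
      return false; // gap or duplicate: not one consecutive region

  // The wide load starts at the smallest offset; each original load then
  // selects lane (offset - smallest offset).
  for (int Off : Offsets)
    Mask.push_back(Off - Sorted.front());
  return true;
}
```

For offsets {1, 0, 3, 2} the mask is {1, 0, 3, 2}; for {0, 2, 3, 5} the helper reports that no single consecutive load covers the accesses.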

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Subscribers: mzolotukhin

Reviewed By: ayal

Differential Revision: https://reviews.llvm.org/D36130

Review comments updated accordingly

Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0

Added a TODO for sortLoadAccesses API

Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58

Modified the TODO for sortLoadAccesses API

Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565

Review comment update for using OpdNum to insert the mask in respective location

Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce

Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase

Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313771 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 17:19:57 +00:00
Alexander Kornienko
e1631a5af7 Revert r313736: "[SLP] Vectorize jumbled memory loads."
The revision breaks buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313758 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 14:53:07 +00:00
Alexander Kornienko
2d05b60473 Revert r313753: "Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313757 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 14:52:56 +00:00
Alexander Kornienko
160a98b89b Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313753 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 12:18:22 +00:00
Mohammad Shahid
0acc54b75c [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in a non-consecutive or jumbled way. An earlier attempt was made with patch D26905,
which was reverted due to a basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

Commit after rebase for patch D36130

Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313736 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 08:18:28 +00:00
Sanjoy Das
198959c487 Tighten the invariants around LoopBase::invalidate
Summary:
With this change:
 - Methods in LoopBase trip an assert if the receiver has been invalidated
 - LoopBase::clear frees up the memory held by the LoopBase instance

This change also shuffles things around as necessary to work with this stricter invariant.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38055

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313708 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 02:31:57 +00:00
Sanjoy Das
f4845c877a Clang-format a few files to make later diffs leaner; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313705 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 01:12:09 +00:00
Sanjoy Das
6199cad867 [LoopInfo] Make LoopBase and Loop destructors non-public
Summary:
See comment for why I think this is a good idea.

This change also:

 - Removes an SCEV test case.  The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private Loop::~Loop.
 - Updates the loop pass manager test case to deal with a private Loop::~Loop.
 - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313695 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 23:19:00 +00:00
Sanjay Patel
bc0f9d9517 [InstSimplify] fold sdiv/srem based on compare of dividend and divisor
This should bring signed div/rem analysis up to the same level as unsigned. 
We use icmp simplification to determine when the divisor is known greater than the dividend.

Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits.
There are extra tests for the signed-min-value special cases.
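
The arithmetic fact behind this fold can be checked with a small sketch. It is not the InstSimplify code: the helper names are made up, and the magnitude comparison is done in 64 bits here purely so that the signed-min value can be negated safely. When the dividend is smaller in magnitude than the divisor, the quotient is 0 and the remainder is the dividend.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helper: true when |X| < |Y|. The comparison is widened to
// 64 bits so that INT32_MIN does not overflow when negated.
bool magnitudeLess(int32_t X, int32_t Y) {
  int64_t AX = X < 0 ? -static_cast<int64_t>(X) : X;
  int64_t AY = Y < 0 ? -static_cast<int64_t>(Y) : Y;
  return AX < AY;
}

// Returns true when the fold applies (|X| < |Y|) and the folded results
// (sdiv -> 0, srem -> X) match real truncating division.
bool foldHolds(int32_t X, int32_t Y) {
  assert(Y != 0 && "division by zero is undefined");
  if (!magnitudeLess(X, Y))
    return false; // fold does not apply
  // |X| < |Y| rules out the overflowing INT32_MIN / -1 case.
  return X / Y == 0 && X % Y == X;
}
```

Negative cases such as `foldHolds(7, 3)` return false because the precondition fails, mirroring the negative tests mentioned in the message.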

Alive proofs:
http://rise4fun.com/Alive/WI5

Differential Revision: https://reviews.llvm.org/D37713


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313264 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 14:59:07 +00:00
Sanjay Patel
32b8a7a919 [InstSimplify] clean up div/rem handling; NFCI
The idea to make an 'isDivZero' helper was suggested for the signed case in D37713:
https://reviews.llvm.org/D37713

This clean-up makes it clear that D37713 is just filling the gap for signed div/rem,
removes unnecessary code, and allows us to remove a bit of duplicated code from the
planned improvement in D37713.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313261 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 14:09:11 +00:00
Chandler Carruth
dbaacccc31 [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
invalidated SCCs even when we do not have an updated SCC to redirect
towards.

This comes up in a fairly subtle and surprising circumstance: we need to
have a connected but internal node in the call graph which later becomes
a disconnected island, and then gets deleted. All of this needs to
happen mid-CGSCC walk. Because it is disconnected, we have no way of
computing a new "current" SCC when it gets deleted. Instead, we need to
explicitly check for a deleted "current" SCC and bail out of the current
CGSCC step. This will bubble all the way up to the post-order walk and
then resume correctly.

I've included minimal tests for this bug. The specific behavior
matches something we've seen in the wild with the new PM combined with
ThinLTO and sample PGO, but I've not yet confirmed whether this is the
only issue there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 08:33:57 +00:00
Alon Kom
dde48e1948 [LV] Fix maximum legal VF calculation
This patch fixes pr34283, which exposed that the computation of the
maximum legal width for vectorization was wrong. It relied on
MaxInterleaveFactor to obtain the maximum stride used in the loop, but
not all strided accesses in the loop have an interleave-group
associated with them.
Instead of recording the maximum stride in the loop, which can be overly
conservative (e.g. if the access with the maximum stride is not involved
in the dependence limitation), this patch tracks the actual maximum legal
width imposed by accesses that are involved in dependencies.

Differential Revision: https://reviews.llvm.org/D37507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313237 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 07:40:02 +00:00
Easwaran Raman
7f44c36d07 [Inliner] Add another way to compute full inline cost.
Summary:
Full inline cost is computed when -inline-cost-full is true or ORE is
non-null. This patch adds another way to compute full inline cost by
adding a field to InlineParams. This will be used by SampleProfileLoader
to check legality of inlining a callee that it wants to inline.

Reviewers: danielcdh, haicheng

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313185 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 20:16:02 +00:00
Hiroshi Yamauchi
cc31faa0b4 Add options to dump PGO counts in text.
Summary:
Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text.

This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313159 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 17:20:38 +00:00
Teresa Johnson
bca2a430d9 [ThinLTO] AliasSummary should not have any references
Summary: References should only be on the aliasee.

Reviewers: pcc

Subscribers: llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D37814

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313158 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 17:10:24 +00:00
Silviu Baranga
1ece28eb77 [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs
Summary:
LAA can only emit run-time alias checks for pointers with affine AddRec
SCEV expressions. However, non-AddRecExprs can now be converted to
affine AddRecExprs using SCEV predicates.

This change tries to add the minimal set of SCEV predicates in order
to enable run-time alias checking.

Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel

Reviewed By: hfinkel

Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D17080

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313012 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-12 07:48:22 +00:00
Marcello Maggioni
022ffdf29f [ScalarEvolution] Refactor forgetLoop() to improve performance
forgetLoop() performs poorly because it visits the same
instructions over and over again, in particular when
nested loops are involved.
This refactoring makes the function non-recursive and
reuses the allocations for its data structures and the Visited
set.

NFCI
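
The general shape of such a refactoring, replacing recursion over a loop nest with an explicit worklist and one shared Visited set, can be sketched as below. `ToyLoop` and `collectOnce` are hypothetical stand-ins and do not reflect the ScalarEvolution data structures.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy stand-in for a loop nest; not llvm::Loop.
struct ToyLoop {
  std::vector<int> Insts;          // ids of instructions seen at this level
  std::vector<ToyLoop *> SubLoops; // directly nested loops
};

// Visits every instruction id in the nest exactly once. An explicit worklist
// replaces recursion, and a single Visited set is shared across all levels so
// that instructions reachable from several loops are not processed again.
std::vector<int> collectOnce(ToyLoop *Root) {
  assert(Root && "expected a loop");
  std::vector<int> Order;
  std::unordered_set<int> Visited;
  std::vector<ToyLoop *> Worklist{Root};

  while (!Worklist.empty()) {
    ToyLoop *L = Worklist.back();
    Worklist.pop_back();
    for (int I : L->Insts)
      if (Visited.insert(I).second) // true only on first insertion
        Order.push_back(I);
    Worklist.insert(Worklist.end(), L->SubLoops.begin(), L->SubLoops.end());
  }
  return Order;
}
```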

Differential Revision: https://reviews.llvm.org/D37659

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312920 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 15:44:20 +00:00
Sanjay Patel
92ba92dfe4 [InstSimplify] reorder methods; NFC
I'm trying to refactor some shared code for integer div/rem,
but I keep having to scroll through fdiv. The FP ops have
nothing in common with the integer ops, so I'm moving FP
below everything else. 

While here, improve a couple of comments and fix some formatting.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312913 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 13:34:27 +00:00
Sanjay Patel
44e68c6bb8 [InstSimplify] refactor udiv/urem code and add tests; NFCI
This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312885 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-10 17:55:08 +00:00
Nuno Lopes
fe353a0cbf Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non-null.

Differential Revision: https://reviews.llvm.org/D37628

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312869 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 18:23:11 +00:00
Sanjay Patel
193e898f75 [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 13:38:18 +00:00
Guozhi Wei
19969b8b8f [TargetTransformInfo] Add a new public interface getInstructionCost
The current TargetTransformInfo supports a throughput cost model and a code size model, but some optimizations also need an instruction latency cost model. Hal suggested a single public interface to query the different costs of an instruction, so I proposed the following interface:

  enum TargetCostKind {
    TCK_RecipThroughput, ///< Reciprocal throughput.
    TCK_Latency,         ///< The latency of instruction.
    TCK_CodeSize         ///< Instruction code size.
  };

  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;

All clients should mainly use this function to query the cost of an instruction; the parameter <kind> specifies the desired cost model.

This patch also provides a simple default implementation of getInstructionLatency.

The default getInstructionLatency provides latency numbers for only a small number of instruction classes, and those numbers are only reasonable for modern OOO processors. It can be extended in the following ways:

   Add more detail into this function.
   Add getXXXLatency function and call it from here.
   Implement target specific getInstructionLatency function.
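
A single entry point that dispatches on the cost kind could look roughly like the sketch below. Only the `TargetCostKind` enum is taken from the message; `ToyOpcode`, the three helper functions, and every number in them are invented for illustration and say nothing about the real default implementation.

```cpp
#include <cassert>

enum TargetCostKind {
  TCK_RecipThroughput, ///< Reciprocal throughput.
  TCK_Latency,         ///< The latency of instruction.
  TCK_CodeSize         ///< Instruction code size.
};

// Toy opcode set standing in for llvm::Instruction (hypothetical).
enum class ToyOpcode { Add, Load, Div };

// Made-up per-model costs; the values are placeholders, not measurements.
int toyThroughputCost(ToyOpcode Op) { return Op == ToyOpcode::Div ? 10 : 1; }
int toyLatency(ToyOpcode Op) {
  switch (Op) {
  case ToyOpcode::Add:  return 1;
  case ToyOpcode::Load: return 4;
  case ToyOpcode::Div:  return 20;
  }
  return 1;
}
int toyCodeSize(ToyOpcode) { return 1; }

// One public query; the Kind parameter selects the cost model.
int getToyInstructionCost(ToyOpcode Op, TargetCostKind Kind) {
  switch (Kind) {
  case TCK_RecipThroughput: return toyThroughputCost(Op);
  case TCK_Latency:         return toyLatency(Op);
  case TCK_CodeSize:        return toyCodeSize(Op);
  }
  assert(false && "unknown cost kind");
  return 0;
}
```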

Differential Revision: https://reviews.llvm.org/D37170



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 22:29:17 +00:00
Alexey Bataev
4fcc7e8528 [SLP] Support for horizontal min/max reduction.
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.
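
As a scalar illustration of what a horizontal max reduction computes, the sketch below folds the upper half of the lanes onto the lower half in log2 steps, which is the usual shape of a shuffle-based vector reduction. `horizontalMax` is a hypothetical name and the code is not taken from the SLP vectorizer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Log2-step horizontal max over a power-of-two number of lanes. Each step
// combines lane I with lane I + Width and halves the active width.
int horizontalMax(std::vector<int> Lanes) {
  assert(!Lanes.empty() && (Lanes.size() & (Lanes.size() - 1)) == 0 &&
         "lane count must be a power of two");
  for (std::size_t Width = Lanes.size() / 2; Width >= 1; Width /= 2)
    for (std::size_t I = 0; I < Width; ++I)
      Lanes[I] = std::max(Lanes[I], Lanes[I + Width]);
  return Lanes[0];
}
```

Replacing `std::max` with `std::min` gives the corresponding min reduction.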

Differential revision: https://reviews.llvm.org/D27846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 13:49:36 +00:00
Peter Collingbourne
2c6c4893c7 ModuleSummaryAnalysis: Correctly handle all function operand references.
The current code that handles personality functions when creating a
module summary does not correctly handle the case where a function's
personality function operand refers to the function indirectly
(e.g. via a bitcast). This patch handles such cases by treating
personality function references like any other reference, i.e. by
adding them to the function's reference list. This has the minor side
benefit of allowing personality functions to participate in early
dead stripping.

We do this by calling findRefEdges on the function itself. This way
we also end up handling other function operands (specifically prefix
data and prologue data) for free.

Differential Revision: https://reviews.llvm.org/D37553

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312698 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 05:35:35 +00:00
Matt Arsenault
e0de89287c InstSimplify: canonicalize is idempotent
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312685 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 01:21:43 +00:00
Nuno Lopes
e429f678d6 Fix PR33878: BasicAA incorrectly assumes different address spaces don't alias
Remove code that assumed that a nullptr of address space != 0 couldn't alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0.
This code was written before address spaces were introduced.

Differential Revision: https://reviews.llvm.org/D37518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312648 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 16:55:31 +00:00
Sanjay Patel
04894a4949 [ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants
This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite():
https://bugs.llvm.org/show_bug.cgi?id=27145

In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno
with a constant operand.

But while looking at those patterns, I realized we were missing a canonicalization for nonzero
constants. Rather than limiting to just folds for constants, we're adding a general value
tracking method for this based on an existing DAG helper.

By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps()
and pick up missing vector folds.
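
Why any non-NaN constant can be replaced by 0.0 is easy to see in a scalar model: `fcmp ord` is true when neither operand is NaN and `fcmp uno` is true when either is, so with a non-NaN constant the result depends only on the other operand. The functions below are a sketch with made-up names, not the ValueTracking helper.

```cpp
#include <cassert>
#include <cmath>

// Scalar model of 'fcmp ord X, C': true when neither operand is NaN.
bool fcmpOrd(double X, double C) { return !std::isnan(X) && !std::isnan(C); }

// Scalar model of 'fcmp uno X, C': true when at least one operand is NaN.
bool fcmpUno(double X, double C) { return std::isnan(X) || std::isnan(C); }

// For a non-NaN constant C, swapping C for 0.0 leaves both predicates
// unchanged for every X.
bool canonicalizationHolds(double X, double C) {
  assert(!std::isnan(C) && "the fold only applies to non-NaN constants");
  return fcmpOrd(X, C) == fcmpOrd(X, 0.0) &&
         fcmpUno(X, C) == fcmpUno(X, 0.0);
}
```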

Differential Revision: https://reviews.llvm.org/D37427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 23:13:13 +00:00
Daniel Neilson
f7dd8e2ac0 [SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values
Summary:
 When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible
for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If
this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert.

 Such a PHISCEV is possible if either the start value or the accumulator value is a constant value
that is not equal to its truncated value, and if the truncated value is zero.

 This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking
whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is
equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator
value are constants then we check whether the P2 and/or P3 predicates are known false at compile
time; if either is, then we bail out of constructing the AddRec.

Reviewers: sanjoy, mkazantsev, silviu.baranga

Reviewed By: mkazantsev

Subscribers: mkazantsev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312568 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 19:54:03 +00:00
Eugene Zelenko
cecd8f18e2 [Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312383 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 21:37:29 +00:00
Craig Topper
85fcd3487c [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.

This allows some code to be removed from InstSimplify that was doing this functionality.

This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appearing with an 'and' on another icmp. It also allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.

There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.
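
The effect of looking through a truncate can be illustrated with a sign-bit test on an i8 truncation of an i32 value: the same answer is obtained by applying the zero-extended mask to the wide value. The function names here are hypothetical and the sketch only models this one case, not decomposeBitTestICmp itself.

```cpp
#include <cassert>
#include <cstdint>

// Models 'icmp slt (trunc i32 X to i8), 0': the sign bit of the low byte.
bool signBitOfTrunc(uint32_t X) {
  uint8_t Narrow = static_cast<uint8_t>(X); // the truncate
  return (Narrow & 0x80u) != 0;
}

// Same test after looking through the truncate: the i8 mask 0x80 is
// zero-extended to i32 and applied to the pre-truncated value.
bool maskTestOnWide(uint32_t X) {
  const uint32_t WideMask = 0x00000080u;
  return (X & WideMask) != 0;
}

// Checks that both forms agree on a range of inputs.
void checkEquivalence() {
  for (uint32_t X = 0; X < 0x20000u; ++X)
    assert(signBitOfTrunc(X) == maskTestOnWide(X));
}
```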

Differential Revision: https://reviews.llvm.org/D37158

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312382 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 21:27:34 +00:00
Peter Collingbourne
043998b329 ModuleSummaryAnalysis: Correctly handle refs from function inline asm to module inline asm.
If a function contains inline asm and the module-level inline asm
contains the definition of a local symbol, prevent the function from
being imported in case the function-level inline asm refers to a
symbol in the module-level inline asm.

Differential Revision: https://reviews.llvm.org/D37370

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 16:24:02 +00:00
Alexandre Isoard
3b88873b05 [SCEV] Add URem support to SCEV
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to

    %q = udiv <ty> %t, %b
    %s = mul nuw <ty> %q, %b
    %r = sub nuw <ty> %t, %s ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented
with minimal effort using that relation:

    %r --> (-%b * (%t /u %b)) + %t

We implement two special cases:

  - if %b is 1, the result is always 0
  - if %b is a power-of-two, we produce a zext/trunc based expression instead

That is, the following code:

    %r = urem i32 %t, 65536

Produces:

    %r --> (zext i16 (trunc i32 %t to i16) to i32)

Note that while this helps get a tighter bound on the range analysis and the
known-bits analysis, this exposes some normalization shortcoming of SCEVs:

    %div = udiv i32 %a, 65536
    %mul = mul i32 %div, 65536
    %rem = urem i32 %a, 65536
    %add = add i32 %mul, %rem

Will usually not be reduced.
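
The two relations used above can be checked on plain unsigned integers. This is a sketch with made-up function names, not the SCEV implementation: the first function is the udiv/mul/sub expansion, the second the trunc/zext form for a divisor of 65536.

```cpp
#include <cassert>
#include <cstdint>

// r = t - (t /u b) * b, the udiv/mul/sub expansion of urem.
uint32_t uremViaUDiv(uint32_t T, uint32_t B) {
  assert(B != 0 && "urem by zero is undefined");
  return T - (T / B) * B;
}

// Power-of-two special case for b == 65536: truncate to 16 bits, then
// zero-extend back to 32 bits.
uint32_t urem65536ViaTrunc(uint32_t T) {
  return static_cast<uint32_t>(static_cast<uint16_t>(T));
}
```

Both agree with the built-in `%` operator, for example `uremViaUDiv(100, 7) == 100 % 7` and `urem65536ViaTrunc(0x12345678) == 0x5678`.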



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312329 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 14:59:59 +00:00
Eugene Zelenko
046ca04445 [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312289 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 21:56:16 +00:00
Adam Nemet
0827f9ac83 Remove an unnecessary const_cast.
I think this dates back to when emit used to take a const reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311948 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 23:00:13 +00:00
Don Hinton
1adb5a9cb5 [Dominators] Remove redundant explicit template instantiation.
Summary:
Remove redundant explicit template instantiation.

This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase.

Reviewers: kuhar, andrewrk, davide, hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37185

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311835 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 21:08:51 +00:00
Hiroshi Yamauchi
1020c414d8 Add options to dump block frequency/branch probability info in text.
Summary:
Add options -print-bfi/-print-bpi that dump block frequency and branch
probability info like -view-block-freq-propagation-dags and
-view-machine-block-freq-propagation-dags do but in text.

This is useful when the graph is very large and complex (the dot command
crashes, lines/edges too close to tell apart, hard to navigate without textual
search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311822 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 00:31:00 +00:00
Haicheng Wu
33be26f893 [InlineCost] Small changes to early exit condition. NFC.
Change the early exit condition from Cost > Threshold to Cost >= Threshold
because the inline condition is Cost < Threshold.
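
The off-by-one can be seen in a toy cost accumulator. The names below are hypothetical and the loop is a simplification that assumes non-negative per-instruction costs; it only shows that once Cost reaches Threshold the inline condition Cost < Threshold can no longer hold, so scanning may stop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inlining is considered only when Cost < Threshold.
bool shouldInline(int Cost, int Threshold) { return Cost < Threshold; }

struct CostResult {
  int Cost;                 // accumulated cost when scanning stopped
  std::size_t InstsVisited; // how many instructions were looked at
};

// Toy analysis: accumulate per-instruction costs and bail out as soon as
// Cost >= Threshold. With the old 'Cost > Threshold' check, the scan would
// keep going at Cost == Threshold even though inlining is already rejected.
CostResult analyze(const std::vector<int> &InstCosts, int Threshold) {
  CostResult R{0, 0};
  for (int C : InstCosts) {
    assert(C >= 0 && "this toy model assumes non-negative costs");
    R.Cost += C;
    ++R.InstsVisited;
    if (R.Cost >= Threshold)
      break; // early exit
  }
  return R;
}
```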

Differential Revision: https://reviews.llvm.org/D37087

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 19:00:33 +00:00
Michael Kruse
f29303de23 Normalize to LF line endings.
Commit r297442 introduced mixed CRLF/LF line endings to two files.
Normalize to LF-only line endings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 12:38:53 +00:00
Dehao Chen
d38687abb5 Move accurate-sample-profile into the function attribute.
Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO.

Reviewers: davidxl, rsmith

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311706 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 21:37:04 +00:00
Tobias Grosser
2050a0312d Model cache size and associativity in TargetTransformInfo
Summary:
We add the precise cache sizes and associativity for the following Intel
architectures:

  - Penryn
  - Nehalem
  - Westmere
  - Sandy Bridge
  - Ivy Bridge
  - Haswell
  - Broadwell
  - Skylake
  - Kaby Lake

For several months, Polly has used a performance model for BLAS computations that
derives optimal cache and register tile sizes from cache and latency
information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS", by Tze Meng Low, published at TOMS 2016).
While bootstrapping this model, these target values have been kept in Polly.
However, as our implementation is now rather mature, it seems time to teach
LLVM itself about cache sizes.

Interestingly, L1 and L2 cache sizes are pretty constant across
micro-architectures, hence a set of architecture specific default values
seems like a good start. They can be expanded to more target specific values,
in case certain newer architectures require different values. For now a set
of Intel architectures are provided.

Just as a little teaser, for a simple gemm kernel this model allows us to
improve performance from 1.2s to 0.27s. For gemm kernels with less optimal
memory layouts even larger speedups can be reported.

Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb

Reviewed By: fhahn, asb

Subscribers: lsaba, asb, pollydev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37051

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311647 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:46:25 +00:00
Rong Xu
7996242b16 [PGO] Set edge weights for indirectbr instruction with profile counts
Currently, PGO only annotates edge weights for branch and switch instructions
with profile counts. We should also annotate the indirectbr instruction, as
all the information is there. This patch enables the annotation for indirectbr
instructions and also uses it in branch probability analysis.

Differential Revision: https://reviews.llvm.org/D37074


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311604 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:36:02 +00:00
George Rimar
95a4133b77 [lib/Analysis] - Mark personality functions as live.
This is PR33245.

The case I am fixing is the following:
Imagine we have 2 BC files: one defines and uses a personality routine,
and the second has only a declaration and also uses it.

Previously, the algorithm computing dead symbols (llvm::computeDeadSymbols) did
not know about personality routines and left them dead even if a function that
has the routine was live.

As a result, the thinLTOInternalizeAndPromoteGUID() method changed the binding for
such a symbol to local. Later, when LLD tried to link these objects, it failed
because one object had an undefined global symbol for the routine and the second
object contained a local definition instead of a global one.

This patch sets the live root flag on the corresponding FunctionSummary
for personality routines when we build the per-module summaries
during the compile step.

Differential revision: https://reviews.llvm.org/D36834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311432 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 08:50:56 +00:00
Craig Topper
16e7603633 [ValueTracking] Add assertions that the starting Depth in isKnownToBeAPowerOfTwo and ComputeNumSignBitsImpl is not above MaxDepth
The function does an equality check later to terminate the recursion, but that won't work if it starts out too high. A similar assert already exists in computeKnownBits.
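
A minimal model of the problem: a recursion that stops on `Depth == MaxDepth` never hits its stop condition if the initial depth is already past the limit, so an entry assertion catches the misuse. `descend` and the limit value used here are illustrative only and are not the ValueTracking code.

```cpp
#include <cassert>

constexpr unsigned MaxDepth = 6; // illustrative recursion limit

// Toy recursive analysis: returns how many levels it descends before the
// equality check stops it. Starting above MaxDepth would skip that check,
// which is what the entry assertion guards against.
unsigned descend(unsigned Depth) {
  assert(Depth <= MaxDepth && "starting depth must not exceed MaxDepth");
  if (Depth == MaxDepth)
    return 0;
  return 1 + descend(Depth + 1);
}
```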

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311400 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 22:56:12 +00:00
Haicheng Wu
25ef265dc9 [InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes.
Currently, the inline cost model will bail once the inline cost exceeds the
inline threshold in order to avoid unnecessary compile-time. However, when
debugging it is useful to compute the full cost, so this command line option
is added to override the default behavior.

I took over this work from Chad Rosier (mcrosier@codeaurora.org).

Differential Revision: https://reviews.llvm.org/D35850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311371 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 20:00:09 +00:00
Chad Rosier
14edb7eb1a [InlineCost] Add more debug during inline cost computation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311370 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 19:56:46 +00:00
Eugene Zelenko
89688ce180 [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311212 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 23:51:26 +00:00
Amjad Aboud
066b24cb94 [InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.
Differential Revision: https://reviews.llvm.org/D36679


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311206 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 22:56:55 +00:00
Eugene Zelenko
93bb413a33 [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311048 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 22:07:40 +00:00
Sanjay Patel
77622085e7 [DemandedBits] simplify call; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311009 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 14:28:23 +00:00