archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Hans Wennborg	1809c66b66	Revert r313771 "[SLP] Vectorize jumbled memory loads." This broke the buildbots, e.g. http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391 > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Subscribers: mzolotukhin > > Reviewed By: ayal > > Differential Revision: https://reviews.llvm.org/D36130 > > Review comments updated accordingly > > Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 > > Added a TODO for sortLoadAccesses API > > Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 > > Modified the TODO for sortLoadAccesses API > > Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 > > Review comment update for using OpdNum to insert the mask in respective location > > Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce > > Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase > > Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313781 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 18:00:03 +00:00
Mohammad Shahid	46e0b67b99	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Subscribers: mzolotukhin Reviewed By: ayal Differential Revision: https://reviews.llvm.org/D36130 Review comments updated accordingly Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 Added a TODO for sortLoadAccesses API Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 Modified the TODO for sortLoadAccesses API Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 Review comment update for using OpdNum to insert the mask in respective location Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313771 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 17:19:57 +00:00
Alexander Kornienko	e1631a5af7	Revert r313736: "[SLP] Vectorize jumbled memory loads." The revision breaks buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313758 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 14:53:07 +00:00
Alexander Kornienko	2d05b60473	Revert r313753: "Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313757 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 14:52:56 +00:00
Alexander Kornienko	160a98b89b	Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313753 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 12:18:22 +00:00
Mohammad Shahid	0acc54b75c	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 Commit after rebase for patch D36130 Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313736 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 08:18:28 +00:00
Sanjoy Das	198959c487	Tighten the invariants around LoopBase::invalidate Summary: With this change: - Methods in LoopBase trip an assert if the receiver has been invalidated - LoopBase::clear frees up the memory held by the LoopBase instance This change also shuffles things around as necessary to work with this stricter invariant. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38055 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313708 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 02:31:57 +00:00
Sanjoy Das	f4845c877a	Clang-format few files to make later diffs leaner; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313705 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 01:12:09 +00:00
Sanjoy Das	6199cad867	[LoopInfo] Make LoopBase and Loop destructors non-public Summary: See comment for why I think this is a good idea. This change also: - Removes an SCEV test case. The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop. - Updates the loop pass manager test case to deal with a private ~Loop::Loop. - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37996 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313695 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-19 23:19:00 +00:00
Sanjay Patel	bc0f9d9517	[InstSimplify] fold sdiv/srem based on compare of dividend and divisor This should bring signed div/rem analysis up to the same level as unsigned. We use icmp simplification to determine when the divisor is known greater than the dividend. Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits. There are extra tests for the signed-min-value special cases. Alive proofs: http://rise4fun.com/Alive/WI5 Differential Revision: https://reviews.llvm.org/D37713 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313264 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 14:59:07 +00:00
Sanjay Patel	32b8a7a919	[InstSimplify] clean up div/rem handling; NFCI The idea to make an 'isDivZero' helper was suggested for the signed case in D37713: https://reviews.llvm.org/D37713 This clean-up makes it clear that D37713 is just filling the gap for signed div/rem, removes unnecessary code, and allows us to remove a bit of duplicated code from the planned improvement in D37713. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313261 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 14:09:11 +00:00
Chandler Carruth	dbaacccc31	[PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle invalidated SCCs even when we do not have an updated SCC to redirect towards. This comes up in a fairly subtle and surprising circumstance: we need to have a connected but internal node in the call graph which later becomes a disconnected island, and then gets deleted. All of this needs to happen mid-CGSCC walk. Because it is disconnected, we have no way of computing a new "current" SCC when it gets deleted. Instead, we need to explicitly check for a deleted "current" SCC and bail out of the current CGSCC step. This will bubble all the way up to the post-order walk and then resume correctly. I've included minimal tests for this bug. The specific behavior matches something we've seen in the wild with the new PM combined with ThinLTO and sample PGO, but I've not yet confirmed whether this is the only issue there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313242 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 08:33:57 +00:00
Alon Kom	dde48e1948	[LV] Fix maximum legal VF calculation This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313237 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 07:40:02 +00:00
Easwaran Raman	7f44c36d07	[Inliner] Add another way to compute full inline cost. Summary: Full inline cost is computed when -inline-cost-full is true or ORE is non-null. This patch adds another way to compute full inline cost by adding a field to InlineParams. This will be used by SampleProfileLoader to check legality of inlining a callee that it wants to inline. Reviewers: danielcdh, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37819 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313185 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-13 20:16:02 +00:00
Hiroshi Yamauchi	cc31faa0b4	Add options to dump PGO counts in text. Summary: Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37776 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313159 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-13 17:20:38 +00:00
Teresa Johnson	bca2a430d9	[ThinLTO] AliasSummary should not have any references Summary: References should only be on the aliasee. Reviewers: pcc Subscribers: llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D37814 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313158 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-13 17:10:24 +00:00
Silviu Baranga	1ece28eb77	[LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs Summary: LAA can only emit run-time alias checks for pointers with affine AddRec SCEV expressions. However, non-AddRecExprs can now be converted to affine AddRecExprs using SCEV predicates. This change tries to add the minimal set of SCEV predicates in order to enable run-time alias checking. Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D17080 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313012 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-12 07:48:22 +00:00
Marcello Maggioni	022ffdf29f	[ScalarEvolution] Refactor forgetLoop() to improve performance forgetLoop() has pretty bad performance because it goes over the same instructions over and over again, in particular when nested loops are involved. The refactoring changes the function to a non-recursive function and reuses the allocation for data-structures and the Visited set. NFCI Differential Revision: https://reviews.llvm.org/D37659 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312920 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-11 15:44:20 +00:00
Sanjay Patel	92ba92dfe4	[InstSimplify] reorder methods; NFC I'm trying to refactor some shared code for integer div/rem, but I keep having to scroll through fdiv. The FP ops have nothing in common with the integer ops, so I'm moving FP below everything else. While here, improve a couple of comments and fix some formatting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312913 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-11 13:34:27 +00:00
Sanjay Patel	44e68c6bb8	[InstSimplify] refactor udiv/urem code and add tests; NFCI This removes some duplicated code and makes it easier to support signed div/rem in a similar way if we want to do that. Note that the existing comments were not accurate - we don't need a constant divisor to simplify; icmp simplification does more than that. But as the added tests show, it could go even further. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312885 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-10 17:55:08 +00:00
Nuno Lopes	fe353a0cbf	Merge isKnownNonNull into isKnownNonZero It now knows the tricks of both functions. Also, fix a bug that considered allocas of non-zero address space to be always non null Differential Revision: https://reviews.llvm.org/D37628 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312869 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-09 18:23:11 +00:00
Sanjay Patel	193e898f75	[DivRempairs] add a pass to optimize div/rem pairs (PR31028) This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-09 13:38:18 +00:00
Guozhi Wei	19969b8b8f	[TargetTransformInfo] Add a new public interface getInstructionCost Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface: enum TargetCostKind { TCK_RecipThroughput, ///< Reciprocal throughput. TCK_Latency, ///< The latency of instruction. TCK_CodeSize ///< Instruction code size. }; int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const; All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model. This patch also provides a simple default implementation of getInstructionLatency. The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways: Add more detail into this function. Add getXXXLatency function and call it from here. Implement target specific getInstructionLatency function. Differential Revision: https://reviews.llvm.org/D37170 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-08 22:29:17 +00:00
Alexey Bataev	4fcc7e8528	[SLP] Support for horizontal min/max reduction. SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Differential revision: https://reviews.llvm.org/D27846 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-08 13:49:36 +00:00
Peter Collingbourne	2c6c4893c7	ModuleSummaryAnalysis: Correctly handle all function operand references. The current code that handles personality functions when creating a module summary does not correctly handle the case where a function's personality function operand refers to the function indirectly (e.g. via a bitcast). This patch handles such cases by treating personality function references like any other reference, i.e. by adding them to the function's reference list. This has the minor side benefit of allowing personality functions to participate in early dead stripping. We do this by calling findRefEdges on the function itself. This way we also end up handling other function operands (specifically prefix data and prologue data) for free. Differential Revision: https://reviews.llvm.org/D37553 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312698 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-07 05:35:35 +00:00
Matt Arsenault	e0de89287c	InstSimplify: canonicalize is idempotent git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312685 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-07 01:21:43 +00:00
Nuno Lopes	e429f678d6	Fix PR33878: BasicAA incorrectly assumes different address spaces don't alias Remove code that assumed that a nullptr of address space != 0 couldn't alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0. This code was written before address spaces were introduced. Differential Revision: https://reviews.llvm.org/D37518 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312648 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-06 16:55:31 +00:00
Sanjay Patel	04894a4949	[ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312591 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 23:13:13 +00:00
Daniel Neilson	f7dd8e2ac0	[SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values Summary: When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert. Such a PHISCEV is possible if either the start value or the accumulator value is a constant value that is not equal to its truncated value, and if the truncated value is zero. This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator value are constants then we check whether the P2 and/or P3 predicates are known false at compile time; if either is, then we bail out of constructing the AddRec. Reviewers: sanjoy, mkazantsev, silviu.baranga Reviewed By: mkazantsev Subscribers: mkazantsev, llvm-commits Differential Revision: https://reviews.llvm.org/D37265 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312568 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 19:54:03 +00:00
Eugene Zelenko	cecd8f18e2	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312383 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 21:37:29 +00:00
Craig Topper	85fcd3487c	[InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask. This allows some code to be removed from InstSimplify that was doing this functionality. This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out. There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary. Differential Revision: https://reviews.llvm.org/D37158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312382 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 21:27:34 +00:00
Peter Collingbourne	043998b329	ModuleSummaryAnalysis: Correctly handle refs from function inline asm to module inline asm. If a function contains inline asm and the module-level inline asm contains the definition of a local symbol, prevent the function from being imported in case the function-level inline asm refers to a symbol in the module-level inline asm. Differential Revision: https://reviews.llvm.org/D37370 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312332 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 16:24:02 +00:00
Alexandre Isoard	3b88873b05	[SCEV] Add URem support to SCEV In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %s ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %t to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcoming of SCEVs: %div = udiv i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312329 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 14:59:59 +00:00
Eugene Zelenko	046ca04445	[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312289 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-31 21:56:16 +00:00
Adam Nemet	0827f9ac83	Remove an unnecessary const_cast. I think that this is dating back to when emit used to take a const reference. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311948 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-28 23:00:13 +00:00
Don Hinton	1adb5a9cb5	[Dominators] Remove redundant explicit template instantiation. Summary: Remove redundant explicit template instantiation. This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase. Reviewers: kuhar, andrewrk, davide, hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37185 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311835 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-26 21:08:51 +00:00
Hiroshi Yamauchi	1020c414d8	Add options to dump block frequency/branch probability info in text. Summary: Add options -print-bfi/-print-bpi that dump block frequency and branch probability info like -view-block-freq-propagation-dags and -view-machine-block-freq-propagation-dags do but in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37165 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311822 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-26 00:31:00 +00:00
Haicheng Wu	33be26f893	[InlineCost] Small changes to early exit condition. NFC. Change the early exit condition from Cost > Threshold to Cost >= Threshold because the inline condition is Cost < Threshold. Differential Revision: https://reviews.llvm.org/D37087 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311791 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-25 19:00:33 +00:00
Michael Kruse	f29303de23	Normalize to LF line endings. Commit r297442 introduced mixed CRLF/LF line endings to two files. Normalize to LF-only line endings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311774 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-25 12:38:53 +00:00
Dehao Chen	d38687abb5	Move accurate-sample-profile into the function attribute. Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO. Reviewers: davidxl, rsmith Reviewed By: davidxl Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37113 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311706 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-24 21:37:04 +00:00
Tobias Grosser	2050a0312d	Model cache size and associativity in TargetTransformInfo Summary: We add the precise cache sizes and associativity for the following Intel architectures: - Penryn - Nehalem - Westmere - Sandy Bridge - Ivy Bridge - Haswell - Broadwell - Skylake - Kabylake For several months, Polly has used a performance model for BLAS computations that derives optimal cache and register tile sizes from cache and latency information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS", by Tze Meng Low published at TOMS 2016). While bootstrapping this model, these target values have been kept in Polly. However, as our implementation is now rather mature, it seems time to teach LLVM itself about cache sizes. Interestingly, L1 and L2 cache sizes are pretty constant across micro-architectures, hence a set of architecture specific default values seems like a good start. They can be expanded to more target specific values, in case certain newer architectures require different values. For now a set of Intel architectures are provided. Just as a little teaser, for a simple gemm kernel this model allows us to improve performance from 1.2s to 0.27s. For gemm kernels with less optimal memory layouts even larger speedups can be reported. Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb Reviewed By: fhahn, asb Subscribers: lsaba, asb, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D37051 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311647 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-24 09:46:25 +00:00
Rong Xu	7996242b16	[PGO] Set edge weights for indirectbr instruction with profile counts Current PGO only annotates the edge weight for branch and switch instructions with profile counts. We should also annotate the indirectbr instruction as all the information is there. This patch enables the annotating for indirectbr instructions. Also uses this annotation in branch probability analysis. Differential Revision: https://reviews.llvm.org/D37074 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311604 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-23 21:36:02 +00:00
George Rimar	95a4133b77	[lib/Analysis] - Mark personality functions as live. This is PR33245. The case I am fixing is as follows: imagine we have 2 BC files; one defines and uses a personality routine, the second has only a declaration and also uses it. Previously, the algorithm computing dead symbols (llvm::computeDeadSymbols) did not know about personality routines and left them dead even if the function that has the routine was live. As a result, the thinLTOInternalizeAndPromoteGUID() method changed the binding for such a symbol to local. Later, when LLD tried to link these objects, it failed because one object had an undefined global symbol for the routine and the second object contained a local definition instead of a global one. The patch sets the live root flag on the corresponding FunctionSummary for personality routines when we build the per-module summaries during the compile step. Differential revision: https://reviews.llvm.org/D36834 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311432 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-22 08:50:56 +00:00
Craig Topper	16e7603633	[ValueTracking] Add assertions that the starting Depth in isKnownToBeAPowerOfTwo and ComputeNumSignBitsImpl is not above MaxDepth The function does an equality check later to terminate the recursion, but that won't work if it starts out too high. A similar assert already exists in computeKnownBits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311400 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-21 22:56:12 +00:00
Haicheng Wu	25ef265dc9	[InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes. Currently, the inline cost model will bail once the inline cost exceeds the inline threshold in order to avoid unnecessary compile-time. However, when debugging it is useful to compute the full cost, so this command line option is added to override the default behavior. I took over this work from Chad Rosier (mcrosier@codeaurora.org). Differential Revision: https://reviews.llvm.org/D35850 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311371 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-21 20:00:09 +00:00
Chad Rosier	14edb7eb1a	[InlineCost] Add more debug during inline cost computation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311370 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-21 19:56:46 +00:00
Eugene Zelenko	89688ce180	[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311212 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-18 23:51:26 +00:00
Amjad Aboud	066b24cb94	[InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction. Differential Revision: https://reviews.llvm.org/D36679 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311206 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-18 22:56:55 +00:00
Eugene Zelenko	93bb413a33	[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311048 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-16 22:07:40 +00:00
Sanjay Patel	77622085e7	[DemandedBits] simplify call; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311009 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-16 14:28:23 +00:00


7596 Commits