186 Commits

Author SHA1 Message Date
Craig Topper
c1b79b3c59 [ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics.
This adds support for scalarizing these intrinsics as well the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle.

I've omitted handling special cases for constant masks for this first pass. Though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway.

Fixes PR40994 and is covers most of PR3666. Might want to implement constant masks to close that.

Differential Revision: https://reviews.llvm.org/D59180

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356687 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-21 17:38:52 +00:00
Sjoerd Meijer
240db0070c [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.
This is addressing the issue that we're not modeling the cost of clib functions
in TTI::getIntrinsicCosts and thus we're basically addressing this fixme:
    
// FIXME: This is wrong for libc intrinsics.

To enable analysis of clib functions, we not only need an intrinsic ID and
formal arguments, but also the actual user of that function so that we can e.g.
look at alignment and values of arguments. So, this is the initial plumbing to
pass the user of an intrinsinsic on to getCallCosts, which queries
getIntrinsicCosts.

Differential Revision: https://reviews.llvm.org/D59014


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355901 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 09:48:02 +00:00
Sam Parker
d0c143de76 [LSR] Generate cross iteration indexes
Modify GenerateConstantOffsetsImpl to create offsets that can be used
by indexed addressing modes. If formulae can be generated which
result in the constant offset being the same size as the recurrence,
we can generate a pre-indexed access. This allows the pointer to be
updated via the single pre-indexed access so that (hopefully) no
add/subs are required to update it for the next iteration. For small
cores, this can significantly improve performance DSP-like loops.

Differential Revision: https://reviews.llvm.org/D55373



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353403 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-07 13:32:54 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Tom Stellard
356397dc91 Only promote args when function attributes are compatible
Summary:
Check to make sure that the caller and the callee have compatible
function arguments before promoting arguments.  This uses the same
TargetTransformInfo queries that are used to determine if attributes
are compatible for inlining.

The goal here is to avoid breaking ABI when a called function's ABI
depends on a target feature that is not enabled in the caller.

This is a very conservative fix for PR37358.  Ideally we would have a more
sophisticated check for ABI compatiblity rather than checking if the
attributes are compatible for inlining.

Reviewers: echristo, chandlerc, eli.friedman, craig.topper

Reviewed By: echristo, chandlerc

Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits

Differential Revision: https://reviews.llvm.org/D53554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351296 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-16 05:15:31 +00:00
Simon Pilgrim
48de5cc373 [TTI] getOperandInfo - a broadcast shuffle means the result is OK_UniformValue
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346868 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-14 15:04:08 +00:00
Simon Pilgrim
20a1e3c6ac [TTI] Make TargetTransformInfo::getOperandInfo static. NFCI.
It has no member dependencies and this makes it easier to reuse in other cost analysis code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346755 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-13 13:45:10 +00:00
Simon Pilgrim
ec09d9119e [TTI] Flip vector types in getShuffleCost SK_ExtractSubvector call
For SK_ExtractSubvector, the default 'Ty' type is the source operand type and 'SubTy' is the destination subvector type

I got this the wrong way around when I added rL346510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346534 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-09 18:30:59 +00:00
Simon Pilgrim
3ccdf221e9 [CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368)
Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346510 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-09 16:28:19 +00:00
Dorit Nuzman
06bac6c858 [LV] Support vectorization of interleave-groups that require an epilog under
optsize using masked wide loads 

Under Opt for Size, the vectorizer does not vectorize interleave-groups that
have gaps at the end of the group (such as a loop that reads only the even
elements: a[2*i]) because that implies that we'll require a scalar epilogue
(which is not allowed under Opt for Size). This patch extends the support for
masked-interleave-groups (introduced by D53011 for conditional accesses) to
also cover the case of gaps in a group of loads; Targets that enable the
masked-interleave-group feature don't have to invalidate interleave-groups of
loads with gaps; they could now use masked wide-loads and shuffles (if that's
what the cost model selects).

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53668



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345705 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 09:57:56 +00:00
Dorit Nuzman
7d7250490b recommit 344472 after fixing build failure on ARM and PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344475 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-14 08:50:06 +00:00
Dorit Nuzman
473da03560 revert 344472 due to failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344473 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-14 07:21:20 +00:00
Dorit Nuzman
a3ff03e8e2 [IAI,LV] Add support for vectorizing predicated strided accesses using masked
interleave-group

The vectorizer currently does not attempt to create interleave-groups that
contain predicated loads/stores; predicated strided accesses can currently be
vectorized only using masked gather/scatter or scalarization. This patch makes
predicated loads/stores candidates for forming interleave-groups during the
Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-
groups to the Loop-Vectorizer's planning and transformation stages. The patch
also extends the TTI API to allow querying the cost of masked interleave groups
(which each target can control); Targets that support masked vector loads/
stores may choose to enable this feature and allow vectorizing predicated
strided loads/stores using masked wide loads/stores and shuffles.

Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53011



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344472 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-14 07:06:16 +00:00
Jonas Paulsson
9a16b611a7 [LoopVectorizer] Use TTI.getOperandInfo()
Call getOperandInfo() instead of using (near) duplicated code in
LoopVectorizationCostModel::getInstructionCost().

This gets the OperandValueKind and OperandValueProperties values for a Value
passed as operand to an arithmetic instruction.

getOperandInfo() used to be a static method in TargetTransformInfo.cpp, but
is now instead a public member.

Review: Florian Hahn
https://reviews.llvm.org/D52883

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343852 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-05 14:34:04 +00:00
Fangrui Song
af7b1832a0 Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338293 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 19:41:25 +00:00
Simon Pilgrim
7a7cfd8a89 [TargetTransformInfo] Add pow2 analysis for scalar constants
Add ConstantInt analysis to getOperandInfo so we get more realistic div/rem expansion costs comparable to the vector costs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336827 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 17:51:27 +00:00
Sanjay Patel
42f462a392 [IR] move shuffle mask queries from TTI to ShuffleVectorInst
The optimizer is getting smarter (eg, D47986) about differentiating shuffles 
based on its mask values, so we should make queries on the mask constant 
operand generally available to avoid code duplication.

We'll probably use this soon in the vectorizers and instcombine (D48023 and 
https://bugs.llvm.org/show_bug.cgi?id=37806).

We might clean up TTI a bit more once all of its current 'SK_*' options are 
covered.

Differential Revision: https://reviews.llvm.org/D48236


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335067 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-19 18:44:00 +00:00
Benjamin Kramer
c011f6948e Fix namespaces. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334890 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-16 13:37:52 +00:00
Simon Pilgrim
419887cd06 [CostModel] Cleanup isSingleSourceVectorMask to match other shuffle matchers. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334699 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-14 09:48:19 +00:00
Simon Pilgrim
d9dafe02fb [CostModel] Recognise REVERSE shuffle mask if the elements come from the second src
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334698 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-14 09:35:00 +00:00
Simon Pilgrim
31dfcf10a6 [CostModel] Recognise BROADCAST shuffle mask if the elements come from the second src
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334620 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-13 16:52:02 +00:00
Simon Pilgrim
21582f2af6 [CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744)
As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate which requires shuffle masks to only match an alternating pattern from its 2 sources:

e.g. v4f32: <0,5,2,7> or <4,1,6,3>

This seems far too restrictive as most SIMD hardware which will implement it using a general blend/bit-select instruction, so replaces it with SK_Select, permitting elements from either source as long as they are inline:

e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc.

This initial patch just updates the name and cost model shuffle mask analysis, later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns.

Differential Revision: https://reviews.llvm.org/D47985

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334513 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-12 16:12:29 +00:00
Simon Pilgrim
2ccbb4c82e Fix signed/unsigned warning. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334509 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-12 15:14:34 +00:00
Simon Pilgrim
861e3f325f [CostModel] Treat Identity shuffle masks as zero cost
As discussed on D47985, identity shuffle masks should probably be free.

I've limited this to the case where the input and output types all match - but we could probably accept all cases.

Differential Revision: https://reviews.llvm.org/D47986

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334506 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-12 14:47:13 +00:00
Simon Pilgrim
91eac1017f [TTI] Add uniform/non-uniform constant Pow2 detection to TargetTransformInfo::getInstructionThroughput
This enables us to detect more fast path sdiv cases under cost analysis.

This patch also enables us to handle non-uniform-constant pow2 cases for X86 SDIV costs.

Found while working on D46276

Future patches can then extend the vectorizers to more fully support non-uniform pow2 cases.

Differential Revision: https://reviews.llvm.org/D46637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332969 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-22 10:40:09 +00:00
Adrian Prantl
26b584c691 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-01 15:54:18 +00:00
Matthew Simpson
daa39fa144 [TTI, AArch64] Add transpose shuffle kind
This patch adds a new shuffle kind useful for transposing a 2xn matrix. These
transpose shuffle masks read corresponding even- or odd-numbered vector
elements from two n-dimensional source vectors and write each result into
consecutive elements of an n-dimensional destination vector. The transpose
shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such,
this patch also considers transpose shuffles in the AArch64 implementation of
getShuffleCost.

Differential Revision: https://reviews.llvm.org/D45982

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330941 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-26 13:48:33 +00:00
Krzysztof Parzyszek
c6388fd9c7 [LV] Introduce TTI::getMinimumVF
The function getMinimumVF(ElemWidth) will return the minimum VF for
a vector with elements of size ElemWidth bits. This value will only
apply to targets for which TTI::shouldMaximizeVectorBandwidth returns
true. The value of 0 indicates that there is no minimum VF.

Differential Revision: https://reviews.llvm.org/D45271


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330062 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-13 20:16:32 +00:00
David Blaikie
ebb5c41145 Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency
Thanks to echristo for the pointers on direction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328737 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-28 22:28:50 +00:00
Krzysztof Parzyszek
4153b00c21 [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target
The default implementation returns false and keeps the current behavior.

Differential Revision: https://reviews.llvm.org/D44735


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328632 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-27 16:14:11 +00:00
Krzysztof Parzyszek
48972e6c8f [LSR] Allow giving priority to post-incrementing addressing modes
Implement TTI interface for targets to indicate that the LSR should give
priority to post-incrementing addressing modes.

Combination of patches by Sebastian Pop and Brendon Cahoon.

Differential Revision: https://reviews.llvm.org/D44758


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328490 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-26 13:10:09 +00:00
Sanjay Patel
74007202a3 [LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681)
In the motivating case from PR35681 and represented by the macro-fuse-cmp test:
https://bugs.llvm.org/show_bug.cgi?id=35681
...there's a 37 -> 31 byte size win for the loop because we eliminate the big base 
address offsets.

SPEC2017 on Ryzen shows no significant perf difference.

Differential Revision: https://reviews.llvm.org/D42607



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324289 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-05 23:43:05 +00:00
Zaara Syeda
12b04c98f7 Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This recommits r322721 reverted due to sanitizer memory leak build bot failures.

Original commit message:
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323778 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-30 16:17:22 +00:00
Zaara Syeda
bbb2a7afcc Revert [PowerPC] This reverts commit rL322721
Failing build bots. Revert the commit now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322748 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17 20:00:15 +00:00
Zaara Syeda
4379bc4ba1 [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322721 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17 18:22:55 +00:00
Guozhi Wei
de740eaa76 Revert r321377, it causes regression to https://reviews.llvm.org/P8055.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321528 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28 17:02:34 +00:00
Guozhi Wei
7f53692f1d [SimplifyCFG] Don't do if-conversion if there is a long dependence chain
If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.

This patch checks for the long dependence chain, and give up if-conversion if find one.

Differential Revision: https://reviews.llvm.org/D39352



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321377 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 18:54:04 +00:00
Sean Fertile
b71c6a9b39 [Memcpy Loop Lowering] Remove the fixed int8 lowering.
Switch over to the lowering that uses target supplied operand types.

Differential Revision: https://reviews.llvm.org/D41201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320989 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 15:31:14 +00:00
Sanjay Patel
8a189ea7bf [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result
This should fix PR31455:
https://bugs.llvm.org/show_bug.cgi?id=31455

Differential Revision: https://reviews.llvm.org/D28314


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319094 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-27 21:15:43 +00:00
Clement Courbet
4ccf677f27 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For examples, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:19:33 +00:00
Artem Belevich
c79e8ba6d6 [NVPTX] allow address space inference for volatile loads/stores.
If particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.

Differential Revision: https://reviews.llvm.org/D39026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316495 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 20:31:44 +00:00
Daniel Jasper
e9712285e9 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315680 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 14:04:21 +00:00
Jun Bum Lim
e3f6227d56 Recommit : Use the basic cost if a GEP is not used as addressing mode
Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if an user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314923 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 18:33:52 +00:00
Alex Shlyapnikov
682384e698 Revert "Use the basic cost if a GEP is not used as addressing mode"
This reverts commit r314517.

This commit crashes sanitizer bots, for example:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167

Stack snippet:
...
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0
...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314560 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 22:04:45 +00:00
Jun Bum Lim
fd8d5dac84 Use the basic cost if a GEP is not used as addressing mode
Summary:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target.
However, since it doesn't check its actual users, it will return FREE even in cases
where the GEP cannot be folded away as a part of actual addressing mode.
For example, if an user of the GEP is a call instruction taking the GEP as a parameter,
then the GEP may not be folded in isel.

Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng

Reviewed By: hfinkel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314517 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 14:50:16 +00:00
Clement Courbet
bb3c660e87 [CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion.
Summary:
Right now there are two functions with the same name, one does the work
and the other one returns true if expansion is needed. Rename
TargetTransformInfo::expandMemCmp to make it more consistent with other
members of TargetTransformInfo.

Remove the unused Instruction* parameter.

Differential Revision: https://reviews.llvm.org/D38165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314096 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-25 06:35:16 +00:00
Sanjay Patel
193e898f75 [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 13:38:18 +00:00
Guozhi Wei
19969b8b8f [TargetTransformInfo] Add a new public interface getInstructionCost
Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface:

  enum TargetCostKind {
    TCK_RecipThroughput, ///< Reciprocal throughput.
    TCK_Latency,         ///< The latency of instruction.
    TCK_CodeSize         ///< Instruction code size.
  };

  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;

All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model.

This patch also provides a simple default implementation of getInstructionLatency.

The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways:

   Add more detail into this function.
   Add getXXXLatency function and call it from here.
   Implement target specific getInstructionLatency function.

Differential Revision: https://reviews.llvm.org/D37170



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 22:29:17 +00:00
Alexey Bataev
4fcc7e8528 [SLP] Support for horizontal min/max reduction.
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.

Differential revision: https://reviews.llvm.org/D27846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 13:49:36 +00:00
Tobias Grosser
2050a0312d Model cache size and associativity in TargetTransformInfo
Summary:
We add the precise cache sizes and associativity for the following Intel
architectures:

  - Penry
  - Nehalem
  - Westmere
  - Sandy Bridge
  - Ivy Bridge
  - Haswell
  - Broadwell
  - Skylake
  - Kabylake

Polly uses since several months a performance model for BLAS computations that
derives optimal cache and register tile sizes from cache and latency
information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS", by Tze Meng Low published at TOMS 2016).
While bootstrapping this model, these target values have been kept in Polly.
However, as our implementation is now rather mature, it seems time to teach
LLVM itself about cache sizes.

Interestingly, L1 and L2 cache sizes are pretty constant across
micro-architectures, hence a set of architecture specific default values
seems like a good start. They can be expanded to more target specific values,
in case certain newer architectures require different values. For now a set
of Intel architectures are provided.

Just as a little teaser, for a simple gemm kernel this model allows us to
improve performance from 1.2s to 0.27s. For gemm kernels with less optimal
memory layouts even larger speedups can be reported.

Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb

Reviewed By: fhahn, asb

Subscribers: lsaba, asb, pollydev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37051

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311647 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:46:25 +00:00