Commit Graph

150 Commits

Author SHA1 Message Date
Guillaume Chatelet
4b8e3f9a30 [LLVM][NFC] remove unused fields
Summary:
Here is the commit introducing the fields
https://github.com/llvm/llvm-project/commit/cf6749e4c091

It dates back from 2006 and was used by AArch64 backend.
There is no more reference to these fields in the whole codebase so I think it's fine.

Reviewers: courbet

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66683

llvm-svn: 369810
2019-08-23 20:49:06 +00:00
Simon Pilgrim
5d5f9fbb0b Fix typo in comment. NFCI.
llvm-svn: 369419
2019-08-20 17:54:37 +00:00
Matt Arsenault
d04e499a2d InferAddressSpaces: Move target intrinsic handling to TTI
I'm planning on handling intrinsics that will benefit from checking
the address space enums. Don't bother moving the address collection
for now, since those won't need th enums.

llvm-svn: 368895
2019-08-14 18:13:00 +00:00
Daniil Fukalov
63e1e9076b [AMDGPU] Tune inlining parameters for AMDGPU target
Summary:
Since the target has no significant advantage of vectorization,
vector instructions bous threshold bonus should be optional.

amdgpu-inline-arg-alloca-cost parameter default value and the target
InliningThresholdMultiplier value tuned then respectively.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64642

llvm-svn: 366348
2019-07-17 16:51:29 +00:00
Rui Ueyama
1179e18f0c Fix parameter name comments using clang-tidy. NFC.
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
    -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
    ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}

llvm-svn: 366177
2019-07-16 04:46:31 +00:00
David Greene
1a0528f138 Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces"
This broke some PPC prefetching tests.

This reverts commit 9fdfb045ae8bb643ab0d0455dcf9ecaea3b1eb3c.

llvm-svn: 365680
2019-07-10 18:25:58 +00:00
David Greene
0c46c9cc2f [System Model] [TTI] Update cache and prefetch TTI interfaces
Rework the TTI cache and software prefetching APIs to prepare for the
introduction of a general system model.  Changes include:

- Marking existing interfaces const and/or override as appropriate
- Adding comments
- Adding BasicTTIImpl interfaces that delegate to a subtarget
  implementation
- Adding a default "no information" subtarget implementation

Only a handful of targets use these interfaces currently: AArch64,
Hexagon, PPC and SystemZ.  AArch64 already has a custom subtarget
implementation, so its custom TTI implementation is migrated to use
the new facilities in BasicTTIImpl to invoke its custom subtarget
implementation.  The custom TTI implementations continue to exist for
the other targets with this change.  They are not moved over to
subtarget-based implementations.

The end goal is to have the default subtarget implementation defer to
the system model defined by the target.  With this change, the default
subtarget implementation essentially returns "no information" for
these interfaces.  None of the existing users of TTI will hit that
implementation because they define their own custom TTI
implementations and won't use the BasicTTIImpl implementations.

Once system models are in place for the targets that use these
interfaces, their custom TTI implementations can be removed.

Differential Revision: https://reviews.llvm.org/D63614

llvm-svn: 365676
2019-07-10 18:07:01 +00:00
Chen Zheng
5a9de4ff50 [NFC] move some hardware loop checking code to a common place for other using.
Differential Revision: https://reviews.llvm.org/D63478

llvm-svn: 363758
2019-06-19 01:26:31 +00:00
Simon Pilgrim
b087bfb569 [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123)
As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space.

This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them.

If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores.

Differential Revision: https://reviews.llvm.org/D63075

llvm-svn: 363179
2019-06-12 17:14:03 +00:00
Sander de Smalen
e1e23467af Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:

  llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
  llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul

and adds functionality to auto-upgrade existing LLVM IR and bitcode.

Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D60261

llvm-svn: 363035
2019-06-11 08:22:10 +00:00
Sam Parker
7df94e8bbc [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Matt Arsenault
01807baf69 TTI: Improve default costs for addrspacecast
For some reason multiple places need to do this, and the variant the
loop unroller and inliner use was not handling it.

Also, introduce a new wrapper to be slightly more precise, since on
AMDGPU some addrspacecasts are free, but not no-ops.

llvm-svn: 362436
2019-06-03 18:41:34 +00:00
Sjoerd Meijer
c1c71ee45e [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.
This is addressing the issue that we're not modeling the cost of clib functions
in TTI::getIntrinsicCosts and thus we're basically addressing this fixme:
    
// FIXME: This is wrong for libc intrinsics.

To enable analysis of clib functions, we not only need an intrinsic ID and
formal arguments, but also the actual user of that function so that we can e.g.
look at alignment and values of arguments. So, this is the initial plumbing to
pass the user of an intrinsinsic on to getCallCosts, which queries
getIntrinsicCosts.

Differential Revision: https://reviews.llvm.org/D59014

llvm-svn: 355901
2019-03-12 09:48:02 +00:00
Simon Pilgrim
d98dec6bb7 [TTI] Add generic cost model for smul/umul overflow intrinsics
Based off smul/umul fixed costs and the implementation in TargetLowering::expandMULO.

llvm-svn: 354784
2019-02-25 13:30:23 +00:00
Simon Pilgrim
9c3dd6002b [TTI] Add generic cost model for fixed point smul/umul
Based on an IR equivalent of target lowering's generic expansion - target specific costs will typically be lower (IR doesn't have a good mull/mulh equivalent) but we need a baseline.

Differential Revision: https://reviews.llvm.org/D57925

llvm-svn: 354774
2019-02-25 11:59:23 +00:00
Simon Pilgrim
0187480789 [TTI] Add generic SADDSAT/SSUBSAT costs
Add generic costs calculation for SADDSAT/SSUBSAT intrinsics, this uses generic costs for sadd_with_overflow/ssub_with_overflow, an extra sign comparison + a selects based on the sign/overflow.

This completes PR40316

Differential Revision: https://reviews.llvm.org/D57239

llvm-svn: 352315
2019-01-27 13:51:59 +00:00
Simon Pilgrim
505985ee45 Fix line endings and trim trailing whitespace. NFCI.
llvm-svn: 352198
2019-01-25 14:29:57 +00:00
Simon Pilgrim
f2c27f86d8 [TTI] Add generic SADDO/SSUBO costs
Added x86 scalar sadd_with_overflow/ssub_with_overflow costs.

llvm-svn: 352045
2019-01-24 13:36:45 +00:00
Simon Pilgrim
a9632ac602 [TTI] Add generic UADDSAT/USUBSAT costs
Add generic costs calculation for UADDSAT/USUBSAT intrinsics, this fallbacks to using generic costs for uadd_with_overflow/usub_with_overflow + a select.

Differential Revision: https://reviews.llvm.org/D56907

llvm-svn: 352044
2019-01-24 12:27:10 +00:00
Simon Pilgrim
c836f6af6f [TTI] Add generic UADDO/USUBO costs
Added x86 scalar uadd_with_overflow/usub_with_overflow costs.

Differential Revision: https://reviews.llvm.org/D56907

llvm-svn: 352043
2019-01-24 12:10:20 +00:00
Chandler Carruth
ae65e281f3 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Simon Pilgrim
4b6e08d50e [TTI] Use ConcreteTTI cast in getIntrinsicInstrCost Type variant. NFCI.
Same as we do in the Value variant.

llvm-svn: 351554
2019-01-18 14:48:36 +00:00
Craig Topper
fc983d8fa5 [CostModel][X86] Don't count 2 shuffles on the last level of a pairwise arithmetic or min/max reduction
This is split from D55452 with the correct patch this time.

Pairwise reductions require two shuffles on every level but the last. On the last level the two shuffles are <1, u, u, u...> and <0, u, u, u...>, but <0, u, u, u...> will be dropped by InstCombine/DAGCombine as being an identity shuffle.

Differential Revision: https://reviews.llvm.org/D55615

llvm-svn: 349072
2018-12-13 19:08:10 +00:00
Craig Topper
47ccb7c7e7 [CostModel][X86][AArch64] Adjust cost of the scalarization part of min/max reduction.
Summary: The comment says we need 3 extracts and a select at the end. But didn't we just account for the select in the vector cost above. Aren't we just extracting the single element after taking the min/max in the vector register?

Reviewers: RKSimon, spatel, ABataev

Reviewed By: RKSimon

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D55480

llvm-svn: 348739
2018-12-10 06:58:58 +00:00
Craig Topper
456baa52f2 [CostModel][X86] Fix overcounting arithmetic cost in illegal types in getArithmeticReductionCost/getMinMaxReductionCost
We were overcounting the number of arithmetic operations needed at each level before we reach a legal type. We were using the full vector type for that level, but we are going to split the input vector at that level in half. So the effective arithmetic operation cost at that level is half the width.

So for example on 8i32 on an sse target. Were were calculating the cost of an 8i32 op which is likely 2 for basic integer. Then after the loop we count 2 more v4i32 ops. For a total arith cost of 4. But if you look at the assembly there would only be 3 arithmetic ops.

There are still more bugs in this code that I'm going to work on next. The non pairwise code shouldn't count extract subvectors in the loop. There are no extracts, the types are split in registers. For pairwise we need to use 2 two src permute shuffles.

Differential Revision: https://reviews.llvm.org/D55397

llvm-svn: 348621
2018-12-07 18:20:56 +00:00
Simon Pilgrim
a5a555e69a [TTI] Reduction costs only need to include a single extract element cost (REAPPLIED)
We were adding the entire scalarization extraction cost for reductions, which returns the total cost of extracting every element of a vector type.

For reductions we don't need to do this - we just need to extract the 0'th element after the reduction pattern has completed.

Fixes PR37731

Rebased and reapplied after being reverted in rL347541 due to PR39774 - which was fixed by D54955/rL347759 and D55017/rL347997

Differential Revision: https://reviews.llvm.org/D54585

llvm-svn: 348076
2018-12-01 14:18:31 +00:00
Fedor Sergeev
79ff5dcabd Revert "[TTI] Reduction costs only need to include a single extract element cost"
This reverts commit r346970.
It was causing PR39774, a crash in slp-vectorizer on a rather simple loop
with just a bunch of 'and's in the body.

llvm-svn: 347541
2018-11-26 10:17:27 +00:00
Simon Pilgrim
a5b4b79bb8 [TTI] Reduction costs only need to include a single extract element cost
We were adding the entire scalarization extraction cost for reductions, which returns the total cost of extracting every element of a vector type.

For reductions we don't need to do this - we just need to extract the 0'th element after the reduction pattern has completed.

Fixes PR37731

Differential Revision: https://reviews.llvm.org/D54585

llvm-svn: 346970
2018-11-15 17:42:53 +00:00
Simon Pilgrim
b5cdf54d41 [TTI] Pull out repeated 'ConcreteTTI' static_casts. NFCI.
llvm-svn: 346859
2018-11-14 13:23:28 +00:00
Simon Pilgrim
7f3f272966 [CostModel] Add generic expansion funnel shift cost support
Add support for the expansion of funnelshift/rotates to getIntrinsicInstrCost.

This also required us to move the X86 fshl/fshr costs to the same place as the rotates to avoid expansion and get correct scalarization vs vectorization costs.

llvm-svn: 346854
2018-11-14 12:24:50 +00:00
Simon Pilgrim
71b92e1ae4 [CostModel] Add more realistic SK_InsertSubvector generic costs.
Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles.

llvm-svn: 346662
2018-11-12 15:20:24 +00:00
Simon Pilgrim
f1d79956c8 Fix unused variable warning. NFCI.
llvm-svn: 346657
2018-11-12 14:48:39 +00:00
Simon Pilgrim
fa09727fea [CostModel] Add more realistic SK_ExtractSubvector generic costs.
Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles.

This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type.

llvm-svn: 346656
2018-11-12 14:25:23 +00:00
Dorit Nuzman
a2771a93ac [LV] Support vectorization of interleave-groups that require an epilog under
optsize using masked wide loads 

Under Opt for Size, the vectorizer does not vectorize interleave-groups that
have gaps at the end of the group (such as a loop that reads only the even
elements: a[2*i]) because that implies that we'll require a scalar epilogue
(which is not allowed under Opt for Size). This patch extends the support for
masked-interleave-groups (introduced by D53011 for conditional accesses) to
also cover the case of gaps in a group of loads; Targets that enable the
masked-interleave-group feature don't have to invalidate interleave-groups of
loads with gaps; they could now use masked wide-loads and shuffles (if that's
what the cost model selects).

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53668

llvm-svn: 345705
2018-10-31 09:57:56 +00:00
Simon Pilgrim
656ffe8110 [TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)
Correct costings of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.

Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!

I've done my best to fix a number of vectorizer uses:

SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.

LV - I'm not clear on what the SK_ExtractSubvector should represents for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.

Differential Revision: https://reviews.llvm.org/D53573

llvm-svn: 345617
2018-10-30 18:10:02 +00:00
Simon Pilgrim
afbfdd6bd6 [TTI] Add generic SK_Broadcast shuffle costs
I noticed while fixing PR39368 that we don't have generic shuffle costs for broadcast style shuffles.

This patch adds SK_BROADCAST handling, but exposes ARM/AARCH64 lack of handling of this type, which I've added a fix for at the same time.

Differential Revision: https://reviews.llvm.org/D53570

llvm-svn: 345253
2018-10-25 10:52:36 +00:00
Thomas Lively
e649dc137e [NFC] Rename minnan and maxnan to minimum and maximum
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.

Reviewers: arsenm, aheejin, dschuff, javed.absar

Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53112

llvm-svn: 345218
2018-10-24 22:49:55 +00:00
Simon Pilgrim
df74e03322 [TTI] Add generic cost handling of SK_Reverse shuffles
These can be treated as a general permute.

This required a fix for missing reverse patterns on ARM

llvm-svn: 345015
2018-10-23 09:42:10 +00:00
Dorit Nuzman
a3df726c55 recommit 344472 after fixing build failure on ARM and PPC.
llvm-svn: 344475
2018-10-14 08:50:06 +00:00
Dorit Nuzman
70052d5053 revert 344472 due to failures.
llvm-svn: 344473
2018-10-14 07:21:20 +00:00
Dorit Nuzman
c4c9199631 [IAI,LV] Add support for vectorizing predicated strided accesses using masked
interleave-group

The vectorizer currently does not attempt to create interleave-groups that
contain predicated loads/stores; predicated strided accesses can currently be
vectorized only using masked gather/scatter or scalarization. This patch makes
predicated loads/stores candidates for forming interleave-groups during the
Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-
groups to the Loop-Vectorizer's planning and transformation stages. The patch
also extends the TTI API to allow querying the cost of masked interleave groups
(which each target can control); Targets that support masked vector loads/
stores may choose to enable this feature and allow vectorizing predicated
strided loads/stores using masked wide loads/stores and shuffles.

Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53011

llvm-svn: 344472
2018-10-14 07:06:16 +00:00
Sam Clegg
b517fd6012 [SLPVectorizer] Check that lowered type is floating point before calling isFabsFree
In the case of soft-fp (e.g. fp128 under wasm) the result of
getTypeLegalizationCost() can be an integer type even if the input is
floating point (See LegalizeTypeAction::TypeSoftenFloat).

Before calling isFabsFree() (which asserts if given a non-fp
type) we need to check that that result is fp.  This is safe since in
fabs is certainly not free in the soft-fp case.

Fixes PR39168

Differential Revision: https://reviews.llvm.org/D52899

llvm-svn: 344069
2018-10-09 18:41:17 +00:00
Matt Arsenault
2da3e37d8b Fix vectorization of canonicalize
llvm-svn: 342390
2018-09-17 13:24:30 +00:00
Fangrui Song
c76988a0c0 [CodeGen] Fix inconsistent declaration parameter name
llvm-svn: 337200
2018-07-16 18:51:40 +00:00
Simon Pilgrim
f6cb95e1e4 [CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744)
As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate which requires shuffle masks to only match an alternating pattern from its 2 sources:

e.g. v4f32: <0,5,2,7> or <4,1,6,3>

This seems far too restrictive as most SIMD hardware which will implement it using a general blend/bit-select instruction, so replaces it with SK_Select, permitting elements from either source as long as they are inline:

e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc.

This initial patch just updates the name and cost model shuffle mask analysis, later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns.

Differential Revision: https://reviews.llvm.org/D47985

llvm-svn: 334513
2018-06-12 16:12:29 +00:00
Adrian Prantl
076a6683eb Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Matthew Simpson
0ecc1283b2 [TTI, AArch64] Add transpose shuffle kind
This patch adds a new shuffle kind useful for transposing a 2xn matrix. These
transpose shuffle masks read corresponding even- or odd-numbered vector
elements from two n-dimensional source vectors and write each result into
consecutive elements of an n-dimensional destination vector. The transpose
shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such,
this patch also considers transpose shuffles in the AArch64 implementation of
getShuffleCost.

Differential Revision: https://reviews.llvm.org/D45982

llvm-svn: 330941
2018-04-26 13:48:33 +00:00
Craig Topper
0681be8ccc [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
David Blaikie
3154efa401 Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency
Thanks to echristo for the pointers on direction.

llvm-svn: 328737
2018-03-28 22:28:50 +00:00
Krzysztof Parzyszek
bd36a69b34 [LSR] Allow giving priority to post-incrementing addressing modes
Implement TTI interface for targets to indicate that the LSR should give
priority to post-incrementing addressing modes.

Combination of patches by Sebastian Pop and Brendon Cahoon.

Differential Revision: https://reviews.llvm.org/D44758

llvm-svn: 328490
2018-03-26 13:10:09 +00:00