Commit Graph

806 Commits

Author SHA1 Message Date
Kazu Hirata
dcbeaf027c [llvm] Use pop_back_val (NFC) 2021-01-23 10:56:33 -08:00
Sanjay Patel
77d1393c15 [SLP] fix fast-math-flag propagation on FP reductions
As shown in the test diffs, we could miscompile by
propagating flags that did not exist in the original
code.

The flags required for fmin/fmax reductions will be
fixed in a follow-up patch.
2021-01-23 11:17:20 -05:00
Anton Rapetov
4ca1cdd110 [SLP] do not traverse constant uses
Walking the use list of a Constant (particularly, ConstantData)
is not scalable, since a given constant may be used by many
instructinos in many functions in many modules.

Differential Revision: https://reviews.llvm.org/D94713
2021-01-22 08:14:09 -05:00
Sanjay Patel
12a2475426 [SLP] rename reduction variable to avoid shadowing; NFC
The code structure can likely be improved now that
'OperationData' is gone.
2021-01-21 16:02:38 -05:00
Sanjay Patel
c08bbdb417 [SLP] simplify reduction matching
This is NFC-intended and removes the "OperationData"
class which had become nothing more than a recurrence
(reduction) type.

I adjusted the matching logic to distinguish
instructions from non-instructions - that's all that
the "IsLeafValue" member was keeping track of.
2021-01-21 14:58:57 -05:00
Sanjay Patel
85606c045b [SLP] reduce reduction code for checking vectorizable ops; NFC
This is another step towards removing `OperationData` and
fixing FMF matching/propagation bugs when forming reductions.
2021-01-20 11:14:48 -05:00
Sanjay Patel
aaecd838b5 [SLP] refactor more reduction functions; NFC
We were able to remove almost all of the state from
OperationData, so these don't make sense as members
of that class - just pass the RecurKind in as a param.

More streamlining is possible, but I'm trying to avoid
logic/typo bugs while fixing this. Eventually, we should
not need the `OperationData` class.
2021-01-20 11:14:48 -05:00
Sanjay Patel
8c271b4214 [SLP] move reduction createOp functions; NFC
We were able to remove almost all of the state from
OperationData, so these don't make sense as members
of that class - just pass the RecurKind in as a param.
2021-01-20 11:14:48 -05:00
Alexey Bataev
13011c3c3f Revert "[SLP]Merge reorder and reuse shuffles."
This reverts commit 438682de6a38ac97f89fa38faf5c8dc9b09cd9ad to fix the
bug with the reducing size of the resulting vector for the entry node
with multiple users.
2021-01-19 11:48:04 -08:00
Sanjay Patel
c2b5b41b35 [SLP] match maxnum/minnum intrinsics as FP reduction ops
After much refactoring over the last 2 weeks to the reduction
matching code, I think this change is finally ready.

We effectively broke fmax/fmin vector reduction optimization
when we started canonicalizing to intrinsics in instcombine,
so this should restore that functionality for SLP.

There are still FMF problems here as noted in the code comments,
but we should be avoiding miscompiles on those for fmax/fmin by
restricting to full 'fast' ops (negative tests are included).

Fixing FMF propagation is a planned follow-up.

Differential Revision: https://reviews.llvm.org/D94913
2021-01-18 17:37:16 -05:00
Sanjay Patel
7d7c35ebbc [SLP] rename reduction query for min/max ops; NFC
This will avoid confusion once we start matching
min/max intrinsics. All of these hacks to accomodate
cmp+sel idioms should disappear once we canonicalize
to min/max intrinsics.
2021-01-18 09:32:57 -05:00
Sanjay Patel
0e670a5e0a [SLP] reduce opcode API dependency in reduction cost calc; NFC
The icmp opcode is now hard-coded in the cost model call.
This will make it easier to eventually remove all opcode
queries for min/max patterns as we transition to intrinsics.
2021-01-18 09:32:57 -05:00
Sanjay Patel
96dcca47c3 [SLP] remove opcode field from reduction data class
This is NFC-intended and another step towards supporting
intrinsics as reduction candidates.

The remaining bits of the OperationData class do not make
much sense as-is, so I will try to improve that, but I'm
trying to take minimal steps because it's still not clear
how this was intended to work.
2021-01-16 13:55:52 -05:00
Sanjay Patel
8c16512468 [SLP] fix typos; NFC 2021-01-16 13:55:52 -05:00
Sanjay Patel
3a36f79c26 [SLP] remove unnecessary use of 'OperationData'
This is another NFC-intended patch to allow matching
intrinsics (example: maxnum) as candidates for reductions.

It's possible that the loop/if logic can be reduced now,
but it's still difficult to understand how this all works.
2021-01-16 13:55:52 -05:00
Sanjay Patel
7fe9e5078b [SLP] remove dead code in reduction matching; NFC
To get into this block we had: !A || B || C
and we checked C in the first 'if' clause
leaving !A || B. But the 2nd 'if' is checking:
A && !B --> !(!A || B)
2021-01-15 17:03:26 -05:00
Sanjay Patel
8de34477a1 [SLP] remove unused reduction functions; NFC
These were made obsolete by simplifying the code in recent patches.
2021-01-15 14:59:33 -05:00
Sanjay Patel
47b85ad366 [SLP] remove unnecessary state in matching reductions
This is NFC-intended. I'm still trying to figure out
how the loop where this is used works. It does not
seem like we require this data at all, but it's
hard to confirm given the complicated predicates.
2021-01-14 18:32:37 -05:00
Bjorn Pettersson
5a1624141a [SLP] Don't vectorize stores of non-packed types (like i1, i2)
In the spirit of commit fc783e91e0c0696e (llvm-svn: 248943) we
shouldn't vectorize stores of non-packed types (i.e. types that
has padding between consecutive variables in a scalar layout,
but being packed in a vector layout).

The problem was detected as a miscompile in a downstream test case.

Reviewed By: anton-afanasyev

Differential Revision: https://reviews.llvm.org/D94446
2021-01-14 11:30:33 +01:00
Sanjay Patel
781444d01f [SLP] simplify type check for reductions
This is NFC-intended. The 'valid' call allows int/FP/pointers
for other parts of SLP. The difference here is that we can't
reduce pointers.
2021-01-13 13:30:46 -05:00
Sanjay Patel
b41334d30a [SLP] reduce code duplication while processing reductions; NFC 2021-01-12 16:03:57 -05:00
Sanjay Patel
4f0882e467 [SLP] rename variable to improve readability; NFC
The OperationData in the 2nd block (visiting the operands)
is completely independent of the 1st block.
2021-01-12 16:03:57 -05:00
Sanjay Patel
edb88f4fad [SLP] reduce code duplication in processing reductions; NFC 2021-01-12 16:03:57 -05:00
Sanjay Patel
a486f56de3 [SLP] reduce code duplication while matching reductions; NFC 2021-01-12 16:03:57 -05:00
David Sherwood
c826cad841 [NFC] Remove min/max functions from InstructionCost
Removed the InstructionCost::min/max functions because it's
fine to use std::min/max instead.

Differential Revision: https://reviews.llvm.org/D94301
2021-01-11 09:00:12 +00:00
Sanjay Patel
3e75d0db00 [SLP] fix typo in assert
This snuck into 0aa75fb12faa , but I didn't catch it locally.
2021-01-10 13:15:04 -05:00
Sanjay Patel
e563a6172d [SLP] put verifyFunction call behind EXPENSIVE_CHECKS
A severe compile-time slowdown from this call is noted in:
https://llvm.org/PR48689
My naive fix was to put it under LLVM_DEBUG ( 267ff79 ),
but that's not limiting in the way we want.
This is a quick fix (or we could just remove the call completely
and rely on some later pass to discover potentially wrong IR?).
A bigger/better fix would be to improve/limit verifyFunction()
as noted in:
https://llvm.org/PR47712

Differential Revision: https://reviews.llvm.org/D94328
2021-01-10 12:32:21 -05:00
Alexander Belyaev
0d8c596c50 Revert "[SLP]Need shrink the load vector after reordering."
This reverts commit 4284afdf9432f7d756f56b0ab21d69191adefa8d.

This changes computed values in fused_batchnorm_test_cpu.

Not equal to tolerance rtol=1e-06, atol=0.001
Mismatched value: a is different from b.
not close where = (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1]), array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1]), array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 1]), array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3,
       4, 5]))
not close lhs = [-0.6636615  -0.9804948  -1.148275   -0.68193716 -0.8572368  -0.65046215
 -0.6993756  -1.2244141  -1.0938729  -0.50369143 -0.51830524 -0.738452
 -0.7214286  -0.48115745 -0.9380924  -0.9341769  -0.5916775  -1.2896856
 -0.7264182  -0.9746917  -0.783249   -0.7659018  -0.86214024 -0.47784212]
not close rhs = [ 0.44102234  0.12418899 -0.04359123  0.42274666  0.24744703  0.45422167
  0.40530816 -0.11973029  0.01081094  0.6009924   0.5863786   0.3662318
  0.38325527  0.62352633  0.1665914   0.1705069   0.5130063  -0.18500176
  0.37826565  0.12999213  0.3214348   0.338782    0.24254355  0.62684166]
not close dif = [1.1046839 1.1046838 1.1046838 1.1046839 1.1046839 1.1046839 1.1046838
 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046838
 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839
 1.1046839 1.1046838 1.1046838]
not close tol = [0.00100044 0.00100012 0.00100004 0.00100042 0.00100025 0.00100045
 0.00100041 0.00100012 0.00100001 0.0010006  0.00100059 0.00100037
 0.00100038 0.00100062 0.00100017 0.00100017 0.00100051 0.00100019
 0.00100038 0.00100013 0.00100032 0.00100034 0.00100024 0.00100063]
2021-01-08 14:42:26 +01:00
Sanjay Patel
0132b1afa9 [SLP] limit verifyFunction to debug build (PR48689)
As noted in PR48689, the verifier may have some kind
of exponential behavior that should be addressed
separately. For now, only run it in debug mode to
prevent problems for release+asserts.
That limit is what we had before D80401, and I'm
not sure if there was a reason to change it in that
patch.
2021-01-08 08:10:17 -05:00
Kazu Hirata
b5d840801d [llvm] Use *Set::contains (NFC) 2021-01-07 20:29:34 -08:00
Sanjay Patel
dac6aa4e2a [SLP] remove opcode identifier for reduction; NFC
Another step towards allowing intrinsics in reduction matching.
2021-01-07 14:07:27 -05:00
Alexey Bataev
5cd8182360 [SLP]Need shrink the load vector after reordering.
After merging the shuffles, we cannot rely on the previous shuffle
anymore and need to shrink the final shuffle, if it is required.

Reported in D92668

Differential Revision: https://reviews.llvm.org/D93967
2021-01-07 04:50:48 -08:00
Oliver Stannard
88767efa6e Revert "[llvm] Use BasicBlock::phis() (NFC)"
Reverting because this causes crashes on the 2-stage buildbots, for
example http://lab.llvm.org:8011/#/builders/7/builds/1140.

This reverts commit 9b228f107d43341ef73af92865f73a9a076c5a76.
2021-01-07 09:43:33 +00:00
Kazu Hirata
8fd273682c [llvm] Use llvm::all_of (NFC) 2021-01-06 18:27:36 -08:00
Kazu Hirata
745ad84858 [llvm] Use BasicBlock::phis() (NFC) 2021-01-06 18:27:35 -08:00
Sanjay Patel
55df2038d7 [SLP] use reduction kind's opcode to create new instructions; NFC
Similar to 5a1d31a28 -
This should be no-functional-change because the reduction kind
opcodes are 1-for-1 mappings to the instructions we are matching
as reductions. But we want to remove the need for the
`OperationData` opcode field because that does not work when
we start matching intrinsics (eg, maxnum) as reduction candidates.
2021-01-06 14:37:44 -05:00
Sanjay Patel
11d66ae09c [SLP] reduce code for propagating flags on reductions; NFC
If we add/change to match intrinsics, this might get more
wordy, but there's no need to list each kind currently.
2021-01-06 14:37:44 -05:00
Juneyoung Lee
982327f66d [SLP,LV] Use poison constant vector for shufflevector/initial insertelement
This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector.
The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ).

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94061
2021-01-06 11:22:50 +09:00
Sanjay Patel
9893ad0999 [SLP] reduce code for finding reduction costs; NFC
We can get both (vector/scalar) costs in a single switch
instead of sequentially.
2021-01-05 17:35:54 -05:00
Sanjay Patel
b532dcefa1 [SLP] use reduction kind's opcode for cost model queries; NFC
This should be no-functional-change because the reduction kind
opcodes are 1-for-1 mappings to the instructions we are matching
as reductions. But we want to remove the need for the
`OperationData` opcode field because that does not work when
we start matching intrinsics (eg, maxnum) as reduction candidates.
2021-01-05 15:12:40 -05:00
Sanjay Patel
305c130e7c [SLP] reduce code duplication; NFC 2021-01-05 15:12:40 -05:00
Sanjay Patel
49c1e8b6b0 [SLP] delete unused pairwise reduction option
SLP tries to model 2 forms of vector reductions: pairwise and splitting.
From the cost model code comments, those are defined using an example as:

  /// Pairwise:
  ///  (v0, v1, v2, v3)
  ///  ((v0+v1), (v2+v3), undef, undef)
  /// Split:
  ///  (v0, v1, v2, v3)
  ///  ((v0+v2), (v1+v3), undef, undef)

I don't know the full history of this functionality, but it was partly
added back in D29402. There are apparently no users at this point (no
regression tests change). X86 might have managed to work-around the need
for this through cost model and codegen improvements.

Removing this code makes it easier to continue the work that was started
in D87416 / D88193. The alternative -- if there is some target that is
silently using this option -- is to move this logic into LoopUtils. We
have related/duplicate functionality there via llvm::createTargetReduction().

Differential Revision: https://reviews.llvm.org/D93860
2021-01-05 13:23:07 -05:00
Sanjay Patel
eafc7203b6 [LoopUtils] remove redundant opcode parameter; NFC
While here, rename the inaccurate getRecurrenceBinOp()
because that was also used to get CmpInst opcodes.

The recurrence/reduction kind should always refer to the
expected opcode for a reduction. SLP appears to be the
only direct caller of createSimpleTargetReduction(), and
that calling code ideally should not be carrying around
both an opcode and a reduction kind.

This should allow us to generalize reduction matching to
use intrinsics instead of only binops.
2021-01-04 17:05:28 -05:00
Sanjay Patel
ed01d0402b [Analysis] flatten enums for recurrence types
This is almost all mechanical search-and-replace and
no-functional-change-intended (NFC). Having a single
enum makes it easier to match/reason about the
reduction cases.

The goal is to remove `Opcode` from reduction matching
code in the vectorizers because that makes it harder to
adapt the code to handle intrinsics.

The code in RecurrenceDescriptor::AddReductionVar() is
the only place that required closer inspection. It uses
a RecurrenceDescriptor and a second InstDesc to sometimes
overwrite part of the struct. It seem like we should be
able to simplify that logic, but it's not clear exactly
which cmp+sel patterns that we are trying to handle/avoid.
2021-01-01 12:20:16 -05:00
Sanjay Patel
26db8e9958 [LoopUtils] reduce FMF and min/max complexity when forming reductions
I don't know if there's some way this changes what the vectorizers
may produce for reductions, but I have added test coverage with
3567908 and 5ced712 to show that both passes already have bugs in
this area. Hopefully this does not make things worse before we can
really fix it.
2020-12-30 15:22:26 -05:00
Sanjay Patel
af4839c1c3 [SLP] replace local reduction enum with RecurrenceKind; NFCI
I'm not sure if the SLP enum was created before the IVDescriptor
RecurrenceDescriptor / RecurrenceKind existed, but the code in
SLP is now redundant with that class, so it just makes things
more complicated to have both. We eventually call LoopUtils
createSimpleTargetReduction() to create reduction ops, so we
might as well standardize on those enum names.

There's still a question of whether we need to use TTI::ReductionFlags
vs. MinMaxRecurrenceKind, but that can be another clean-up step.

Another option would just be to flatten the enums in RecurrenceDescriptor
into a single enum. There isn't much benefit (smaller switches?) to
having a min/max subset.
2020-12-29 14:52:11 -05:00
Kazu Hirata
b3da88360d [CodeGen, Transforms] Use *Map::lookup (NFC) 2020-12-27 09:57:27 -08:00
Sanjay Patel
3f4cf6da80 [SLP] rename reduction variables for readability; NFC
I am hoping to extend the reduction matching code, and it is
hard to distinguish "ReductionData" from "ReducedValueData".
So extend the tree/root metaphor to include leaves.

Another problem is that the name "OperationData" does not
provide insight into its purpose. I'm not sure if we can alter
that underlying data structure to make the code clearer.
2020-12-26 11:20:25 -05:00
Sanjay Patel
85a38d4e26 [SLP] use switch to improve readability; NFC
This will get more complicated when we handle intrinsics like maxnum.
2020-12-26 10:59:45 -05:00
Kazu Hirata
682f8913a5 [CodeGen, Transforms] Use llvm::any_of (NFC) 2020-12-24 09:08:36 -08:00