20866 Commits

Author SHA1 Message Date
Sanjay Patel
be9ce13f15 [IR] add optional parameter for copying IR flags to compare instructions
As shown, this is used to eliminate redundant code in InstCombine,
and there are more cases where we should be using this pattern, but
we're currently unintentionally dropping flags. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346282 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-07 00:00:42 +00:00
Teresa Johnson
645cd31982 [ThinLTO] Split NotEligibleToImport into legality and inlinability flags
Summary:
The NotEligibleToImport flag on the GlobalValueSummary was set if it
isn't legal to import (e.g. because it references unpromotable locals)
and when it can't be inlined (in which case importing is pointless).

I split out the inlinable piece into a separate flag on the
FunctionSummary (doesn't make sense for aliases or global variables),
because in the future we may want to import for reasons other than
inlining.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346261 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 19:41:35 +00:00
Vedant Kumar
2b75576214 [CodeExtractor] Do not extract calls to eh_typeid_for (PR39545)
The lowering for a call to eh_typeid_for changes when it's moved from
one function to another.

There are several proposals for fixing this issue in llvm.org/PR39545.
Until some solution is in place, do not allow CodeExtractor to extract
calls to eh_typeid_for, as that results in serious miscompilations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346256 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 19:06:08 +00:00
Vedant Kumar
b109c4206a [CodeExtractor] Erase use-without-def debug intrinsics in parent func
When CodeExtractor moves instructions to a new function, debug
intrinsics referring to those instructions within the parent function
become invalid.

This results in the same verifier failure which motivated r344545, about
function-local metadata being used in the wrong function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346255 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 19:05:53 +00:00
Sanjay Patel
09d77c0de9 [InstCombine] allow vector types for fcmp+fpext fold
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346245 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 17:20:20 +00:00
Sanjay Patel
031587fccf [InstCombine] propagate fast-math-flags when folding fcmp+fpext, part 2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346242 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 16:45:27 +00:00
Sanjay Patel
ac037d267f [InstCombine] rearrange code for fcmp+fpext; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346241 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 16:37:35 +00:00
Sanjay Patel
6052aa3705 [InstCombine] propagate fast-math-flags when folding fcmp+fpext
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346240 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 16:23:03 +00:00
Sanjay Patel
d174746db8 [InstCombine] propagate fast-math-flags when folding fcmp+fneg, part 2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346238 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 15:58:57 +00:00
Sanjay Patel
c52594aa97 [InstCombine] reduce code; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346235 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 15:53:58 +00:00
Sanjay Patel
bd1a44f2b8 [InstCombine] propagate fast-math-flags when folding fcmp+fneg
This is another part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

This might be enough to fix that particular issue, but as noted
with the FIXME, we're still dropping FMF on other folds around here.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346234 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 15:49:45 +00:00
Simon Pilgrim
241a3bcfa8 [InstCombine] Ensure nested shifts are in range (OSS-Fuzz #9880)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346225 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 11:28:22 +00:00
Max Kazantsev
bd5ea2b6c2 [LICM] Remove too conservative IsMustExecute variable
LICM relies on variable `MustExecute` which is conservatively set to `false`
in all non-headers. It is used when we decide whether or not we want to hoist
an instruction or a guard.

For the guards, it might be too conservative to use this variable; we can
instead use more precise logic from LoopSafetyInfo. Currently it is only NFC
because `IsMemoryNotModified` is also conservatively set to `false` for all
non-headers, and we cannot hoist guards from non-header blocks. However once we
give up using `IsMemoryNotModified` and use a smarter check instead, this will
allow us to hoist guards from all mustexecute non-header blocks.

Differential Revision: https://reviews.llvm.org/D50888
Reviewed By: fedor.sergeev


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346204 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 04:17:40 +00:00
Max Kazantsev
5927508e1d [LICM] Use ICFLoopSafetyInfo in LICM
This patch makes LICM use `ICFLoopSafetyInfo`, a smarter version
of LoopSafetyInfo that leverages the power of Implicit Control Flow Tracking
to keep track of throwing instructions and give less pessimistic answers
to queries related to throws.

The ICFLoopSafetyInfo itself has been introduced in rL344601. This patch
enables it in LICM only.

Differential Revision: https://reviews.llvm.org/D50377
Reviewed By: apilipenko


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346201 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 02:44:49 +00:00
Max Kazantsev
a61bdda557 Revert "[IndVars] Smart hard uses detection"
This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402.

It seems that the check that we still should do the transform if we
know the result is constant is missing in this code. So the logic that
has been deleted by this change is still sometimes accidentally useful.
I revert the change to see what can be done about it. The motivating
case is the following:

@Y = global [400 x i16] zeroinitializer, align 1

define i16 @foo() {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i = phi i16 [ 0, %entry ], [ %inc, %for.body ]

  %arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i
  store i16 0, i16* %arrayidx, align 1
  %inc = add nuw nsw i16 %i, 1
  %cmp = icmp ult i16 %inc, 400
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  %inc.lcssa = phi i16 [ %inc, %for.body ]
  ret i16 %inc.lcssa
}

We should be able to figure out that the result is constant, but the patch
breaks it.
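
For readers less used to IR, the loop above corresponds roughly to the C++ below. This is a sketch only: the `uint16_t` types and the `do`/`while` shape are a reading of the IR and are not part of the original commit. The value returned on exit is 400 no matter what the stores do, which is the constant the transform should be able to find:

```cpp
#include <cassert>
#include <cstdint>

// Mirrors `@Y = global [400 x i16] zeroinitializer`.
static uint16_t Y[400];

// Mirrors @foo: store 0 to Y[i], increment, and loop while inc < 400.
uint16_t foo() {
  uint16_t i = 0;
  uint16_t inc;
  do {
    Y[i] = 0;                            // store i16 0, i16* %arrayidx
    inc = static_cast<uint16_t>(i + 1);  // %inc = add nuw nsw i16 %i, 1
    i = inc;                             // back-edge value of the %i phi
  } while (inc < 400);                   // %cmp = icmp ult i16 %inc, 400
  return inc;                            // %inc.lcssa
}
```

The loop runs for i = 0..399 and leaves with `inc` equal to the trip count, so the exit value can be rewritten without referring to the loop at all.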

Differential Revision: https://reviews.llvm.org/D51584


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346198 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-06 02:02:05 +00:00
Sanjay Patel
e357c8b879 [InstSimplify] fold select (fcmp X, Y), X, Y
This is NFCI for InstCombine because it calls InstSimplify, 
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346169 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-05 21:51:39 +00:00
Taewook Oh
960c599319 [MergeICmps] Do not perform the transformation if GEP is used outside of block
Summary:
This patch prevents MergeICmps from performing the transformation if the address operand GEP of the load instruction has a use outside of the load's parent block. Without this patch, the compiler crashes with the given test case because the use of `%first.i` is still around when the basic block is erased from https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/MergeICmps.cpp#L620. I think checking `isUsedOutsideOfBlock` with `GEP` is the original intention of the code, as the checking for `LoadI` is already performed in the same function.

This patch is incomplete though, as it makes the pass overly conservative and fails the test `tuple-four-int8.ll`. I believe what needs to be done is checking whether the GEP has a use outside of the block that is not part of the "Comparisons" chain. I am submitting the patch as is to prevent the compiler crash.

Reviewers: courbet, trentxintong

Reviewed By: courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346151 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-05 18:16:32 +00:00
Sanjay Patel
ab049d88fa [InstCombine] canonicalize -0.0 to +0.0 in fcmp
As stated in IEEE-754 and discussed in:
https://bugs.llvm.org/show_bug.cgi?id=38086
...the sign of zero does not affect any FP compare predicate.

Known regressions were fixed with:
rL346097 (D54001)
rL346143

The transform will help reduce pattern-matching complexity to solve:
https://bugs.llvm.org/show_bug.cgi?id=39475
...as well as improve CSE and codegen (a zero constant is almost always
easier to produce than 0x80..00).
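
The claim that the sign of zero does not affect a compare can be checked outside of LLVM. The sketch below is an illustration under stated assumptions: `compareMask` and `zeroSignIrrelevant` are hypothetical helpers, not LLVM APIs, and the six C++ operators only cover a subset of the fcmp predicates (the unordered variants are not modeled).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: packs the results of the six built-in C++
// comparisons of x against y into a bitmask.
unsigned compareMask(double x, double y) {
  unsigned mask = 0;
  mask |= static_cast<unsigned>(x == y) << 0;
  mask |= static_cast<unsigned>(x != y) << 1;
  mask |= static_cast<unsigned>(x < y) << 2;
  mask |= static_cast<unsigned>(x <= y) << 3;
  mask |= static_cast<unsigned>(x > y) << 4;
  mask |= static_cast<unsigned>(x >= y) << 5;
  return mask;
}

// True if comparing x against -0.0 gives the same results as comparing
// x against +0.0, i.e. the constant could be canonicalized.
bool zeroSignIrrelevant(double x) {
  return compareMask(x, -0.0) == compareMask(x, 0.0);
}
```

Calling `zeroSignIrrelevant` on ordinary values, both zeros, infinities, and NaN is expected to return true in every case on an IEEE-754 target built without fast-math style options.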



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346147 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-05 17:26:42 +00:00
Sanjay Patel
60a9b3360d [InstCombine] loosen FP 0.0 constraint for fcmp+select substitution
It looks like we correctly removed edge cases with 0.0 from D50714,
but we were a bit conservative because getBinOpIdentity() doesn't
distinguish between +0.0 and -0.0 and 'nsz' is effectively always
true for fcmp (see discussion in:
https://bugs.llvm.org/show_bug.cgi?id=38086 ).

Without this change, we would get regressions by canonicalizing
to +0.0 in all fcmp, and that's a step towards solving:
https://bugs.llvm.org/show_bug.cgi?id=39475



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346143 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-05 16:50:44 +00:00
Vedant Kumar
fd24147338 [HotColdSplitting] Use TTI to inform outlining threshold
Using TargetTransformInfo allows the splitting pass to factor in the
code size cost of instructions as it decides whether or not outlining is
profitable.

This did not regress the overall amount of outlining seen on the handful
of internal frameworks I tested.

Thanks to Jun Bum Lim for suggesting this!

Differential Revision: https://reviews.llvm.org/D53835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346108 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-04 23:11:57 +00:00
Jordan Rupprecht
b11a4e59cc [DebugInfo][InstMerge] Fix -debugify for phi node created by -mldst-motion
Summary:
-mldst-motion creates a new phi node without any debug info. Use the merged debug location from the incoming stores to fix this.

Fixes PR38177. The test case here is (somewhat) simplified from:

```
struct S {
  int foo;
  void fn(int bar);
};
void S::fn(int bar) {
  if (bar)
    foo = 1;
  else
    foo = 0;
}
```

Reviewers: dblaikie, gbedwell, aprantl, vsk

Reviewed By: vsk

Subscribers: vsk, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D54019

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346027 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-02 18:25:41 +00:00
Ayal Zaks
6d0de65682 [LV] Avoid vectorizing loops under opt for size that involve SCEV checks
Fix PR39417, PR39497

The loop vectorizer may generate runtime SCEV checks for overflow and stride==1
cases, leading to execution of original scalar loop. The latter is forbidden
when optimizing for size. An assert introduced in r344743 triggered the above
PRs, showing it does happen. This patch fixes this behavior by preventing
vectorization in such cases.

Differential Revision: https://reviews.llvm.org/D53612


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345959 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-02 09:16:12 +00:00
Max Kazantsev
c46ca76214 [NFC][LICM] Factor out instruction erasing logic
This patch factors out a function that makes all required updates
whenever an instruction gets erased.

Differential Revision: https://reviews.llvm.org/D54011
Reviewed By: apilipenko


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345914 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-02 00:21:45 +00:00
Florian Hahn
797cdde77b [LoopInterchange] Fix unused variables in release build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345881 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 19:51:13 +00:00
Florian Hahn
f31487a7c3 [LoopInterchange] Remove support for inner-only reductions.
Inner-loop only reductions require additional checks to make sure they
form a load-phi-store cycle across inner and outer loop. Otherwise the
reduction value is not properly preserved. This patch disables
interchanging such loops for now, as it causes miscompiles in some
cases and it seems to apply only to a tiny number of loops. Across the
test-suite, SPEC2000 and SPEC2006, 61 instead of 62 loops are
interchanged with inner loop reduction support disabled. With
-loop-interchange-threshold=-1000, 3256 instead of 3267.

See the discussion and history of D53027 for an outline of what such legality
checks could look like.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D53027


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345877 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 19:25:00 +00:00
Reid Kleckner
cf5c71234a Remove unnecessary fallthrough annotation after unreachable
Clang's -Wimplicit-fallthrough implementation warns on this. I built
clang with GCC 7.3 in +asserts and -asserts mode, and GCC doesn't warn
on this in either configuration. I think it is unnecessary. I separated
it from the large mechanical patch (https://reviews.llvm.org/D53950) in
case I am wrong and it has to be reverted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345876 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 19:11:05 +00:00
Max Kazantsev
1edc3c60f3 [NFC] Reorganize code to prepare it for more transforms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345820 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 09:42:50 +00:00
Max Kazantsev
2f425e9c79 [IndVars] Smart hard uses detection
When rewriting loop exit values, IndVars considers this transform not profitable if
the loop instruction has a loop user which it believes cannot be optimized away.
In the current implementation, only calls that immediately use the instruction are considered
as such.

This patch extends the definition of "hard" users to any side-effecting instructions
(which usually cannot be optimized away from the loop) and also allows handling
of not just immediate users, but use chains.

Differential Revision: https://reviews.llvm.org/D51584
Reviewed By: etherzhhb


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345814 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 06:47:01 +00:00
Volkan Keles
b5378372b8 [InstCombine] Combine nested min/max intrinsics with constants
Reviewers: arsenm, spatel

Reviewed By: spatel

Subscribers: lebedev.ri, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D53774

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345751 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 17:50:52 +00:00
Sanjay Patel
22968f72bd [InstCombine] refactor fabs+fcmp fold; NFC
Also, remove/replace/minimize/enhance the tests for this fold.
The code drops FMF, so it needs more tests and at least 1 fix.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345734 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 16:34:43 +00:00
Sanjay Patel
123a45feb6 [InstCombine] add assertion that InstSimplify has folded a fabs+fcmp; NFC
The 'OLT' case was updated at rL266175, so I assume it was just an
oversight that 'UGE' was not included because that patch handled
both predicates in InstSimplify.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345727 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 15:31:45 +00:00
Sanjay Patel
1ef057dce8 [InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negative
This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086
...but given the current implementation (no FMF on casts), this is likely the only way to predicate the 
transform.
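
The role of `nnan` in this fold can be illustrated outside of LLVM. In the sketch below, `std::fabs` stands in for "X is known not to be negative"; that is an assumption made for illustration, the actual analysis in InstSimplify is not shown, and the helper name is hypothetical. For any non-NaN input the comparison is true, while a NaN input makes it false, which is why the fold needs the no-NaNs guarantee:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: models `fcmp oge X, 0.0` where X = fabs(v), so X is
// either NaN or a value that is not negative.
bool ogeZeroOfFabs(double v) {
  double x = std::fabs(v);
  return x >= 0.0;  // ordered comparison: false whenever x is NaN
}
```

Without the NaN case, the comparison always evaluates to true and can be replaced by a constant.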

This is part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

Differential Revision: https://reviews.llvm.org/D53874


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345725 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 14:57:23 +00:00
Fedor Sergeev
2f9c8a0c1d [LoopUnroll] allow customization for new-pass-manager version of LoopUnroll
Unlike its legacy counterpart, the new pass manager's LoopUnrollPass does
not provide any means to select which flavors of unroll to run
(runtime, peeling, partial), relying on global defaults.

In some cases, having the ability to run a restricted LoopUnroll that
does more than LoopFullUnroll is needed.

Introduced LoopUnrollOptions to select optional unroll behaviors.
Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.

Reviewers: chandlerc, tejohnson
Differential Revision: https://reviews.llvm.org/D53440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345723 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 14:33:14 +00:00
Max Kazantsev
b1ebaf21d0 [IndVars] Strengthen restriction in rewriteLoopExitValues
For some unclear reason rewriteLoopExitValues considers recalculation
after the loop profitable if it has some "soft uses" outside the loop (i.e. any
use other than call and return), even if we have proved that it has a user inside
the loop which we think will not be optimized away.

There is no existing unit test that would explain this. This patch provides an
example when rematerialisation of exit value is not profitable but it passes
this check due to presence of a "soft use" outside the loop.

It makes no sense to recalculate the value on exit if we are going to compute it
anyway due to some irremovable user within the loop. This patch disallows applying this
transform in the described situation.

Differential Revision: https://reviews.llvm.org/D51581
Reviewed By: etherzhhb


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345708 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 10:30:50 +00:00
Dorit Nuzman
06bac6c858 [LV] Support vectorization of interleave-groups that require an epilog under
optsize using masked wide loads 

Under Opt for Size, the vectorizer does not vectorize interleave-groups that
have gaps at the end of the group (such as a loop that reads only the even
elements: a[2*i]) because that implies that we'll require a scalar epilogue
(which is not allowed under Opt for Size). This patch extends the support for
masked-interleave-groups (introduced by D53011 for conditional accesses) to
also cover the case of gaps in a group of loads; Targets that enable the
masked-interleave-group feature don't have to invalidate interleave-groups of
loads with gaps; they could now use masked wide-loads and shuffles (if that's
what the cost model selects).

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53668



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345705 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 09:57:56 +00:00
Alexander Potapenko
761dc549d1 [MSan] another take at instrumenting inline assembly - now with calls
Turns out it's not always possible to figure out whether an asm()
statement argument points to a valid memory region.
One example would be per-CPU objects in the Linux kernel, for which the
addresses are calculated using the FS register and a small offset in the
.data..percpu section.
To avoid pulling all sorts of checks into the instrumentation, we replace
actual checking/unpoisoning code with calls to
msan_instrument_asm_load(ptr, size) and
msan_instrument_asm_store(ptr, size) functions in the runtime.

This patch doesn't implement the runtime hooks in compiler-rt, as there's
been no demand for assembly instrumentation in userspace apps so far.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345702 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 09:32:47 +00:00
Matthias Braun
b3bc95870d ADT/STLExtras: Introduce llvm::empty; NFC
This is modeled after C++17 std::empty().

Differential Revision: https://reviews.llvm.org/D53909

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345679 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-31 00:23:23 +00:00
Sanjay Patel
ef97784739 [InstCombine] use 'match' to reduce code; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345647 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 20:52:25 +00:00
Quentin Colombet
044ef75cd4 [InstCombine] Teach the move free before null test opti how to deal with noop casts
InstCombine features an optimization that essentially replaces:
if (a)
  free(a)
into:
free(a)

Right now, this optimization is gated by the minsize attribute and therefore
we only perform it if we can prove that we are going to be able to eliminate
the branch and the destination block.

However when casts are involved the optimization would fail to apply, because
the optimization was not smart enough to realize that it is possible to also
move the casts away from the destination block and that is harmless to the
performance since they are just noops.
E.g.,
foo(int *a)
if (a)
  free((char*)a)

Wouldn't be optimized by instcombine, because
- We would refuse to hoist the `bitcast i32* %a to i8*` in the source block
- We would fail to see that `bitcast i32* %a to i8*` and %a are the same value.

This patch fixes both these problems:
- It teaches the pattern matching of the comparison how to look
  through casts.
- It checks whether the additional instructions in the destination block
  can be hoisted and are harmless performance-wise.
- It hoists all the code of the destination block in the source block.
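
At the source level, the before and after shapes of this optimization look roughly like the sketch below. The function names are hypothetical, and the code only illustrates why the rewrite is legal: `free` on a null pointer is defined to do nothing, and the pointer cast produces no machine code. It does not show the InstCombine implementation itself.

```cpp
#include <cassert>
#include <cstdlib>

// Before: the null test guards the call, and the cast sits in the
// destination block together with the call.
void releaseGuarded(int *a) {
  if (a)
    std::free(reinterpret_cast<char *>(a));
}

// After: the cast and the call are hoisted into the source block, and the
// branch is gone. Passing a null pointer is still fine because free(NULL)
// is a no-op.
void releaseUnguarded(int *a) {
  std::free(reinterpret_cast<char *>(a));
}
```

Both functions behave the same for null and non-null arguments; the second form simply avoids the branch and the extra block, which is the size win the minsize gating is after.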

Differential Revision: D53356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345644 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 20:51:04 +00:00
Calixte Denizet
14842a4c81 [GCOV] Function counters are wrong when on one line
Summary:
After commit https://reviews.llvm.org/rL344228, function definitions have a counter, but the counter is wrong when the definition is on one line (e.g. void foo() { }).
I added a test in: https://reviews.llvm.org/D53601

Reviewers: marco-c

Reviewed By: marco-c

Subscribers: llvm-commits, sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D53600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345624 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 18:41:31 +00:00
Simon Pilgrim
38ad550dd9 [TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)
Correct costing of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.

Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!

I've done my best to fix a number of vectorizer uses:

SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.

LV - I'm not clear on what the SK_ExtractSubvector should represent for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.

Differential Revision: https://reviews.llvm.org/D53573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345617 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 18:10:02 +00:00
Sanjay Patel
7e79132257 [InstCombine] use getFltSemantics() instead of duplicating it; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345613 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 16:21:56 +00:00
Sanjay Patel
36d678978c [InstCombine] try to turn shuffle into insertelement
shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'

The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of 
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start 
with or instcombine has converted insert/extract to shuffles.

Independent of that, an insertelement is always a simpler op for IR 
analysis vs. a shuffle, so we should transform to insert when possible.

I don't think there's any codegen concern here - if a target can't insert 
a scalar directly to some fixed element in a vector (x86?), then this 
should get expanded to the insert+shuffle that we started with.
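
The equivalence behind this fold can be modeled on plain arrays. The sketch below is an illustration only: `Vec4`, `Mask4`, `insertElt`, and `shuffle` are hypothetical stand-ins for 4-lane IR vectors and the `insertelement` / `shufflevector` instructions, not LLVM APIs. Mask values 0..3 select lanes of the first operand and 4..7 select lanes of the second.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<std::size_t, 4>;

// Models `insertelement V, S, Idx`.
Vec4 insertElt(Vec4 V, int S, std::size_t Idx) {
  V[Idx] = S;
  return V;
}

// Models `shufflevector A, B, Mask` for 4-lane vectors.
Vec4 shuffle(const Vec4 &A, const Vec4 &B, const Mask4 &Mask) {
  Vec4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = Mask[I] < 4 ? A[Mask[I]] : B[Mask[I] - 4];
  return R;
}
```

With the mask {4, 5, 1, 7}, `shuffle(insertElt(X, S, 1), V1, Mask)` takes lanes 0, 1, and 3 from `V1` and the inserted scalar for lane 2, so it produces the same vector as `insertElt(V1, S, 2)`. The other lanes of `X` never reach the result, which is why the first insert operand is written as `?` in the pattern.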

Differential Revision: https://reviews.llvm.org/D53507


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345607 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 15:26:39 +00:00
Jonas Paulsson
ccd4f446eb [LoopVectorizer] Fix for cost values of memory accesses.
This commit is a combination of two patches:

* "Fix in getScalarizationOverhead()"

   If target returns false in TTI.prefersVectorizedAddressing(), it means the
   address registers will not need to be extracted. Therefore, there should
   be no operands scalarization overhead for a load instruction.

* "Don't pass the instruction pointer from getMemInstScalarizationCost."

   Since VF is always > 1, this is a cost query for an instruction in the
   vectorized loop and it should not be evaluated within the scalar
   context of the instruction.

Review: Ulrich Weigand, Hal Finkel
https://reviews.llvm.org/D52351
https://reviews.llvm.org/D52417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345603 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 14:34:15 +00:00
Nicola Zaghen
8125e33cdb [SROA] Use offset sizes from the DataLayout instead of the pointer sizes.
This fixes an assertion when constant folding a GEP when part of the offset
was in i32 (IndexSize, as per DataLayout) and part in i64 (PointerSize) in
the newly created test case.

Differential Revision: https://reviews.llvm.org/D52609



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345585 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 11:15:04 +00:00
Vedant Kumar
c65373c098 [HotColdSplitting] Allow outlining single-block cold regions
It can be profitable to outline single-block cold regions because they
may be large.

Allow outlining single-block regions if they have over some threshold of
non-debug, non-terminator instructions. I chose 3 as the threshold after
experimenting with several internal frameworks.

In practice, reducing the threshold further did not give much
improvement, whereas increasing it resulted in substantial regressions.

Differential Revision: https://reviews.llvm.org/D53824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345524 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-29 19:15:39 +00:00
Florian Hahn
196c060529 [Local] Keep K's range if K does not move when combining metadata.
As K has to dominate I, IIUC I's range metadata must be a subset of
K's. After Eli's recent clarification to the LangRef, loading a value
outside of the range is undefined behavior.
Therefore if I's range contains elements outside of K's range and we would load
one such value, K would cause undefined behavior.

In cases like hoisting/sinking, we still want the most generic range
over all code paths to/from the hoist/sink point. As suggested in the
patches related to D47339, I will refactor the handling of those
scenarios and try to decouple it from this function as a follow-up, once
we have switched to a similar handling of metadata in most of
combineMetadata.

I updated some tests checking mostly the merging of metadata to keep the
metadata of the dominating load. The most interesting one is probably test8 in
test/Transforms/JumpThreading/thread-loads.ll. It contained a comment
about the alias metadata preventing us from eliminating the branch, but it
seems like the actual problem currently is that we merge the ranges of
both loads and cannot eliminate the icmp afterwards. With this patch, we
manage to eliminate the icmp, as the range of the first load excludes 8.

Reviewers: efriedma, nlopes, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D51629


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345456 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-27 16:53:45 +00:00
Simon Pilgrim
8177736107 Fix -Wdocumentation warning. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345454 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-27 15:14:42 +00:00
Leonard Chan
b49d1c1b41 Revert "[PassManager/Sanitizer] Enable usage of ported AddressSanitizer passes with -fsanitize=address"
This reverts commit 8d6af840396f2da2e4ed6aab669214ae25443204 and commit
b78d19c287b6e4a9abc9fb0545de9a3106d38d3d, which cause slower build times
by initializing the AddressSanitizer on every function run.

The corresponding revisions are https://reviews.llvm.org/D52814 and
https://reviews.llvm.org/D52739.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345433 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-26 22:51:51 +00:00
Christy Lee
d87fb985a5 Pointer types were treated as zero-size by MergeICmps
Summary:
The visitICmp analysis function would record compares of pointer types as size 0. This caused the resulting memcmp() call to have the wrong total size.
Found with "self-build" of clang/LLVM on Windows.

Reviewers: christylee, trentxintong, courbet

Reviewed By: courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345413 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-26 18:02:06 +00:00