20275 Commits

Author SHA1 Message Date
Peter Collingbourne
43c7ac5894 Change the cap on the amount of padding for each vtable to 32-byte (previously it was 128-byte)
We tested different cap values with a recent commit of Chromium. Our results show that the 32-byte cap yields the smallest binary and all the caps yield similar performance.
Based on the results, we propose to change the cap value to 32-byte.

Patch by Zhaomo Yang!

Differential Revision: https://reviews.llvm.org/D49405

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337622 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 21:43:20 +00:00
Roman Tereshin
6bb56ed117 Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reapplies commit r337489 reverted by r337541
Additionally, this commit contains a speculative fix to the issue reported in r337541
(the report does not contain an actionable reproducer, just a stack trace)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337606 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 20:10:04 +00:00
Alexander Potapenko
86a77cda68 [MSan] Hotfix compilation
Make sure NewSI is used in materializeStores()


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337577 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 16:52:12 +00:00
Alexander Potapenko
7d86834e82 [MSan] run materializeChecks() before materializeStores()
When pointer checking is enabled, it's important that every pointer is
checked before its value is used.
For stores MSan used to generate code that calculates shadow/origin
addresses from a pointer before checking it.
For userspace this isn't a problem, because the shadow calculation code
is quite simple and the compiler is able to move it after the check at -O2.
But for KMSAN getShadowOriginPtr() creates a runtime call, so we want the
check to be performed strictly before that call.

Swapping materializeChecks() and materializeStores() resolves the issue:
both functions insert code before the given IR location, so the new
insertion order guarantees that the code calculating shadow address is
between the address check and the memory access.
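
The ordering argument above can be illustrated with a toy model. This is only a sketch, not MSan's code: the instruction stream is a vector of labels, and `insertBefore` and `materialize` are hypothetical helpers that stand in for "insert code before the given IR location".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction stream; each entry is just a label.
using Stream = std::vector<std::string>;

// Hypothetical helper: insert `label` immediately before the entry `anchor`.
void insertBefore(Stream &s, const std::string &anchor,
                  const std::string &label) {
  for (auto it = s.begin(); it != s.end(); ++it) {
    if (*it == anchor) {
      s.insert(it, label);
      return;
    }
  }
  assert(false && "anchor not found");
}

// Runs the two phases in the requested order and returns the final stream.
// Both phases insert before the same anchor (the store).
Stream materialize(bool checksFirst) {
  Stream s = {"store"};
  if (checksFirst) {
    insertBefore(s, "store", "check ptr");
    insertBefore(s, "store", "compute shadow addr");
  } else {
    insertBefore(s, "store", "compute shadow addr");
    insertBefore(s, "store", "check ptr");
  }
  return s;
}
```

Because each insertion lands directly in front of the anchor, whatever is inserted first ends up earliest in the stream, so running the checks phase first puts the check ahead of the shadow address computation.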


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337571 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 16:28:49 +00:00
Florian Hahn
59b3465a59 [IPSCCP] Fix for bot failure caused by r337548
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337554 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 14:37:10 +00:00
Florian Hahn
1ba71c6ab7 Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.
This version contains a fix to add values for which the state in ParamState changes
to the worklist if the state in ValueState did not change. To avoid adding the
same value multiple times, mergeInValue returns true if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.

Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to using
ValueLatticeElement for all values.

Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.

Reviewers: dberlin, mssimpso, davide, efriedma

Reviewed By: mssimpso

Differential Revision: https://reviews.llvm.org/D43762


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337548 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 13:29:12 +00:00
Sam McCall
1642979851 Revert "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reverts commit r337489.
It causes asserts to fire in some TensorFlow tests, e.g.
tensorflow/compiler/tests/gather_test.py on GPU.

Example stack trace:
Start test case: GatherTest.testHigherRank
assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits"
    @     0x5559446ebe10  __assert_fail
    @     0x55593ef32f5e  llvm::APInt::trunc()
    @     0x55593d78f86e  (anonymous namespace)::Vectorizer::lookThroughComplexAddresses()
    @     0x55593d78f2bc  (anonymous namespace)::Vectorizer::areConsecutivePointers()
    @     0x55593d78d128  (anonymous namespace)::Vectorizer::isConsecutiveAccess()
    @     0x55593d78c926  (anonymous namespace)::Vectorizer::vectorizeInstructions()
    @     0x55593d78c221  (anonymous namespace)::Vectorizer::vectorizeChains()
    @     0x55593d78b948  (anonymous namespace)::Vectorizer::run()
    @     0x55593d78b725  (anonymous namespace)::LoadStoreVectorizer::runOnFunction()
    @     0x55593edf4b17  llvm::FPPassManager::runOnFunction()
    @     0x55593edf4e55  llvm::FPPassManager::runOnModule()
    @     0x55593edf563c  (anonymous namespace)::MPPassManager::runOnModule()
    @     0x55593edf5137  llvm::legacy::PassManagerImpl::run()
    @     0x55593edf5b71  llvm::legacy::PassManager::run()
    @     0x55593ced250d  xla::gpu::IrDumpingPassManager::run()
    @     0x55593ced5033  xla::gpu::(anonymous namespace)::EmitModuleToPTX()
    @     0x55593ced40ba  xla::gpu::(anonymous namespace)::CompileModuleToPtx()
    @     0x55593ced33d0  xla::gpu::CompileToPtx()
    @     0x55593b26b2a2  xla::gpu::NVPTXCompiler::RunBackend()
    @     0x55593b21f973  xla::Service::BuildExecutable()
    @     0x555938f44e64  xla::LocalService::CompileExecutable()
    @     0x555938f30a85  xla::LocalClient::Compile()
    @     0x555938de3c29  tensorflow::XlaCompilationCache::BuildExecutable()
    @     0x555938de4e9e  tensorflow::XlaCompilationCache::CompileImpl()
    @     0x555938de3da5  tensorflow::XlaCompilationCache::Compile()
    @     0x555938c5d962  tensorflow::XlaLocalLaunchBase::Compute()
    @     0x555938c68151  tensorflow::XlaDevice::Compute()
    @     0x55593f389e1f  tensorflow::(anonymous namespace)::ExecutorState::Process()
    @     0x55593f38a625  tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()()
*** SIGABRT received by PID 7798 (TID 7837) from PID 7798; ***

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337541 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 12:03:00 +00:00
Eli Friedman
88c3cbe6a7 [SCCP] Don't use markForcedConstant on branch conditions.
It's more aggressive than we need to be, and leads to strange
workarounds in other places like call return value inference. Instead,
just directly mark an edge viable.

Tests by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D49408



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337507 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 23:02:07 +00:00
Roman Tereshin
b2f9f92413 [LSV] Refactoring + supporting bitcasts to a type of different size
This is mostly preparatory work for adding limited support for
select instructions. That proved difficult because of the size and
irregularity of Vectorizer::isConsecutiveAccess, which I believe is
fixed here.

It also turned out that these changes make it simpler to finish one of
the TODOs and fix a number of other small issues, namely:

1. Looking through bitcasts to a type of a different size (requires
careful tracking of the original load/store size and some math
converting sizes in bytes to expected differences in indices of GEPs).

2. Reusing the partial analysis of pointers done by the first attempt at
proving them consecutive instead of starting from scratch. This added limited
support for nested GEPs co-existing with difficult sext/zext
instructions. This also required a careful handling of negative
differences between constant parts of offsets.

3. Handling a case where the first pointer index is not an add, but
something else (a function parameter for instance).

I observe an increased number of successful vectorizations on a large
set of shader programs. Only a few shaders are affected, but those that
are affected have >5% fewer loads and stores than before the patch.

Reviewed By: rampitec

Differential-Revision: https://reviews.llvm.org/D49342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337489 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 19:42:43 +00:00
Farhana Aleen
ae72a8c570 [LoadStoreVectorizer] Use getMinusSCEV() to compute the distance between two pointers.
Summary: Currently, isConsecutiveAccess() detects two pointers (PtrA and PtrB) as consecutive by
         comparing PtrB with BaseDelta+PtrA. This works when both pointers are factorized or
         both of them are not factorized. But isConsecutiveAccess() fails if one of the
         pointers is factorized but the other one is not.

         Here is an example:
         PtrA = 4 * (A + B)
         PtrB = 4 + 4A + 4B

         This patch uses getMinusSCEV() to compute the distance between two pointers.
         getMinusSCEV() allows combining the expressions and computing the simplified distance.
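
The example above can be worked through with a toy model of linear expressions. This is a sketch only: `Linear`, `sym`, `cst`, `add`, `scale`, `minus` and `isConstantDistance` are hypothetical names, not ScalarEvolution's API. It shows why subtracting the two expressions and combining like terms finds the distance even when only one side is factorized.

```cpp
#include <cassert>
#include <map>
#include <string>

// A linear expression: constant + sum(coeff[name] * name).
struct Linear {
  long constant = 0;
  std::map<std::string, long> coeff;
};

Linear sym(const std::string &name) {
  Linear e;
  e.coeff[name] = 1;
  return e;
}

Linear cst(long c) {
  Linear e;
  e.constant = c;
  return e;
}

Linear add(Linear a, const Linear &b) {
  a.constant += b.constant;
  for (const auto &kv : b.coeff)
    a.coeff[kv.first] += kv.second;
  return a;
}

Linear scale(long k, Linear a) {
  a.constant *= k;
  for (auto &kv : a.coeff)
    kv.second *= k;
  return a;
}

// Computes b - a with like terms combined; zero coefficients are dropped.
Linear minus(const Linear &b, const Linear &a) {
  Linear d = add(b, scale(-1, a));
  for (auto it = d.coeff.begin(); it != d.coeff.end();) {
    if (it->second == 0)
      it = d.coeff.erase(it);
    else
      ++it;
  }
  return d;
}

// True if b - a simplifies to exactly the constant `bytes`.
bool isConstantDistance(const Linear &a, const Linear &b, long bytes) {
  Linear d = minus(b, a);
  return d.coeff.empty() && d.constant == bytes;
}
```

With PtrA built as `scale(4, add(sym("A"), sym("B")))` and PtrB as `add(cst(4), add(scale(4, sym("A")), scale(4, sym("B"))))`, the difference reduces to the constant 4, although the two expressions are not structurally equal.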

Author: FarhanaAleen

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D49516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337471 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 16:50:27 +00:00
Teresa Johnson
2a478d43e1 [ThinLTO] Enable ThinLTO WholeProgramDevirt and LowerTypeTests in new PM
Summary:
Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass
manager. Add a couple of tests for both PMs based on the clang tests
tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test
through llvm-lto2 and not with distributed ThinLTO.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337461 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 14:51:32 +00:00
Peter Collingbourne
e601e5fe08 Rename __asan_gen_* symbols to ___asan_gen_*.
This prevents gold from printing a warning when trying to export
these symbols via the asan dynamic list after ThinLTO promotes them
from private symbols to external symbols with hidden visibility.

Differential Revision: https://reviews.llvm.org/D49498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337428 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 22:23:14 +00:00
Xin Tong
5aa954ccaf Skip debuginfo intrinsic in markLiveBlocks.
Summary:
The optimizer is 10%+ slower with debug info than without it. I started checking where
the difference is coming from.

I compiled sqlite3.c from CTMark with and without debug info and compared the time difference.

I used Xcode Instruments to find where time is spent. This brings about 20ms, out of ~20s.

Reviewers: davide, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D49337

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337416 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 18:40:45 +00:00
Simon Pilgrim
1a0909c7b2 [SLPVectorizer] Avoid duplicate scalar cost calculations in BoUpSLP::getEntryCost. NFCI.
Pulled out from D49225. We have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337390 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 13:53:55 +00:00
Roman Lebedev
1cd41cb270 [InstCombine] Re-commit: Fold 'check for [no] signed truncation' pattern
Summary:
[[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]]

As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.

The DAGCombine will reverse this transform, see
https://reviews.llvm.org/D49266

This transform is surprisingly frustrating.
This does not deal with non-splat shift amounts, or with undef shift amounts.
I've outlined what I think the solution should be:
```
  // Potential handling of non-splats: for each element:
  //  * if both are undef, replace with constant 0.
  //    Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0.
  //  * if both are not undef, and are different, bailout.
  //  * else, only one is undef, then pick the non-undef one.
```
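
The per-element rule outlined above could look roughly like the following. This is a sketch of the proposal only, not code from the patch: `Elt` and `mergeShiftAmounts` are hypothetical names, and an undef element is modelled with a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One vector element of a shift amount; `value` is ignored when undef.
struct Elt {
  bool isUndef;
  unsigned value;
};

// Merges two shift-amount vectors element by element.
// Writes the merged amounts to `out` and returns true, or returns false
// to signal a bailout.
bool mergeShiftAmounts(const std::vector<Elt> &a, const std::vector<Elt> &b,
                       std::vector<unsigned> &out) {
  out.clear();
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i].isUndef && b[i].isUndef) {
      out.push_back(0); // both undef: replace with constant 0
    } else if (!a[i].isUndef && !b[i].isUndef) {
      if (a[i].value != b[i].value)
        return false; // both defined but different: bail out
      out.push_back(a[i].value);
    } else {
      // exactly one is undef: pick the non-undef one
      out.push_back(a[i].isUndef ? b[i].value : a[i].value);
    }
  }
  return true;
}
```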

This is a re-commit, as the original patch, committed in rL337190,
was reverted in rL337344 because it broke the Chromium build:
https://bugs.llvm.org/show_bug.cgi?id=38204 and
https://crbug.com/864832
Proofs that the fixed folds are ok: https://rise4fun.com/Alive/VYM

Differential Revision: https://reviews.llvm.org/D49320

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337376 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 10:55:17 +00:00
Bob Haarman
adf4ac8b22 Revert "[InstCombine] Fold 'check for [no] signed truncation' pattern"
This reverts r337190 (and a few follow-up commits), which caused the
Chromium build to fail. See
https://bugs.llvm.org/show_bug.cgi?id=38204 and
https://crbug.com/864832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337344 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 02:18:28 +00:00
Vedant Kumar
f241fec5ea [InstCombine] Preserve debug value when simplifying cast-of-select
InstCombine has a cast transform that matches a cast-of-select:

  Orig = cast (Src = select Cond TV FV)

And tries to replace it with a select which has the cast folded in:

  NewSel = select Cond (cast TV) (cast FV)

The combiner does RAUW(Orig, NewSel), so any debug values for Orig would
survive the transform. But debug values for Src would be lost.

This patch teaches InstCombine to replace all debug uses of Src with
NewSel (taking care of doing any necessary DIExpression rewriting).

Differential Revision: https://reviews.llvm.org/D49270

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337310 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 18:08:36 +00:00
Florian Hahn
35603461c5 [IPSCCP] Run Solve each time we resolved an undef in a function.
Once we resolved an undef in a function we can run Solve, which could
lead to finding a constant return value for the function, which in turn
could turn undefs into constants in other functions that call it, before
resolving undefs there.

Computationally the amount of work we are doing stays the same, just the
order in which we process things is slightly different and potentially there
are a few fewer undefs to resolve.

We are still relying on the order of functions in the IR, which means
depending on the order, we are able to resolve the optimal undef first
or not. For example, if @test1 comes before @testf, we find the constant
return value of @testf too late and we cannot use it while solving
@test1.

This on its own does not lead to more constants removed in the
test-suite, probably because currently we have to be very lucky to visit
applicable functions in the right order.

Maybe we can come up with a better way of resolving undefs in more
'profitable' functions first.

Reviewers: efriedma, mssimpso, davide

Reviewed By: efriedma, davide

Differential Revision: https://reviews.llvm.org/D49385


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337283 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 14:04:59 +00:00
Simon Pilgrim
de720479bb [SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191)
TTI::getMinMaxReductionCost typically can't handle pointer types. Until this is changed, it's better to limit horizontal reduction to integer/float vector types only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337280 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 13:43:33 +00:00
whitequark
938172a55a [LLVM-C] Fix name mangling on AggressiveInstCombine
Similarly to rL336736, at least one more C API function does not
properly get declared as extern "C" due to a missing header, causing
name mangling and linking errors.

This patch fixes calls to LLVMAddAggressiveInstCombinerPass().

Differential Revision: https://reviews.llvm.org/D49416

Reviewed By: whitequark

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337264 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 11:13:58 +00:00
Simon Pilgrim
15fa57ae79 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337257 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 09:39:55 +00:00
Roman Lebedev
a5425a350e [InstCombine] Fold 'check for [no] signed truncation' pattern
Summary:
[[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]]

As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.

Proofs for this transform: https://rise4fun.com/Alive/mgu
This transform is surprisingly frustrating.
This does not deal with non-splat shift amounts, or with undef shift amounts.
I've outlined what I think the solution should be:
```
  // Potential handling of non-splats: for each element:
  //  * if both are undef, replace with constant 0.
  //    Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0.
  //  * if both are not undef, and are different, bailout.
  //  * else, only one is undef, then pick the non-undef one.
```

The DAGCombine will reverse this transform, see
https://reviews.llvm.org/D49266

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: JDevlieghere, rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D49320

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337190 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 16:45:42 +00:00
Teresa Johnson
92f5878901 Restore "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commit r337081, therefore restoring r337050 (and fix in
r337059), with test fix for bot failure described after the original
description below.

In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

The new commit removes a test that no longer makes sense
(Transforms/FunctionImport/hotness_based_import2.ll), as exposed by the
reverse-iteration bot. The test depends on the order of processing the
summary call edges, and actually depended on the old problematic
behavior of selecting more than one summary for a given GUID when
encountered with different thresholds. There was no guarantee even
before that we would eventually pick the linkonce copy with the hottest
call edges, it just happened to work with the test and the old code, and
there was no guarantee that we would end up importing the selected
version of the copy that had the hottest call edges (since the backend
would effectively import only one of the selected copies).

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D48670

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337184 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 15:30:27 +00:00
Alexander Potapenko
06a94a9b18 MSan: minor fixes, NFC
- remove an extra space after |ID| declaration
- drop the unused |FirstInsn| parameter in getShadowOriginPtrUserspace()


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337159 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 10:57:19 +00:00
Alexander Potapenko
0f2a60eba6 [MSan] factor userspace-specific declarations into createUserspaceApi(). NFC
This patch introduces createUserspaceApi() that creates function/global
declarations for symbols used by MSan in the userspace.
This is a step towards the upcoming KMSAN implementation patch.

Reviewed at https://reviews.llvm.org/D49292



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337155 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 10:03:30 +00:00
Chen Zheng
d27cef10a8 [InstCombine] add more SPFofSPF folding
Differential Revision: https://reviews.llvm.org/D49238


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337143 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 02:23:00 +00:00
Chen Zheng
727a214dd5 [InstCombine] fold icmp pred (sub 0, X) C for vector type
Differential Revision: https://reviews.llvm.org/D49283


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337141 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 00:51:40 +00:00
Michael J. Spencer
3da6ce5ebe Recommit r335794 "Add support for generating a call graph profile from Branch Frequency Info." with fix for removed functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337140 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 00:28:24 +00:00
Sanjay Patel
597deb9284 [InstCombine] Corrections in comments for division transformation (NFC)
The actual code seems to be correct, but the comments were misleading.

Patch by Aaron Puchert!

Differential Revision: https://reviews.llvm.org/D49276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337131 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-15 17:06:59 +00:00
Roman Lebedev
6e43f22733 [NFC][InstCombine] foldICmpWithLowBitMaskedVal(): update comments.
All predicates are handled.
There do not seem to be any other possible folds here.
There are some more folds possible with inverted mask though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337112 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 20:08:52 +00:00
Roman Lebedev
3a68fbf4b5 [InstCombine] Fold x & (-1 >> y) s< x to x s> (-1 >> y)
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/I3O

This pattern is not commutative!
We must make sure not to fold the commuted version!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337111 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 20:08:47 +00:00
Roman Lebedev
0d94eaa92c [InstCombine] Fold x & (-1 >> y) s>= x to x s<= (-1 >> y)
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/I3O

This pattern is not commutative!
We must make sure not to fold the commuted version!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337109 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 20:08:37 +00:00
Roman Lebedev
0e039b76e0 [InstCombine] Fold x s<= x & (-1 >> y) to x s<= (-1 >> y)
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/I3O

This pattern is not commutative!
We must make sure not to fold the commuted version!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337107 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 20:08:26 +00:00
Roman Lebedev
f938155483 [InstCombine] Fold x s> x & (-1 >> y) to x s> (-1 >> y)
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/I3O

This pattern is not commutative!
We must make sure not to fold the commuted version!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337105 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 20:08:16 +00:00
Roman Lebedev
d8e175bca5 [InstCombine] Fold x u<= x & C to x u<= C
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/Fqp

This pattern is not commutative. But InstSimplify will
already have taken care of the 'commutative' variant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337102 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 16:44:54 +00:00
Roman Lebedev
fc95a84f5d [InstCombine] Fold x u> x & C to x u> C
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/JvS

This pattern is not commutative. But InstSimplify will
already have taken care of the 'commutative' variant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337100 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 16:44:43 +00:00
Roman Lebedev
81c991bbc4 [InstCombine] Fold x & (-1 >> y) u< x to x u> (-1 >> y)
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/ocb

This pattern is not commutative. But InstSimplify will
already have taken care of the 'commutative' variant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337098 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 12:20:16 +00:00
Roman Lebedev
21d6697e49 [InstCombine] Fold x & (-1 >> y) u>= x to x u<= (-1 >> y)
https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/azI

This pattern is not commutative. But InstSimplify will
already have taken care of the 'commutative' variant.
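
A fold like this can be sanity-checked by brute force at a small bit width. The sketch below is not part of the patch and does not replace the Alive proof linked above. It assumes 8-bit values and a logical shift right, so that `-1 >> y` is a low-bit mask; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Checks, for one mask m and every 8-bit x, that
//   (x & m) u>= x   is equivalent to   x u<= m.
bool foldHoldsForMask(std::uint8_t m) {
  for (unsigned xi = 0; xi < 256; ++xi) {
    const std::uint8_t x = static_cast<std::uint8_t>(xi);
    const bool before = static_cast<std::uint8_t>(x & m) >= x;
    const bool after = x <= m;
    if (before != after)
      return false;
  }
  return true;
}

// Checks the fold for every low-bit mask of the form 0xFF >> y, y in [0, 8).
bool foldHoldsForAllLowBitMasks() {
  for (unsigned y = 0; y < 8; ++y) {
    if (!foldHoldsForMask(static_cast<std::uint8_t>(0xFFu >> y)))
      return false;
  }
  return true;
}
```

The low-bit-mask shape matters: for an arbitrary mask such as 0x02 the equivalence breaks (x = 1 gives `(1 & 2) u>= 1` false but `1 u<= 2` true), and `foldHoldsForMask(0x02)` reports that.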

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337096 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 12:20:06 +00:00
Teresa Johnson
43658456ae Revert "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commits r337050 and r337059. Caused failure in
reverse-iteration bot that needs more investigation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337081 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 01:45:49 +00:00
Tim Shen
c31e75d199 [LSR] If no Use is interesting, early return.
Summary:
By looking at the callers of getUse(), we can see that even though
IVUsers may offer uses, they may not be interesting to
LSR. It's possible that none of them is interesting.

Reviewers: sanjoy

Subscribers: jlebar, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D49049

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337072 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 23:40:00 +00:00
Vedant Kumar
2d1b15b036 Fix comments which mixed up 'before' and 'after', NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337061 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 22:39:31 +00:00
Teresa Johnson
3393e5b81d [ThinLTO] Ensure we always select the same function copy to import
In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.
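
The bookkeeping described above can be sketched as a per-module map from GUID to the largest threshold tried and the summary that was selected. This is a simplified illustration, not the code in the patch: `Summary`, `Entry`, `ImportThresholds`, `Selector` and `considerImport` are hypothetical names, and the `selectCallee` callback only stands in for the real selection logic.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>

struct Summary {
  unsigned instCount; // hypothetical stand-in for a function summary
};

struct Entry {
  unsigned threshold;      // largest threshold tried so far for this GUID
  const Summary *selected; // nullptr if no importable copy was found
};

using GUID = std::uint64_t;
using ImportThresholds = std::map<GUID, Entry>; // kept per importing module
using Selector = std::function<const Summary *(GUID, unsigned)>;

// Returns the summary to import for `guid`, or nullptr if none.
const Summary *considerImport(ImportThresholds &seen, GUID guid,
                              unsigned threshold,
                              const Selector &selectCallee) {
  auto it = seen.find(guid);
  if (it != seen.end()) {
    if (it->second.selected) {
      // A copy was already chosen: always reuse it so the backend sees
      // a single definition for this GUID.
      if (threshold > it->second.threshold)
        it->second.threshold = threshold;
      return it->second.selected;
    }
    // Already failed at the same or a higher threshold: skip the selector.
    if (it->second.threshold >= threshold)
      return nullptr;
  }
  const Summary *s = selectCallee(guid, threshold);
  seen[guid] = Entry{threshold, s};
  return s;
}
```

In this model, seeing a GUID first with a high threshold and later with a lower one returns the same summary both times and calls the selector only once.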

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D48670

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337050 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 21:35:51 +00:00
Vlad Tsyrklevich
7dc602e516 [LowerTypeTests] Limit when icall jumptable entries are emitted
Summary:
Currently LowerTypeTests emits jumptable entries for all live external
and address-taken functions; however, we could limit the number of
functions that we emit entries for significantly.

For Cross-DSO CFI, we continue to emit jumptable entries for all
exported definitions.  In the non-Cross-DSO CFI case, we only need to
emit jumptable entries for live functions that are address-taken in live
functions. This ignores exported functions and functions that are only
address taken in dead functions. This change uses ThinLTO summary data
(now emitted for all modules during ThinLTO builds) to determine
address-taken and liveness info.
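
One reading of the rule above, written as a predicate. This is a sketch under assumptions: `FunctionInfo` and `needsJumpTableEntry` are hypothetical names, the flags are assumed to come from the ThinLTO summary, and treating the Cross-DSO case as "exported definitions in addition to the address-taken rule" is an interpretation of the text, not a quote of the implementation.

```cpp
#include <cassert>

// Facts about one function, assumed to be derived from summary data.
struct FunctionInfo {
  bool isLive;                     // function itself is live
  bool isExportedDefinition;       // exported definition
  bool addressTakenInLiveFunction; // address taken by some live function
};

// Decides whether a jumptable entry is emitted for `f`.
bool needsJumpTableEntry(const FunctionInfo &f, bool crossDsoCfi) {
  // Cross-DSO CFI: keep emitting entries for all exported definitions.
  if (crossDsoCfi && f.isExportedDefinition)
    return true;
  // Otherwise: only live functions whose address is taken in live code.
  return f.isLive && f.addressTakenInLiveFunction;
}
```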

The logic for emitting jumptable entries is more conservative in the
regular LTO case because we don't have summary data in the case of
monolithic LTO builds; however, once summaries are emitted for all LTO
builds we can unify the Thin/monolithic LTO logic to only use summaries
to determine the liveness of address taking functions.

This change is a partial fix for PR37474. It reduces the build size for
nacl_helper by ~2-3%. The reduction is due to nacl_helper compiling in
lots of unused code, and to unused functions that are address taken in dead
functions no longer being considered live due to emitted jumptable
references. The reduction for chromium is ~0.1-0.2%.

Reviewers: pcc, eugenis, javed.absar

Reviewed By: pcc

Subscribers: aheejin, dexonsmith, dschuff, mehdi_amini, eraman, steven_wu, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D47652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337038 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 19:57:39 +00:00
Simon Pilgrim
1e086c7b69 [SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED-2)
We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well. It raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Reapplied with fix to only accept 2 different casts if they come from the same source type (PR38154).
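
As an illustration only, here is a sketch of the kind of scalar source that the alternate-opcode cast support is aimed at: four adjacent lanes that alternate between two different casts from the same source type. The function name and the lane count are assumptions made for this example, and whether the SLP vectorizer actually vectorizes it depends on the target and cost model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: adjacent lanes alternate between a signed and an
// unsigned int-to-float conversion. Both casts start from the same 32-bit
// integer source type, matching the restriction described above.
void convertAlternating(const int32_t *src, float *dst) {
  dst[0] = static_cast<float>(src[0]);                        // signed conversion
  dst[1] = static_cast<float>(static_cast<uint32_t>(src[1])); // unsigned conversion
  dst[2] = static_cast<float>(src[2]);                        // signed conversion
  dst[3] = static_cast<float>(static_cast<uint32_t>(src[3])); // unsigned conversion
}

// Small usage check showing that the two casts differ for negative inputs.
void demoConvertAlternating() {
  const int32_t src[4] = {-1, -1, 2, 3};
  float dst[4] = {};
  convertAlternating(src, dst);
  assert(dst[0] == -1.0f); // -1 interpreted as signed
  assert(dst[1] > 0.0f);   // the same bits interpreted as unsigned are positive
  assert(dst[2] == 2.0f);
  assert(dst[3] == 3.0f);
}
```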

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336989 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 11:09:52 +00:00
Sanjay Patel
8a35df349b [InstCombine] return when SimplifyAssociativeOrCommutative makes a change
This bug was created by rL335258 because we used to always call instsimplify
after trying the associative folds. After that change it became possible
for subsequent folds to encounter unsimplified code (and potentially assert
because of it). 

Instead of carrying changed state through instcombine, we can just return 
immediately. This allows instsimplify to run, so we can continue assuming
that easy folds have already occurred.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336965 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 01:18:07 +00:00
Piotr Padlewski
674f0a1174 Simplify recursive launder.invariant.group and strip
Summary:
This patch is crucial for proving equality of laundered/stripped
pointers, e.g.:

  bool foo(A *a) {
    return a == std::launder(a);
  }

Clang with -fstrict-vtable-pointers will emit something like:

    define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
    entry:
      %c = bitcast %struct.A* %a to i8*
      %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
      %0 = bitcast %struct.A* %a to i8*
      %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
      %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
      %cmp = icmp eq i8* %1, %2
      ret i1 %cmp
    }

Because %2 can be replaced with @llvm.strip.invariant.group(%0),
and %2 and %1 then produce the same value (because strip is readnone),
we can replace the compare with true.

Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D47423

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336963 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 23:55:20 +00:00
Martin Storsjo
54919303bf Revert "[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)"
This reverts commit r336812, which broke compilation of a number
of projects, see PR38154.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336949 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 21:33:42 +00:00
Matt Morehouse
3418676967 [SanitizerCoverage] Add associated metadata to 8-bit counters.
Summary:
This allows counters associated with unused functions to be
dead-stripped along with their functions.  This approach is the same one
we used for PC tables.

Fixes an issue where LLD removes an unused PC table but leaves the 8-bit
counter.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: llvm-commits, hiraditya, kcc

Differential Revision: https://reviews.llvm.org/D49264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336941 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 20:24:58 +00:00
Roman Lebedev
83a86dd616 [InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y)
Summary:
A complementary fold to D49179.

https://bugs.llvm.org/show_bug.cgi?id=38123
https://rise4fun.com/Alive/Rny

Caveat: one more thing in `test/Transforms/InstCombine/icmp-logical.ll` breaks.
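
Independently of the Alive proof linked above, the fold can be sanity-checked by brute force at a small bit width. The sketch below assumes 8-bit values, a logical right shift, and in-range shift amounts (y < 8); the function names are made up for this example and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `x & (-1 >> y) != x` and `x u> (-1 >> y)` agree for
// every 8-bit x and every shift amount y in [0, 8).
bool foldHoldsForAllI8() {
  for (unsigned y = 0; y < 8; ++y) {
    // -1 >> y as a logical shift on an 8-bit all-ones value.
    const uint8_t mask = static_cast<uint8_t>(0xFFu >> y);
    for (unsigned x = 0; x < 256; ++x) {
      const bool original = (x & mask) != x; // pattern before the fold
      const bool folded = x > mask;          // unsigned compare after the fold
      if (original != folded)
        return false;
    }
  }
  return true;
}

// Hypothetical driver: aborts if any (x, y) pair disagrees.
void checkFold() { assert(foldHoldsForAllI8()); }
```

The equivalence holds because the mask always has the form 2^k - 1, so x has a bit set outside the mask exactly when x is greater than the mask as an unsigned value.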

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336911 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 14:56:12 +00:00
David Green
ecc246961c [UnJ] Use SmallPtrSets for block collections. NFC
We no longer care about the order of blocks in these collections,
so we can change to SmallPtrSets, making contains checks quicker.
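
A rough sketch of the trade-off, using standard containers as stand-ins so that it compiles without LLVM headers: std::unordered_set is only an approximation of llvm::SmallPtrSet, and Block is a hypothetical placeholder for a basic block. The point is that once iteration order no longer matters, membership can be answered by a set lookup instead of a linear scan.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

struct Block { int id; }; // hypothetical stand-in for a basic block

// Ordered collection: a membership test walks the elements one by one.
bool containsOrdered(const std::vector<const Block *> &blocks, const Block *b) {
  return std::find(blocks.begin(), blocks.end(), b) != blocks.end();
}

// Pointer set: a membership test is a lookup, but iteration order is unspecified.
bool containsUnordered(const std::unordered_set<const Block *> &blocks,
                       const Block *b) {
  return blocks.count(b) != 0;
}

// Both containers give the same answers to membership queries.
void demoMembership() {
  Block a{1}, b{2}, c{3};
  const std::vector<const Block *> ordered{&a, &b};
  const std::unordered_set<const Block *> unordered{&a, &b};
  assert(containsOrdered(ordered, &a) && containsUnordered(unordered, &a));
  assert(!containsOrdered(ordered, &c) && !containsUnordered(unordered, &c));
}
```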

Differential revision: https://reviews.llvm.org/D49060



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336897 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 10:44:47 +00:00