22429 Commits

Author SHA1 Message Date
Roman Lebedev
5a5bcb7952 [SimplifyCFG] mergeConditionalStoreToAddress(): try to pacify MSAN
MSAN bot complains that there is use-of-uninitialized-value
of this FreeStores later in IsWorthwhile().
Perhaps FreeStores needs to be stored in a vector?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372262 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-18 21:04:39 +00:00
Roman Lebedev
ebc98fa7a0 [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251)
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.

This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.

The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..

Proofs: https://rise4fun.com/Alive/21b

Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)

With this, i see no other missing folds for
https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67412

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372257 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-18 20:10:07 +00:00
Roman Lebedev
2ecd35251f [SimplifyCFG] mergeConditionalStoreToAddress(): consider cost, not instruction count
Summary:
As it can be see in the changed test, while `div` is really costly,
we were speculating it. This does not seem correct.

Also, the old code would run for every single insturuction in BB,
instead of eagerly bailing out as soon as there are too many instructions.

This function still has a problem that `PHINodeFoldingThreshold` is
per-basic-block, while it should be for all the basic blocks.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372255 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-18 19:46:57 +00:00
Roman Lebedev
ce09123b44 [InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before upcoming patch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372245 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-18 18:38:40 +00:00
Wei Mi
2e1df5f759 [SampleFDO] Minimize performance impact when profile-sample-accurate
is enabled.

We can save memory and reduce binary size significantly by enabling
ProfileSampleAccurate. However when ProfileSampleAccurate is true,
function without sample will be regarded as cold and this could
potentially cause performance regression.

To minimize the potential negative performance impact, we want to be
a little conservative here saying if a function shows up in the profile,
no matter as outline instance, inline instance or call targets, treat
the function as not being cold. This will handle the cases such as most
callsites of a function are inlined in sampled binary (thus outline copy
don't get any sample) but not inlined in current build (because of source
code drift, imprecise debug information, or the callsites are all cold
individually but not cold accumulatively...), so that the outline function
showing up as cold in sampled binary will actually not be cold after current
build. After the change, such function will be treated as not cold even
profile-sample-accurate is enabled.

At the same time we lower the hot criteria of callsiteIsHot check when
profile-sample-accurate is enabled. callsiteIsHot is used to determined
whether a callsite is hot and qualified for early inlining. When
profile-sample-accurate is enabled, functions without profile will be
regarded as cold and much less inlining will happen in CGSCC inlining pass,
so we can worry less about size increase and be aggressive to allow more
early inlining to happen for warm callsites and it is helpful for performance
overall.

Differential Revision: https://reviews.llvm.org/D67561


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372232 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-18 16:06:28 +00:00
Sanjay Patel
e700dd9b79 [SimplifyLibCalls] fix crash with empty function name (PR43347)
...and improve some variable names while here.

https://bugs.llvm.org/show_bug.cgi?id=43347

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372227 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-18 14:33:40 +00:00
Teresa Johnson
584a08b94d [PGO] Change hardcoded thresholds for cold/inlinehint to use summary
Summary:
The PGO counter reading will add cold and inlinehint (hot) attributes
to functions that are very cold or hot. This was using hardcoded
thresholds, instead of the profile summary cutoffs which are used in
other hot/cold detection and are more dynamic and adaptable. Switch
to using the summary-based cold/hot detection.

The hardcoded limits were causing some code that had a medium level of
hotness (per the summary) to be incorrectly marked with a cold
attribute, blocking inlining.

Reviewers: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67673

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372189 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 23:12:13 +00:00
Reid Kleckner
f624998449 [PGO] Don't use comdat groups for counters & data on COFF
For COFF, a comdat group is really a symbol marked
IMAGE_COMDAT_SELECT_ANY and zero or more other symbols marked
IMAGE_COMDAT_SELECT_ASSOCIATIVE. Typically the associative symbols in
the group are not external and are not referenced by other TUs, they are
things like debug info, C++ dynamic initializers, or other section
registration schemes. The Visual C++ linker reports a duplicate symbol
error for symbols marked IMAGE_COMDAT_SELECT_ASSOCIATIVE even if they
would be discarded after handling the leader symbol.

Fixes coverage-inline.cpp in check-profile after r372020.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372182 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 21:10:49 +00:00
Roman Lebedev
0b9b1fa8a1 [NFC][InstCombine] dropRedundantMaskingOfLeftShiftInput(): some NFC diff shaving
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372171 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 19:32:26 +00:00
David Bolvansky
833ff6adf3 Reland "[SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, '\0', y)"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372142 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 17:12:24 +00:00
Krasimir Georgiev
d5f8516fb7 Revert "[SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, '\0', y)"
Summary:
This reverts commit r372101.

Causes ASAN build bot failures:

http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/14176
From http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/14176/steps/64-bit%20check-asan/logs/stdio:

```
[ RUN      ] AddressSanitizer.StrNCatOOBTest
/home/buildbots/ppc64be-sanitizer/sanitizer-ppc64be/build/llvm-project/compiler-rt/lib/asan/tests/asan_str_test.cpp:462: Failure
Death test: strncat(to - 1, from, 0)
    Result: failed to die.
```

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67658

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372125 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 14:15:23 +00:00
Simon Pilgrim
f2aa921cfe [LoopVectorize] Don't dereference a dyn_cast result. NFCI.
The static analyzer is warning about potential null dereferences of dyn_cast<> results, we can use cast<> directly as we know that these cases should all be CastInst, which is why its working atm and anyway cast<> will assert if they aren't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372116 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 13:24:54 +00:00
Benjamin Kramer
1d7df59e12 Hide implementation details in namespaces.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372113 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 12:56:29 +00:00
Johannes Doerfert
80781fec87 [Attributor][Fix] Initialize the cache prior to using it
Summary:
There were segfaults as we modified and iterated the instruction maps in
the cache at the same time. This was happening because we created new
instructions while we populated the cache. This fix changes the order
in which we perform these actions. First, the caches for the whole
module are created, then we start to create abstract attributes.

I don't have a unit test but the LLVM test suite exposes this problem.

Reviewers: uenoku, sstefan1

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372105 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 10:52:41 +00:00
David Bolvansky
d15afd3638 [SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, '\0', y)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372101 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 10:25:38 +00:00
David Bolvansky
d4e3b6ef79 [InstCombine] Annotate strdup with deref_or_null
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372098 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 10:12:48 +00:00
David Bolvansky
f0f4b7afaa [NFCI] Fixed buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372097 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 10:03:45 +00:00
Fangrui Song
2920626e02 [SimplifyLibCalls] Fix -Wunused-result after D53342/r372091
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372096 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:56:55 +00:00
David Bolvansky
077e4d5d58 [SimplifyLibCalls] Mark known arguments with nonnull
Reviewers: efriedma, jdoerfert

Reviewed By: jdoerfert

Subscribers: ychen, rsmith, joerg, aaron.ballman, lebedev.ri, uenoku, jdoerfert, hfinkel, javed.absar, spatel, dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D53342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372091 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:32:52 +00:00
Florian Hahn
bac41ae77d [LoopUnroll] Use LoopSize+1 as threshold, to allow unrolling loops matching LoopSize.
We use `< UP.Threshold` later on, so we should use LoopSize + 1, to
allow unrolling if the result won't exceed to loop size.

Fixes PR43305.

Reviewers: efriedma, dmgreen, paquette

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D67594

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372084 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 09:02:48 +00:00
Hideto Ueno
f6a402010b [Attributor] Use Alias Analysis in noalias callsite argument deduction
Summary: This patch adds a check of alias analysis in `noalias` callsite argument deduction.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372075 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 06:53:27 +00:00
Hideto Ueno
8a6eaada3c [Attributor] Create helper struct for handling analysis getters
Summary: This patch introduces a helper struct `AnalysisGetter` to put together analysis getters. In this patch, a getter for `AAResult` is also added for  `noalias`.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372072 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-17 05:45:18 +00:00
Reid Kleckner
713b7a45a9 [PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups
This fixes relocations against __profd_ symbols in discarded sections,
which is PR41380.

In general, instrumentation happens very early, and optimization and
inlining happens afterwards. The counters for a function are calculated
early, and after inlining, counters for an inlined function may be
widely referenced by other functions.

For C++ inline functions of all kinds (linkonce_odr &
available_externally mainly), instr profiling wants to deduplicate these
__profc_ and __profd_ globals. Otherwise the binary would be quite
large.

I made __profd_ and __profc_ comdat in r355044, but I chose to make
__profd_ internal. At the time, I was only dealing with coverage, and in
that case, none of the instrumentation needs to reference __profd_.
However, if you use PGO, then instrumentation passes add calls to
__llvm_profile_instrument_range which reference __profd_ globals. The
solution is to make these globals externally visible by using
linkonce_odr linkage for data as was done for counters.

This is safe because PGO adds a CFG hash to the names of the data and
counter globals, so if different TUs have different globals, they will
get different data and counter arrays.

Reviewers: xur, hans

Differential Revision: https://reviews.llvm.org/D67579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372020 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 18:49:09 +00:00
Roman Lebedev
d1f4a55c8c [SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost
Summary:
Previously, if the threshold was 2, we were willing to speculatively
execute 2 cheap instructions in both basic blocks (thus we were willing
to speculatively execute cost = 4), but weren't willing to speculate
when one BB had 3 instructions and other one had no instructions,
even thought that would have total cost of 3.

This looks inconsistent to me.
I don't think `cmov`-like instructions will start executing
until both of it's inputs are available: https://godbolt.org/z/zgHePf
So i don't see why the existing behavior is the correct one.

Also, let's add it's own `cl::opt` for this threshold,
with default=4, so it is not stricter than the previous threshold:
will allow to fold when there are 2 BB's each with cost=2.
And since the logic has changed, it will also allow to fold when
one BB has cost=3 and other cost=1, or there is only one BB with cost=4.

This is an alternative solution to D65148:
This fix is mainly motivated by `signbit-like-value-extension.ll` test.
That pattern comes up in JPEG decoding, see e.g.
`Figure F.12 – Extending the sign bit of a decoded value in V`
of `ITU T.81` (JPEG specification).
That branch is not predictable, and it is within the innermost loop,
so the fact that that pattern ends up being stuck with a branch
instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

We significantly decrease basic block count, notably decrease instruction count,
significantly decrease branch count and very significantly increase `cmov` count.

Performance-wise, unsurprisingly, this has great effect on
target RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681
```

That being said, it's always best-effort, so there will likely
be cases where this worsens things.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372009 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 16:18:24 +00:00
Sanjay Patel
49121b4df9 [InstCombine] remove unneeded one-use checks for icmp fold
Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372007 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 16:15:25 +00:00
Sanjay Patel
cba3b2c1c3 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371981 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:54:34 +00:00
Sanjay Patel
1a3c0ac4ea [InstCombine] fix comments to match code; NFC
This blob was written before match() existed, so it
could probably be reduced significantly.

But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371978 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:12:05 +00:00
Simon Pilgrim
f4c41d6cba [VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.
The static analyzer is warning about a potential null dereference of the cast_or_null result, I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371975 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 11:22:44 +00:00
Simon Pilgrim
5a41182176 [SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference warning. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371974 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 10:48:16 +00:00
Simon Pilgrim
c46379da2c [SLPVectorizer] Don't dereference a dyn_cast result. NFCI.
The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371973 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 10:35:09 +00:00
Stefan Stipanovic
e342d834af [Attributor] Heap-To-Stack Conversion
D53362 gives a prototype heap-to-stack conversion pass. With addition of new attributes in the attributor, this can now be revisted and improved. This will place it in the Attributor to make it easier to use new attributes (eg. nofree, nosync, willreturn, etc.) and other attributor features.

Reviewers: jdoerfert, uenoku, hfinkel, efriedma

Subscribers: lebedev.ri, xbolva00, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D65408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371942 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 21:47:41 +00:00
Sanjay Patel
3050f70ce7 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371940 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 20:56:34 +00:00
Simon Pilgrim
81bcd7a858 [LoadStoreVectorizer] vectorizeLoadChain - ensure we find a valid Type down the load chain. NFCI.
Silence static analyzer uninitialized variable warning by setting the LoadTy to null and then asserting we find a real value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371936 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 16:44:35 +00:00
Sanjay Patel
71f28d66dd [SLP] limit vectorization of Constant subclasses (PR33958)
This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=33958

It seems universally true that we would not want to transform this kind of
sequence on any target, but if that's not correct, then we could view this
as a target-specific cost model problem. We could also white-list ConstantInt,
ConstantFP, etc. rather than blacklist Global and ConstantExpr.

Differential Revision: https://reviews.llvm.org/D67362

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371931 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 13:03:24 +00:00
Johannes Doerfert
60c9595c32 [Attributor][Fix] Use right type to replace expressions
Summary: This should be obsolete once the functionality in D66967 is integrated.

Reviewers: uenoku, sstefan1

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371915 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-14 02:57:50 +00:00
Florian Hahn
768257cad5 [BasicBlockUtils] Add optional BBName argument, in line with BB:splitBasicBlock
Reviewers: spatel, asbirlea, craig.topper

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D67521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371819 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-13 08:03:32 +00:00
Alina Sbirlea
333387e4a8 [MemorySSA] Pass (for update) MSSAU when hoisting instructions.
Summary: Pass MSSAU to makeLoopInvariant in order to properly update MSSA.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67470

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371748 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 17:12:51 +00:00
Sanjay Patel
21dbe0be4a [InstCombine] rename variable for readability; NFC
There's more that can be done here, but "OpI"
doesn't convey that we casted to BinaryOperator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371682 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 22:31:34 +00:00
Eli Friedman
3473785eb3 [ConstantHoisting] Fix non-determinism.
Differential Revision: https://reviews.llvm.org/D66114



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371644 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 18:55:00 +00:00
Petr Hosek
62e7ac6bbc Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300

We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.

We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.

A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.

In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect

Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371635 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 16:19:50 +00:00
Florian Hahn
c60c66e7ae Revert [InstCombine] Use SimplifyFMulInst to simplify multiply in fma.
This introduces additional rounding error in some cases. See D67434.

This reverts r371518 (git commit 18a1f0818b659cee13865b4fad2648d85984a4ed)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371634 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 16:17:03 +00:00
Whitney Tsang
0123ecb125 LLVM: Optimization Pass: Remove conflicting attribute, if any, before
adding new read attribute to an argument
Summary: Update optimization pass to prevent adding read-attribute to an
argument without removing its conflicting attribute.

A read attribute, based on the result of the attribute deduction
process, might be added to an argument. The attribute might be in
conflict with other read/write attribute currently associated with the
argument. To ensure the compatibility of attributes, conflicting
attribute, if any, must be removed before a new one is added.

The following snippet shows the current behavior of the compiler, where
the compilation process is aborted due to incompatible attributes.

$ cat x.ll
; ModuleID = 'x.bc'

%_type_of_d-ccc = type <{ i8*, i8, i8, i8, i8 }>

@d-ccc = internal global %_type_of_d-ccc <{ i8* null, i8 1, i8 13, i8 0,
i8 -127 }>, align 8

define void @foo(i32* writeonly %.aaa) {
foo_entry:
  %_param_.aaa = alloca i32*, align 8
  store i32* %.aaa, i32** %_param_.aaa, align 8
  store i8 0, i8* getelementptr inbounds (%_type_of_d-ccc,
%_type_of_d-ccc* @d-ccc, i32 0, i32 3)
  ret void
}

$ opt -O3 x.ll
Attributes 'readnone and writeonly' are incompatible!
void (i32*)* @foo
in function foo
LLVM ERROR: Broken function found, compilation aborted!
The purpose of this changeset is to fix the above error. This fix is
based on a suggestion from Johannes @jdoerfert (many thanks!!!)
Authored By: anhtuyen
Reviewer: nicholas, rnk, chandlerc, jdoerfert
Reviewed By: rnk
Subscribers: hiraditya, jdoerfert, llvm-commits, anhtuyen, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D58694

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371622 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 14:26:22 +00:00
Sanjay Patel
3813551fb7 [InstCombine] fold sign-bit compares of srem
(srem X, pow2C) sgt/slt 0 can be reduced using bit hacks by masking
off the sign bit and the module (low) bits:
https://rise4fun.com/Alive/jSO
A '2' divisor allows slightly more folding:
https://rise4fun.com/Alive/tDBM

Any chance to remove an 'srem' use is probably worthwhile, but this is limited
to the one-use improvement case because doing more may expose other missing
folds. That means it does nothing for PR21929 yet:
https://bugs.llvm.org/show_bug.cgi?id=21929

Differential Revision: https://reviews.llvm.org/D67334

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371610 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 12:04:26 +00:00
David Bolvansky
43dfaac978 [InstCombine] Fixed handling of isOpNewLike (PR11748)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371602 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 10:37:03 +00:00
Florian Hahn
fcb40334f1 [LoopInterchange] Drop unused splitInnerLoopHeader declaration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371601 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 10:32:15 +00:00
Dmitri Gribenko
b91891ad06 Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This reverts commit r371584. It introduced a dependency from compiler-rt
to llvm/include/ADT, which is problematic for multiple reasons.

One is that it is a novel dependency edge, which needs cross-compliation
machinery for llvm/include/ADT (yes, it is true that right now
compiler-rt included only header-only libraries, however, if we allow
compiler-rt to depend on anything from ADT, other libraries will
eventually get used).

Secondly, depending on ADT from compiler-rt exposes ADT symbols from
compiler-rt, which would cause ODR violations when Clang is built with
the profile library.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371598 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 09:16:17 +00:00
Florian Hahn
e72ccee9d5 [LoopInterchange] Properly move condition, induction increment and ops to latch.
Currently we only rely on the induction increment to come before the
condition to ensure the required instructions get moved to the new
latch.

This patch duplicates and moves the required instructions to the
newly created latch. We move the condition to the end of the new block,
then process its operands. We stop at operands that are defined
outside the loop, or are the induction PHI.

We duplicate the instructions and update the uses in the moved
instructions, to ensure other users remain intact. See the added
test2 for such an example.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D67367

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371595 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 08:23:23 +00:00
Hideto Ueno
df18bf27eb [Attributor] Implement "noalias" callsite argument deduction
Summary: Now, `nocapture` is deduced in Attributor therefore, this patch introduces deduction for `noalias` callsite argument using `nocapture`.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67286

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371590 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 07:00:33 +00:00
Hideto Ueno
58dc7cdf01 [Attributor][Fix] Manifest nocapture only in CSArgument or Argument
Summary:
We can query to Attributor whether the value is captured in the scope or not on the following way:

```
    const auto & NoCapAA = A.getAAFor<AANoCapture>(*this, IRPosition::value(V));
```
And if V is CallSiteReturned then `getDeducedAttribute` will add `nocatpure` to the callsite returned value. It is not valid.
This patch checks the position is an argument or call site argument.

This is tested in D67286.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371589 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 06:52:11 +00:00
Alexey Lapshin
6a1542545b [Debuginfo][Instcombiner] Do not clone dbg.declare.
TryToSinkInstruction() has a bug: While updating debug info for
sunk instruction, it could clone dbg.declare intrinsic.
That is wrong. There could be only one dbg.declare.
The fix is to not clone dbg.declare intrinsic and to update
it`s arguments, to not to point to sunk instruction.

Differential Revision: https://reviews.llvm.org/D67217

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371587 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 06:07:16 +00:00