185311 Commits

Author SHA1 Message Date
Vedant Kumar
c7b6e02431 [Coverage] Assert that filenames in a TU are unique, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372024 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 19:08:41 +00:00
Steven Wu
1cb2400471 [LTO][Legacy] Add new C interface to query libcall functions
Summary:
This is needed to implement the same approach as lld (implemented in r338434)
for handling symbols that can be generated by the LTO code generator but are
not present in the symbol table, for linkers that use the legacy C APIs.

libLTO is in charge of providing the list of symbols. Linker is in
charge of implementing the eager loading from static libraries using
the list of symbols.

rdar://problem/52853974

Reviewers: tejohnson, bd1976llvm, deadalnix, espindola

Reviewed By: tejohnson

Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372021 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 18:49:54 +00:00
Reid Kleckner
713b7a45a9 [PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups
This fixes relocations against __profd_ symbols in discarded sections,
which is PR41380.

In general, instrumentation happens very early, and optimization and
inlining happen afterwards. The counters for a function are calculated
early, and after inlining, counters for an inlined function may be
widely referenced by other functions.

For C++ inline functions of all kinds (linkonce_odr &
available_externally mainly), instr profiling wants to deduplicate these
__profc_ and __profd_ globals. Otherwise the binary would be quite
large.

I made __profd_ and __profc_ comdat in r355044, but I chose to make
__profd_ internal. At the time, I was only dealing with coverage, and in
that case, none of the instrumentation needs to reference __profd_.
However, if you use PGO, then instrumentation passes add calls to
__llvm_profile_instrument_range which reference __profd_ globals. The
solution is to make these globals externally visible by using
linkonce_odr linkage for data as was done for counters.

This is safe because PGO adds a CFG hash to the names of the data and
counter globals, so if different TUs have different globals, they will
get different data and counter arrays.

Reviewers: xur, hans

Differential Revision: https://reviews.llvm.org/D67579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372020 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 18:49:09 +00:00
Roman Lebedev
ae054dc434 [ARM][Codegen] Autogenerate arm-cgp-casts.ll test.
Apparently it got broken by r372009, while I thought it was r372012.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372019 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 18:28:22 +00:00
Simon Pilgrim
a8399fa299 [X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372013 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 17:30:33 +00:00
David Green
3be9fc5a0d [ARM] A predicate cast of a predicate cast is a predicate cast
This adds some very basic folding of PREDICATE_CASTS, removing cases when they
are chained together. These would already be removed eventually, as these are
lowered to copies. This just allows it to happen earlier, which can help other
simplifications.
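
The folding described above can be sketched on a miniature IR. This is only an illustration under stated assumptions: `Node` and `foldPredicateCast` are hypothetical names, and the integer `resultType` is a stand-in for a predicate vector type; none of this is the SelectionDAG code touched by the patch.

```cpp
#include <cassert>

// Hypothetical miniature IR node; not the SelectionDAG data structure.
struct Node {
  bool isPredicateCast;  // true for a cast node, false for any other value
  int resultType;        // stand-in for the predicate vector type
  const Node *operand;   // source of a cast; nullptr for non-cast nodes
};

// Given a cast node, build a cast with the same result type that reads
// directly from the first non-cast source, skipping any chained casts.
Node foldPredicateCast(const Node &outer) {
  const Node *source = outer.operand;
  while (source != nullptr && source->isPredicateCast)
    source = source->operand;
  return Node{true, outer.resultType, source};
}
```

In this model a chain `cast(cast(x))` collapses to a single `cast(x)` that keeps the outer result type.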

Differential Revision: https://reviews.llvm.org/D67591


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372012 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 17:29:07 +00:00
Roman Lebedev
d1f4a55c8c [SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost
Summary:
Previously, if the threshold was 2, we were willing to speculatively
execute 2 cheap instructions in both basic blocks (thus we were willing
to speculatively execute cost = 4), but weren't willing to speculate
when one BB had 3 instructions and other one had no instructions,
even though that would have a total cost of 3.

This looks inconsistent to me.
I don't think `cmov`-like instructions will start executing
until both of their inputs are available: https://godbolt.org/z/zgHePf
So I don't see why the existing behavior is the correct one.

Also, let's add a dedicated `cl::opt` for this threshold,
with default=4, so that it is not stricter than the previous threshold:
it still allows folding when there are 2 BBs, each with cost=2.
And since the logic has changed, it also allows folding when
one BB has cost=3 and the other has cost=1, or when there is only one BB with cost=4.
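
The difference between the two policies can be sketched with a toy model. The helper names `oldWouldFold` and `newWouldFold` and the plain integer costs are assumptions made for illustration; this is not the SimplifyCFG implementation.

```cpp
#include <cassert>

// Toy model of the two policies. Each argument is the speculation cost of
// one of the two blocks feeding the PHI (0 if that block is empty).
// Hypothetical helpers, not LLVM API.

// Old policy: each block is checked against the threshold on its own.
bool oldWouldFold(int costA, int costB, int perBlockThreshold = 2) {
  return costA <= perBlockThreshold && costB <= perBlockThreshold;
}

// New policy: the sum over both blocks is checked against a single budget.
bool newWouldFold(int costA, int costB, int totalThreshold = 4) {
  return costA + costB <= totalThreshold;
}
```

With these defaults, costs (2, 2) fold under both policies, while (3, 0), (3, 1) and (4, 0) fold only under the new one.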

This is an alternative solution to D65148:
This fix is mainly motivated by `signbit-like-value-extension.ll` test.
That pattern comes up in JPEG decoding, see e.g.
`Figure F.12 – Extending the sign bit of a decoded value in V`
of `ITU T.81` (JPEG specification).
That branch is not predictable, and it is within the innermost loop,
so the fact that that pattern ends up being stuck with a branch
instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

We significantly decrease basic block count, notably decrease instruction count,
significantly decrease branch count and very significantly increase `cmov` count.

Performance-wise, unsurprisingly, this has a great effect on
the target RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681
```

That being said, it's always best-effort, so there will likely
be cases where this worsens things.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372009 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 16:18:24 +00:00
Sanjay Patel
49121b4df9 [InstCombine] remove unneeded one-use checks for icmp fold
Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372007 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 16:15:25 +00:00
Sanjay Patel
bada9fa12d [InstCombine] move tests for icmp+add; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372004 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:33:40 +00:00
Oliver Cruickshank
4e7f7eff1b [ARM] Add patterns for BSWAP intrinsic on MVE
BSWAP can use the VREV instruction on MVE to produce better results than
expanding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372002 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:20:10 +00:00
Oliver Cruickshank
c3d8b899f0 [ARM] Add patterns for bitreverse intrinsic on MVE
BITREVERSE can use the VBRSR which will reverse and right shift.
Shifting right by 0 will just reverse the bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372001 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:20:03 +00:00
Oliver Cruickshank
2498183766 [ARM] Lower CTTZ on MVE
Lower CTTZ on MVE using VBRSR and VCLZ, which will reverse the bits and
count the leading zeros, equivalent to a count trailing zeros (CTTZ).
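
The identity behind this lowering can be checked with a scalar sketch: reversing the bits and then counting leading zeros gives the trailing-zero count. The function names below are hypothetical, and the code models one 32-bit lane in portable C++ rather than the MVE vector instructions.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bit order of a 32-bit value (scalar stand-in for the vector
// bit reversal).
uint32_t bitReverse32(uint32_t x) {
  uint32_t r = 0;
  for (int i = 0; i < 32; ++i) {
    r = (r << 1) | (x & 1u);
    x >>= 1;
  }
  return r;
}

// Count leading zeros; returns 32 for an input of 0.
unsigned countLeadingZeros32(uint32_t x) {
  unsigned n = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
    ++n;
  return n;
}

// cttz(x) == ctlz(bitreverse(x)); returns 32 for an input of 0.
unsigned countTrailingZeros32(uint32_t x) {
  return countLeadingZeros32(bitReverse32(x));
}
```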

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372000 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:19:56 +00:00
Oliver Cruickshank
e68996a611 [ARM] Add patterns for CTLZ on MVE
CTLZ intrinsic can use the VCLZ instruction on MVE, which produces
better results than expanding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371999 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:19:49 +00:00
Simon Pilgrim
f4ee527f9a [ExecutionEngine] Don't dereference a dyn_cast result. NFCI.
The static analyzer is warning about potential null dereferences of dyn_cast<> results. In these cases we can safely use cast<> directly, as we know these values should all be of the correct type (which is why it's working at the moment), and cast<> will assert if they aren't.
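
The distinction can be sketched in standard C++. Note the assumptions: `tryCast` and `checkedCast` are hypothetical helpers built on `dynamic_cast`, whereas LLVM's `dyn_cast<>` and `cast<>` use LLVM's own RTTI scheme; the small class hierarchy is also made up for the example.

```cpp
#include <cassert>

// Made-up hierarchy for the example; not LLVM's Value classes.
struct Value { virtual ~Value() = default; };
struct Instruction : Value { int opcode = 7; };
struct Constant : Value {};

// dyn_cast-like: returns nullptr on a type mismatch, so every caller has to
// check the result before dereferencing it.
template <typename To> To *tryCast(Value *v) {
  return dynamic_cast<To *>(v);
}

// cast-like: the caller states that the type is known; a mismatch trips the
// assertion (in builds with assertions enabled) instead of yielding null.
template <typename To> To *checkedCast(Value *v) {
  To *result = dynamic_cast<To *>(v);
  assert(result && "checkedCast<> argument of incompatible type");
  return result;
}
```

Dereferencing the result of `tryCast` without a null check is the pattern an analyzer flags; `checkedCast` documents the invariant and checks it in assertion-enabled builds.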


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371998 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 15:19:11 +00:00
Sjoerd Meijer
4da58da41b [LV] Add ARM MVE tail-folding tests
Now that the vectorizer can do tail-folding (rL367592), and the ARM backend
understands MVE masked loads/stores (rL371932), it's time to add the MVE
tail-folding equivalent of the X86 tests that I added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371996 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:56:26 +00:00
Jonas Paulsson
6f3a6de0a7 [SystemZ] Call erase() on the right MBB in SystemZTargetLowering::emitSelect()
Since MBB was split *before* MI, the MI(s) will reside in JoinMBB (MBB) at
the point of erasing them, so calling StartMBB->erase() is actually wrong,
although it is "working" by all appearances.
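
As a loose analogy only, the same ownership question shows up with `std::list`: after a split, an iterator obtained before the split refers to an element that now lives in the other list, and it must be erased through that list. `Block` and `splitBefore` are hypothetical names; this does not model MachineBasicBlock internals.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Toy model: a "block" is a list of instruction IDs.
using Block = std::list<int>;

// Split `start` before `splitPoint`: everything from splitPoint onwards is
// moved into `join`. Iterators to the moved elements stay valid but now
// refer into `join`.
void splitBefore(Block &start, Block::iterator splitPoint, Block &join) {
  join.splice(join.begin(), start, splitPoint, start.end());
}
```

After `splitBefore(start, mi, join)`, the correct call is `join.erase(mi)`. Calling `start.erase(mi)` would be undefined behavior for `std::list`, even if it happened to appear to work.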

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371995 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:49:36 +00:00
Guillaume Chatelet
5932076b5e [NFC] remove unused functions
Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67616

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371994 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:48:58 +00:00
Matt Arsenault
dc25a20424 AMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source
This was producing an illegal copy which would hit an assert
later. Error on selection for now until this is implemented.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371993 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:26:14 +00:00
Matt Arsenault
fef53526a6 AMDGPU/GlobalISel: Fix some broken run lines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371992 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:14:40 +00:00
Matt Arsenault
8f86794aa2 AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371991 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:14:37 +00:00
Matt Arsenault
4fb941a716 AMDGPU/GlobalISel: Remove another illegal select test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371990 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:14:31 +00:00
Clement Courbet
dea3e0608c [X86][NFC] Add a use-aa feature.
Summary:
This allows enabling useaa on the command-line and will allow enabling the
feature on a per-CPU basis where benchmarking shows improvements.

This is modelled after the ARM/AArch64 target.

Reviewers: RKSimon, andreadb, craig.topper

Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67266

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371989 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:05:28 +00:00
Sanjay Patel
4a751dd645 [InstCombine] add/move tests for icmp with add operand; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371988 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:05:19 +00:00
James Henderson
5544011439 [docs][llvm-strings] Write llvm-strings documentation
Previously we only had a stub document.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D67554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371984 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 13:56:12 +00:00
James Henderson
bc9f059dd8 [docs][llvm-size] Write llvm-size documentation
Previously we only had a stub document.

Reviewed by: serge-sans-paille, MaskRay

Differential Revision: https://reviews.llvm.org/D67555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371983 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 13:20:37 +00:00
David Green
437310b30f [ARM] Fold VCMP into VPT
MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.

There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.

Differential Revision: https://reviews.llvm.org/D66577


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371982 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 13:02:41 +00:00
Sanjay Patel
cba3b2c1c3 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371981 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:54:34 +00:00
Sanjay Patel
8624cc4ae4 [InstCombine] add icmp tests with extra uses; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371979 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:19:18 +00:00
Sanjay Patel
1a3c0ac4ea [InstCombine] fix comments to match code; NFC
This blob was written before match() existed, so it
could probably be reduced significantly.

But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371978 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:12:05 +00:00
Nico Weber
bd50d31a90 gn build: Merge r371976
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371977 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 11:33:54 +00:00
Simon Pilgrim
f4c41d6cba [VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.
The static analyzer is warning about a potential null dereference of the cast_or_null result. I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371975 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 11:22:44 +00:00
Simon Pilgrim
5a41182176 [SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference warning. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371974 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 10:48:16 +00:00
Simon Pilgrim
c46379da2c [SLPVectorizer] Don't dereference a dyn_cast result. NFCI.
The static analyzer is warning about potential null dereferences of dyn_cast<> results. In these cases we can safely use cast<> directly, as we know these values should all be of the correct type (which is why it's working at the moment), and cast<> will assert if they aren't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371973 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 10:35:09 +00:00
Sjoerd Meijer
2eb46d74d2 Added return statement to fix compile and build warning:
llvm-rtdyld.cpp:966:7: warning: variable ‘Result’ set but not used

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371972 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 10:30:37 +00:00
Kerry McLaughlin
48d00babac [SVE][Inline-Asm] Add constraints for SVE predicate registers
Summary:
Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin

Reviewed By: rovka

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371967 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 09:45:27 +00:00
Nico Weber
911a837b3d gn build: Merge r371965
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371966 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 09:43:26 +00:00
Nico Weber
7020cdb536 gn build: Merge r371959
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371961 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 07:34:23 +00:00
Sjoerd Meijer
71dc0b23c7 [AArch64] Some more FP16 FMA pattern matching
After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we
were still missing a few FP16 FMA patterns.

Differential Revision: https://reviews.llvm.org/D67576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371960 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 07:32:13 +00:00
Jonas Paulsson
0af6488be2 [SystemZ] Merge the SystemZExpandPseudo pass into SystemZPostRewrite.
SystemZExpandPseudo's only job was to expand LOCRMux instructions into jump
sequences. This needs to be done if expandLOCRPseudo() or expandSELRPseudo()
fails to find a legal opcode (all registers "high" or "low"). This task has
now been moved to SystemZPostRewrite while removing the SystemZExpandPseudo
pass.

It is in fact preferred to expand these pseudos directly after register
allocation in SystemZPostRewrite since the hinted register combinations are
then not subject to later optimizations.

Review: Ulrich Weigand
https://reviews.llvm.org/D67432

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371959 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 07:29:37 +00:00
Matt Arsenault
7511280318 AMDGPU/GlobalISel: Remove illegal select tests
These fail in a release build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371955 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 04:21:10 +00:00
Matt Arsenault
4cfc531c99 AMDGPU/GlobalISel: Select SMRD loads for more types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371954 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:54:07 +00:00
Matt Arsenault
796815a826 AMDGPU/GlobalISel: RegBankSelect for kill
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371953 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:48:37 +00:00
Matt Arsenault
54dea4c5ee AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371952 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:37:10 +00:00
Matt Arsenault
8917673a2f AMDGPU/GlobalISel: Set type on vgpr live in special arguments
Fixes assertion with workitem ID intrinsics used in non-kernel
functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371951 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:33:00 +00:00
Matt Arsenault
83c97ac441 AMDGPU/GlobalISel: Select S16->S32 fptoint
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371950 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:32:56 +00:00
Matt Arsenault
cadee98a6c AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371949 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:29:12 +00:00
Matt Arsenault
832d1e1170 AMDGPU/GlobalISel: Fix VALU s16 fneg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371948 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 00:20:54 +00:00
Stefan Stipanovic
e342d834af [Attributor] Heap-To-Stack Conversion
D53362 gives a prototype heap-to-stack conversion pass. With the addition of new attributes in the Attributor, this can now be revisited and improved. This will place it in the Attributor to make it easier to use new attributes (e.g. nofree, nosync, willreturn, etc.) and other Attributor features.

Reviewers: jdoerfert, uenoku, hfinkel, efriedma

Subscribers: lebedev.ri, xbolva00, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D65408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371942 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 21:47:41 +00:00
Sanjay Patel
3050f70ce7 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371940 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 20:56:34 +00:00
Sanjay Patel
81ce279f2b [InstCombine] add icmp tests with extra uses; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371939 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 20:13:27 +00:00