193574 Commits

Author SHA1 Message Date
Scott Linder
a70016c8d5 [AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions
Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in
the entry function prologue. This allows us to remove the scratch wave
offset register from the calling convention ABI.

As part of this change, allow the use of an inline constant zero for the
SOffset of MUBUF instructions accessing the stack in entry functions
when a frame pointer is not requested/required. Entry functions with
calls still need to set up the calling convention ABI stack pointer
register, and reference it in order to address arguments of called
functions. The ABI stack pointer register remains unswizzled, but is now
wave-relative instead of queue-relative.

Non-entry functions also use an inline constant zero SOffset for
wave-relative scratch access, but continue to use the stack and frame
pointers as before. When the stack or frame pointer is converted to a
swizzled offset it is now scaled directly, as the scratch wave offset no
longer needs to be subtracted first.

Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling
convention.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75138
2020-03-19 15:35:16 -04:00
Scott Linder
00b0d3e619 [AMDGPU][NFC] Refactor some uses of unsigned to Register
Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76035
2020-03-19 15:35:16 -04:00
Scott Linder
9e034896d6 [AMDGPU][NFC] Refactor emitEntryFunctionPrologue
Remove dead code and factor repeated conditions out into a single check.
Rename and move code to make it more obvious what is running only for
entry functions. Simplify function arguments to make it clearer what the
relevant inputs are. Make flat scratch init accept an MBB iterator and
move it to where it was logically being emitted within the prologue.

These changes will make a future update to the calling convention
simpler.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75092
2020-03-19 15:35:16 -04:00
Florian Hahn
3de3635359 [Matrix] Hoist load/store generation logic, add helpers for tiled access.
This patch slightly generalizes the code to emit loads and stores of a
matrix and adds helpers to load/store a tile of a larger matrix.

This will be used in a follow-up patch introducing initial tiling.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D75564
2020-03-19 19:28:21 +00:00
Simon Pilgrim
4ca6edb997 [InstCombine][X86] Tests for variable but in-range vector-by-scalar shift amounts (PR40391)
These shifts are masked to be in range, so we should be able to replace them with generic shifts.
2020-03-19 19:24:55 +00:00
Simon Pilgrim
3e6dfd9a5a [InstCombine][X86] simplifyX86immShift - handle variable out-of-range vector shift by immediate amounts (PR40391)
If we know the SSE shift amount is out of range then we can simplify to zero value (logical) or a 'signsplat' bitwidth-1 shift (arithmetic). This allows us to remove the equivalent ConstantInt constant folding path from simplifyX86immShift.
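
A minimal sketch of the per-lane behaviour described above, assuming unsigned lane values and a hypothetical helper name (`x86_shift_lane` is not an LLVM function). It only models the arithmetic: logical shifts by an out-of-range amount give zero, and arithmetic shifts clamp to bitwidth-1, which splats the sign bit.

```python
def x86_shift_lane(value, amount, bits, kind):
    """Model one lane of an SSE-style vector shift (hypothetical helper)."""
    mask = (1 << bits) - 1
    value &= mask
    if kind == "shl":
        # Out-of-range logical shifts produce zero.
        return (value << amount) & mask if amount < bits else 0
    if kind == "lshr":
        return value >> amount if amount < bits else 0
    if kind == "ashr":
        # Out-of-range arithmetic shifts act like a shift by bits - 1.
        amount = min(amount, bits - 1)
        signed = value - (1 << bits) if value >> (bits - 1) else value
        return (signed >> amount) & mask
    raise ValueError(f"unknown shift kind: {kind}")
```

For a 32-bit lane and a shift amount of 40, the logical cases fold to 0 while the arithmetic case yields all ones for a negative input and 0 for a non-negative one.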
2020-03-19 18:27:31 +00:00
Cameron McInally
b07cda4b10 [AArch64][SVE] Add support for DestructiveBinaryImm DestructiveInstType
Support prefixing destructive operations, with the MOVPRFX instruction, to build constructive operations.

Differential Revision: https://reviews.llvm.org/D75064
2020-03-19 13:11:46 -05:00
Lang Hames
dac63bbcfc [ORC] Don't use a platform mutex for LLJIT's GenericLLVMIRPlatformSupport class.
Along the same lines as eb918d8daf1: This code also had to acquire the session
mutex, and this could cause a deadlock under the wrong circumstances. This
patch updates GenericLLVMIRPlatformSupport to just use the session lock for
everything.
2020-03-19 11:03:34 -07:00
Lang Hames
2945504341 [ORC] Fix indentation in debugging output. 2020-03-19 11:02:56 -07:00
Lang Hames
f62b9d12ef [ORC] Use finer-grained and session locking in MachOPlatform to avoid deadlock.
In MachOPlatform, obtaining the link-order for a JITDylib requires locking the
session, but also needs to be part of a larger atomic operation that collates
initializer symbols tracked by the platform. Trying to do this under a separate
platform mutex leads to potential locking order issues, e.g.

T1 locks session then tries to lock platform to register a new init symbol
meanwhile
T2 locks platform then tries to lock session to obtain link order.

Removing the platform lock and performing all these operations under the session
lock eliminates this possibility.

At the same time we also need to collate init pointers from the
MachOPlatform::InitScraperPlugin, and we don't need or want to lock the session
for that. The new InitSeqMutex has been added to guard these init pointers, and
the session mutex is never obtained while the InitSeqMutex is held.
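
The locking scheme can be sketched roughly as follows. This is an illustration in Python, not the ORC API: the `Platform` class and its method and field names are hypothetical. The point is that link order and init symbols share one lock, so there is no second lock whose order could be inverted, and the init-pointer lock is never held while taking the session lock.

```python
import threading

class Platform:
    """Hypothetical stand-in for the locking scheme, not the ORC classes."""

    def __init__(self):
        self.session_lock = threading.Lock()   # guards link order and init symbols
        self.init_seq_lock = threading.Lock()  # guards scraped init pointers only
        self.init_symbols = {}
        self.init_pointers = {}

    def register_init_symbol(self, dylib, symbol):
        with self.session_lock:
            self.init_symbols.setdefault(dylib, []).append(symbol)

    def collect_initializers(self, link_order):
        # One atomic step under the session lock: walk the link order and
        # collate init symbols without acquiring any other lock.
        with self.session_lock:
            return [s for d in link_order for s in self.init_symbols.get(d, [])]

    def record_init_pointer(self, dylib, pointer):
        # The session lock is never acquired while init_seq_lock is held.
        with self.init_seq_lock:
            self.init_pointers.setdefault(dylib, []).append(pointer)
```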
2020-03-19 11:02:56 -07:00
Lang Hames
ea57541031 [ORC] Don't waste time building empty replacement MaterializationUnits. 2020-03-19 11:02:56 -07:00
Lang Hames
91189ff985 [ORC] Bail out early if a replacement MaterializationUnit is empty.
The MU may define no symbols, but still contain a non-trivial destructor (e.g.
an LLVM IR module that has been stripped of all externally visible
definitions, but which still needs to lock its context to be destroyed).
Bailing out early ensures that we destroy the unit outside the session lock,
rather than under it which may cause deadlocks.

Also adds some extra sanity-checking assertions.
2020-03-19 11:02:56 -07:00
Sanjay Patel
5c8fcd9cfc [SDAG] reduce code duplication in getNegatedExpression(); NFCI 2020-03-19 13:55:15 -04:00
Craig Topper
a63f955255 [X86] Attempt to more accurately model the cost of a bool reduction of wide vector type.
Previously we multiplied the cost for the table entries by the number of splits needed. But that implies that each split goes through a reduction to scalar independently. I think what really happens is that we AND/OR the split pieces until we're down to a single value with a legal type and then do a special reduction sequence on that.

So to model that, this patch takes the number of splits minus one multiplied by the cost of an AND/OR at the legal element count and adds that on top of the table lookup.
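
As a rough illustration of the two formulas, with hypothetical function names and made-up cost values (they are not taken from the X86 cost tables):

```python
def bool_reduction_cost_old(table_cost, num_splits):
    # Old model: every split piece pays the full reduction-to-scalar cost.
    return table_cost * num_splits

def bool_reduction_cost_new(table_cost, num_splits, and_or_cost):
    # New model: combine the pieces with AND/OR first, then reduce once.
    return (num_splits - 1) * and_or_cost + table_cost

# Illustrative numbers only: 4 splits, table cost 10, AND/OR cost 1.
old = bool_reduction_cost_old(10, 4)
new = bool_reduction_cost_new(10, 4, 1)
```

With these example inputs the old model gives 40 and the new one gives 13; when no split is needed both models return the table cost unchanged.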

Differential Revision: https://reviews.llvm.org/D76400
2020-03-19 09:31:05 -07:00
Vedant Kumar
af2c299da6 [test] Re-enable accidentally disabled X86 tests
A number of X86 tests were accidentally disabled in
https://reviews.llvm.org/D73568. This commit re-enables those tests.

```
$ for x86_test in $(gg 'REQUIRES: x86$' llvm/test | fst); do sed -i "" '/REQUIRES: x86/d' $x86_test; done
```

(Note that 'x86' is not an available feature; that's what caused the
tests to be disabled.)
2020-03-19 09:29:23 -07:00
Sam Parker
782943a635 [NFC][ARM] Fix for buildbots
Update broken test.
2020-03-19 15:50:13 +00:00
Simon Pilgrim
b65b8d7149 [InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by immediate amounts to generic shifts (PR40391)
The slli/srli/srai 'immediate' vector shifts (although it's not immediate anymore to match gcc) can be replaced with generic shifts if the shift amount is known to be in range.
2020-03-19 15:44:24 +00:00
Sam Parker
0d946d0b6f [NFC][ARM] Add two tests
Add tests for v8m indvar simplify.
2020-03-19 15:18:33 +00:00
Sean Fertile
ec983b6310 [PowerPC][AIX] Simplify the check prefixes in the ByVal lit tests. [NFC] 2020-03-19 10:59:48 -04:00
Georgii Rymar
ec5f8fac65 [obj2yaml][test] - Update test after output change.
D76227 changed the output. This test was forgotten because
it belonged to a different patch.
2020-03-19 17:42:36 +03:00
Georgii Rymar
a5f9503a48 [obj2yaml] - SHT_DYNAMIC and SHT_REL* sections: stop dumping sh_entsize field when it has the default value.
Currently obj2yaml always emits the `EntSize` property when `sh_entsize != 0`.
It is not correct. For example, for `SHT_DYNAMIC` section, `EntSize == 0`
is abnormal, while `sizeof(ELFT::Dyn)` is the expected default.

To reduce the output produced, we should not dump default values.

yaml2obj tests that show the `sh_entsize` values produced are:
1) For `SHT_REL*` sections: `yaml2obj\ELF\reloc-sec-entry-size.yaml`
2) For `SHT_DYNAMIC`: `yaml2obj\ELF\dynamic-section.yaml`
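
A small sketch of the rule, assuming ELF64 entry sizes (16 bytes for `Elf64_Dyn` and `Elf64_Rel`, 24 bytes for `Elf64_Rela`). The `dump_section` helper and the dictionary are hypothetical and only model the decision of when to print `EntSize`; they are not obj2yaml code.

```python
# Expected default sh_entsize per section type, assuming a 64-bit ELF object.
DEFAULT_ENTSIZE = {"SHT_DYNAMIC": 16, "SHT_REL": 16, "SHT_RELA": 24}

def dump_section(name, sh_type, sh_entsize):
    """Return the YAML-like mapping, emitting EntSize only when non-default."""
    section = {"Name": name, "Type": sh_type}
    # Section types without a special default keep the old rule (default 0).
    default = DEFAULT_ENTSIZE.get(sh_type, 0)
    if sh_entsize != default:
        section["EntSize"] = sh_entsize
    return section
```

Under this rule a `SHT_DYNAMIC` section with `sh_entsize == 16` is dumped without `EntSize`, while the abnormal value 0 is dumped explicitly.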

Differential revision: https://reviews.llvm.org/D76227
2020-03-19 17:25:53 +03:00
Georgii Rymar
6ed9b0469a [obj2yaml] - SHT_REL*, SHT_DYNAMIC sections: add tests to document the behavior when sh_entsize is broken.
We do not have tests that show the current behavior.
It is needed for D76227 which changes the logic of dumping of `EntSize` fields.

Differential revision: https://reviews.llvm.org/D76282
2020-03-19 16:43:40 +03:00
Kamau Bridgeman
a5325c666a Test commit. 2020-03-19 08:34:48 -05:00
Piotr Sobczak
bfea4da05b [NFC] Simplify test
Remove extra preheader block as there is no value in keeping it.
2020-03-19 14:29:57 +01:00
Simon Pilgrim
802df63b38 [InstCombine][X86] Tests for variable but in-range vector-by-scalar shift amounts (PR40391)
These shifts are masked to be in range, so we should be able to replace them with generic shifts.
2020-03-19 13:11:06 +00:00
Andrew Ng
f6ce5a32bc [Support] Improve Windows widenPath and add support for long UNC paths
Check the path length limit against the length of the UTF-16 version of
the input rather than the UTF-8 equivalent, as the UTF-16 length may be
shorter. Move widenPath from the llvm::sys::path namespace in Path.h to
the llvm::sys::windows namespace in WindowsSupport.h. Only use the
reduced path length limit for create directory. Canonicalize using
sys::path::remove_dots().
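
To see why the UTF-16 length is the right one to check, here is a small sketch. The helper names are hypothetical, the limit of 260 is the Windows `MAX_PATH` value, and the exact comparison used by widenPath is an assumption here.

```python
MAX_PATH = 260  # Windows MAX_PATH, counted in UTF-16 code units

def utf16_units(path):
    """Number of UTF-16 code units in the path."""
    return len(path.encode("utf-16-le")) // 2

def exceeds_limit(path, limit=MAX_PATH):
    # Assumption: the path is too long once it reaches the limit.
    return utf16_units(path) >= limit

# 200 accented characters: 2 bytes each in UTF-8, 1 code unit each in UTF-16.
path = "C:\\" + "\u00e9" * 200
utf8_len = len(path.encode("utf-8"))
utf16_len = utf16_units(path)
```

For this path the UTF-8 length is 403 bytes but the UTF-16 length is only 203 code units, so a check against the UTF-8 length would wrongly treat it as too long.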

Differential Revision: https://reviews.llvm.org/D75372
2020-03-19 13:00:21 +00:00
Djordje Todorovic
39a83a112e Reland D73534: [DebugInfo] Enable the debug entry values feature by default
The issue that was causing the build failures was fixed with D76164.
2020-03-19 13:57:30 +01:00
Andrzej Warzynski
e3161d760e [AArch64][SVE] Rename intrinsics for gather prefetch [NFC]
Summary:
In order to keep the names consistent with other SVE gather loads, the
intrinsics for gather prefetch are renamed as follows:
  * @llvm.aarch64.sve.gather.prfb -> @llvm.aarch64.sve.prfb.gather

Reviewed by: fpetrogalli

Differential Revision: https://reviews.llvm.org/D76421
2020-03-19 12:53:36 +00:00
Florian Hahn
e306a84b0e [SCCP] Use constant ranges for PHI nodes.
For PHIs with multiple incoming values, we can improve precision by
using constant ranges for integers. We can over-approximate phis
by merging the incoming values.
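
The merging step can be sketched like this, assuming simple inclusive, non-wrapping intervals written as `(lo, hi)` tuples. LLVM's ConstantRange is half-open and may wrap, so this is a simplified model with hypothetical helper names.

```python
def merge_ranges(a, b):
    """Smallest single interval covering both inclusive intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def phi_range(incoming):
    """Over-approximate a PHI by merging the ranges of all incoming values."""
    result = incoming[0]
    for r in incoming[1:]:
        result = merge_ranges(result, r)
    return result
```

For incoming ranges `(0, 3)` and `(10, 12)` the result is `(0, 12)`: it also covers values such as 5 that never reach the PHI, which is what makes it an over-approximation.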

Reviewers: davide, efriedma, mssimpso

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D71933
2020-03-19 12:45:33 +00:00
Igor Kudrin
9373d5a9bf [llvm-dwp] Start error messages with a lowercase letter.
We usually start error messages with lowercase letters and most of them
in llvm-dwp follow that rule. This patch fixes a few messages that
started with capital letters.

Differential revision: https://reviews.llvm.org/D76277
2020-03-19 19:43:08 +07:00
Simon Pilgrim
a650625f54 [ValueTracking] Add computeKnownBits DemandedElts support to ADD/SUB/MUL instructions (PR36319) 2020-03-19 12:41:29 +00:00
Simon Pilgrim
5e7b2db237 [InstSimplify] Add missing vector ADD+SUB tests to show lack of DemandedElts support 2020-03-19 11:27:27 +00:00
Simon Pilgrim
5ae8a3fb52 [InstSimplify] Add missing vector MUL tests to show lack of DemandedElts support 2020-03-19 11:27:27 +00:00
Cullen Rhodes
f7276b13f4 [ValueTypes] Add support for scalable EVTs
Summary:
* Remove a bunch of asserts checking for unsupported scalable types and
  add some more now that they are supported.
* Propagate the scalable flag where necessary.
* Add another `EVT::getExtendedVectorVT` method that takes an
  ElementCount parameter.
* Add `EVT::isExtendedScalableVector` and
  `EVT::getExtendedVectorElementCount` - latter is currently unused.

Reviewers: sdesmalen, efriedma, rengolin, craig.topper, huntergr

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75672
2020-03-19 11:04:15 +00:00
Georgii Rymar
6b415a778f [obj2yaml] - Stop dumping an empty sh_info field for SHT_RELA/SHT_REL sections.
`.rela.dyn` is a dynamic relocation section that normally has
no value in `sh_info` field.

The existing `elf-reladyn-section-shinfo.yaml` which tests this piece has issues:

1) It does not check the case when we have more than one `SHT_REL[A]`
   section with `sh_info == 0` in the object. Because of this it did not catch the issue.
   Currently we print an excessive "Info" field:

```
  - Name:            .rela.dyn
    Type:            SHT_RELA
    EntSize:         0x0000000000000018
  - Name:            .rel.dyn
    Type:            SHT_REL
    EntSize:         0x0000000000000010
    Info:            ' [1]'
```

2) It seems it can be more generic. I've added a `rel-rela-section.yaml` instead.

Differential revision: https://reviews.llvm.org/D76281
2020-03-19 14:00:21 +03:00
Adrian Kuegel
09d2d2b02c Revert "CFGDiff: Simplify/common the begin/end implementations to use a common range helper"
This reverts commit 79a7ed92a9b135212a6a271dd8dbc625038c8f06.
This breaks the asan buildbot.
2020-03-19 11:25:10 +01:00
Cullen Rhodes
157d5e9a63 [ValueTypes] Add EVT::isFixedLengthVector
Summary:
Related to D75672, this patch adds EVT::isFixedLengthVector to determine
if the underlying vector type is of fixed length.

An assert is introduced in EVT::getVectorNumElements that triggers for
types that aren't fixed length. This is currently guarded by a flag
added in D75297 that is off by default and has been renamed to the more
generic ENABLE_STRICT_FIXED_SIZE_VECTORS.

Ideally we want to get rid of getVectorNumElements but a quick grep
shows there are >350 uses in lib/CodeGen and 75 in lib/Target/AArch64
alone. All of these probably aren't EVT::getVectorNumElements (some may
be the MVT equivalent), but there are many places to fixup and having
the assert on by default would make the SVE upstreaming effort
difficult.

Reviewers: sdesmalen, efriedma, ctetreau, huntergr, rengolin

Reviewed By: efriedma

Subscribers: mgorny, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76376
2020-03-19 10:08:17 +00:00
LLVM GN Syncbot
a00881efbf [gn build] Port 733b3199487 2020-03-19 09:52:27 +00:00
Simon Moll
4405e5770f [VP,Integer,#1] Vector-predicated integer intrinsics
Summary:
This patch adds IR intrinsics for vector-predicated integer arithmetic.

It is subpatch #1 of the [integer
slice](https://reviews.llvm.org/D57504#1732277) of
[LLVM-VP](https://reviews.llvm.org/D57504).  LLVM-VP is a larger effort to bring
native vector predication to LLVM.

Reviewed By: andrew.w.kaylor

Differential Revision: https://reviews.llvm.org/D69891
2020-03-19 10:51:47 +01:00
Florian Hahn
b3d85ce4e1 [SCCP] Use constant ranges for binary operators.
If one of the operands of a binary operator is a constant range, we can
use ConstantRange::binaryOp to approximate the result.

We still handle single element constant ranges as we did previously,
with ConstantExpr::get(), because ConstantRange::binaryOp still gives
worse results in a few cases for single element ranges.

Also note that we bail out early if any of the operands is still unknown.
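
A simplified model of the idea, again using inclusive, non-wrapping `(lo, hi)` intervals rather than LLVM's ConstantRange; the function names are hypothetical and `None` stands in for a still-unknown operand.

```python
def add_ranges(a, b):
    return (a[0] + b[0], a[1] + b[1])

def mul_ranges(a, b):
    # The extremes of a product lie at the corners of the two intervals.
    products = [x * y for x in a for y in b]
    return (min(products), max(products))

def binary_op_range(op, a, b):
    """Approximate the result range, bailing out on unknown operands."""
    if a is None or b is None:
        return None  # at least one operand is still unknown
    if op == "add":
        return add_ranges(a, b)
    if op == "mul":
        return mul_ranges(a, b)
    raise ValueError(f"unsupported operator: {op}")
```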

Reviewers: davide, efriedma, mssimpso

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D71936
2020-03-19 09:35:48 +00:00
Chen Zheng
b875f44290 [Reassociate] add testcases for more than 1 pairs - NFC 2020-03-19 05:21:24 -04:00
Chen Zheng
4ee17cafbe [PowerPC] implement target hook isProfitableToHoist
On PowerPC, fma is faster than fadd + fmul for some types
(PPCTargetLowering::isFMAFasterThanFMulAndFAdd). We should implement the target
hook isProfitableToHoist to prevent the SimplifyCFG pass from breaking the fma
pattern by hoisting fmul to the predecessor block.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D76207
2020-03-19 00:17:25 -04:00
Huihui Zhang
83d6fa0d27 [InstCombine][SVE] Fix InstCombiner::visitAllocaInst for scalable vector.
Summary:
DataLayout::getTypeAllocSize() returns a TypeSize. For cases where the scalable
property doesn't matter (check for zero-sized alloca), we should explicitly
call getKnownMinSize() to avoid implicit type conversion to uint64_t, which is
invalid for scalable vector types.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76386
2020-03-18 20:57:14 -07:00
David Blaikie
e19b51fd35 CFGDiff: Simplify/common the begin/end implementations to use a common range helper
(would be nice to revisit the CFG traits and change them to use ranges
rather than begin/end - if anyone wants to do that refactor)

Also use more auto because writing the names of range utility iterators
isn't helping readability here - they're sort of implementation details
for the most part, especially once you nest a few different filtering
and adapting iterators.
2020-03-18 20:56:11 -07:00
Chen Zheng
a051c821b0 [PowerPC] add IR level isFMAFasterThanFMulAndFAdd - NFC
And also refactor legacy MIR level isFMAFasterThanFMulAndFAdd.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D76265
2020-03-18 23:24:40 -04:00
Craig Topper
c887300cf8 [SelectionDAG] When splitting gathers/scatters in type legalization, set MMO size to UnknownSize
Gather/scatter don't access one memory location, they access multiple disjoint locations. So using a fixed size isn't accurate. But we don't have a way to represent the true behavior so just use UnknownSize.

Previously we "split" the memory VT and used that size for the MMO of each half. But the memory VT is scalar so splitting usually just returned the original scalar VT, but on 32-bit X86 if the scalar VT was i64 it probably returned i32?

Differential Revision: https://reviews.llvm.org/D76388
2020-03-18 16:07:15 -07:00
Sanjay Patel
a66aea4c4a [LangRef] fix typo in select poison explanation; NFC 2020-03-18 18:59:14 -04:00
Louis Dionne
c0d7629008 [lit] Add builtin support for flaky tests in lit
This commit adds a new keyword in lit called ALLOW_RETRIES. This keyword
takes a single integer as an argument, and it allows the test to fail that
number of times before it first succeeds.

This work attempts to make the existing test_retry_attempts more flexible
by allowing by-test customization, as well as eliminate libc++'s FLAKY_TEST
custom logic.
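
The retry semantics described above can be sketched as a small loop. This is an illustration only, not lit's implementation; `run_with_retries` is a hypothetical name, and `test` is any callable returning True on success.

```python
def run_with_retries(test, allow_retries):
    """Return (passed, attempts); the test may fail allow_retries times."""
    # Up to allow_retries failures are tolerated, so at most
    # allow_retries + 1 attempts are made in total.
    for attempt in range(1, allow_retries + 2):
        if test():
            return True, attempt
    return False, allow_retries + 1
```

A test that fails twice and then passes is reported as passing when the allowed retry count is 2, and as failing when it is 1.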

Differential Revision: https://reviews.llvm.org/D76288
2020-03-18 18:04:01 -04:00
Simon Pilgrim
ab2d09da1b [ValueTracking] Add computeKnownBits DemandedElts support to masked add instructions (PR36319) 2020-03-18 21:50:56 +00:00
Florian Hahn
35af785b71 [VPlan] Do not print mapping for Value2VPValue.
The latest improvements to VPValue printing make this mapping clear when
printing the operand. Printing the mapping separately is not required
any longer.

Reviewers: rengolin, hsaito, Ayal, gilr

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D76375
2020-03-18 21:44:07 +00:00