Commit Graph

31684 Commits

Author SHA1 Message Date
James Molloy
27fd3594bf [ModuloSchedule] Peel out prologs and epilogs, generate actual code
Summary:
This extends the PeelingModuloScheduleExpander to generate prolog and epilog code,
and correctly stitch uses through the prolog, kernel, epilog DAG.

The key concept in this patch is to ensure that all transforms are *local*; only a
function of a block and its immediate predecessor and successor. By defining the problem in this way
we can inductively rewrite the entire DAG using only local knowledge that is easy to
reason about.

For example, we assume that all prologs and epilogs are near-perfect clones of the
steady-state kernel. This means that if a block has an instruction that is predicated out,
we can redirect all users of that instruction to that equivalent instruction in our
immediate predecessor. As all blocks are clones, every instruction must have an equivalent in
every other block.

Similarly we can make the assumption by construction that if a value defined in a block is used
outside that block, the only possible user is its immediate successors. We maintain this
even for values that are used outside the loop by creating a limited form of LCSSA.

This code isn't small, but it isn't complex.

Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet;
I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns
that we don't produce. Those still need a bit more investigation. In the meantime we
(Google) are happy with the code produced by this on our downstream SMS implementation,
and believe it generates correct code.

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373462 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 12:46:44 +00:00
Hans Wennborg
e1e678465b Revert r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)"
This broke http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19967

> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373454 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 12:08:44 +00:00
David Green
2e3f961efc [ARM] Identity shuffles are legal
Identity shuffles, of the form (0, 1, 2, 3, ...) are perfectly OK under MVE
(they essentially just become bitcasts). We were not catching that in the
existing set of what we considered legal though. On NEON, they would be covered
by vext's, but that is not generally available in MVE.

This uses ShuffleVectorInst::isIdentityMask which is a little odd to use here
but does what we want and prevents us from just rewriting what is the same
function.
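
As an illustration only, the check described above can be sketched in Python. This is not the LLVM implementation: the name `is_identity_mask` and the use of -1 for an undefined lane are assumptions of this sketch, and it covers only the first-source form (0, 1, 2, 3, ...) mentioned in the message.

```python
def is_identity_mask(mask):
    """Return True if every defined lane i selects element i of the source."""
    for lane, index in enumerate(mask):
        if index == -1:
            # Assumption: -1 marks an undefined lane, which matches anything.
            continue
        if index != lane:
            return False
    return True
```

For example, `is_identity_mask([0, 1, 2, 3])` holds, while `[1, 0, 2, 3]` swaps two lanes and is rejected.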

Differential Revision: https://reviews.llvm.org/D68241


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373446 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 11:40:51 +00:00
Hans Wennborg
1256340bcb Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
This is modeled after the same functionality for jump tables, which was
added in r357067.

Differential revision: https://reviews.llvm.org/D68131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373431 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 08:32:15 +00:00
Craig Topper
2d10106e0c [X86] Add broadcast load folding patterns to the NoVLX compare patterns.
These patterns use zmm registers for 128/256-bit compares when
the VLX instructions aren't available. Previously we only
supported registers, but as PR36191 notes we can fold broadcast
loads, but not regular loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373423 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 04:45:02 +00:00
Matt Arsenault
65c0907e68 AMDGPU/GlobalISel: Assume VGPR for G_FRAME_INDEX
In principle this should behave as any other constant. However
eliminateFrameIndex currently assumes a VALU use and uses a vector
shift. Work around this by selecting to VGPR for now until
eliminateFrameIndex is fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373415 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 01:02:24 +00:00
Matt Arsenault
addd7f5bb6 AMDGPU/GlobalISel: Private loads always use VGPRs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373414 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 01:02:21 +00:00
Matt Arsenault
8899bbd753 AMDGPU/GlobalISel: Legalize 1024-bit G_BUILD_VECTOR
This will be needed to support AGPR operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373413 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 01:02:18 +00:00
Matt Arsenault
db73d795b0 AMDGPU/GlobalISel: Fix RegBankSelect for 1024-bit values
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373412 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 01:02:14 +00:00
Stanislav Mekhanoshin
e2995cfed4 [AMDGPU] separate accounting for agprs
Account and report agprs separately on gfx908. Other targets
do not change the reporting.

Differential Revision: https://reviews.llvm.org/D68307

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373411 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 00:26:58 +00:00
Craig Topper
8513fb6993 [X86] Add a DAG combine to shrink vXi64 gather/scatter indices that are constant with sufficient sign bits to fit in vXi32
The gather/scatter instructions can implicitly sign extend the indices. If we're operating on 32-bit data, a v16i64 index can force a v16i32 gather to be split in two since the index needs 2 registers. If we can shrink the index to i32 we can avoid the split. It should always be safe to shrink the index regardless of the number of elements. We have gather/scatter instructions that can use a v2i32 index stored in a v4i32 register with v2i64 data size.

I've limited this to before legalize types to avoid creating a v2i32 after type legalization. We could check for it, but we'd also need testing. I'm also only handling build_vectors with no bitcasts to be sure the truncate will constant fold.
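
A rough Python sketch of the "sufficient sign bits" condition, assuming it amounts to each constant i64 index surviving truncation to i32 followed by sign extension. The helper names `sign_extend` and `can_shrink_indices` are hypothetical and are not taken from the patch.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def can_shrink_indices(indices):
    """True if every constant index has the same value as i64 and as i32."""
    return all(sign_extend(i, 32) == sign_extend(i, 64) for i in indices)
```

An index vector such as `[0, 4, -8]` qualifies, whereas any vector containing `2**31` does not, because that value changes sign when truncated to 32 bits.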

Differential Revision: https://reviews.llvm.org/D68247

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373408 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 23:18:31 +00:00
Changpeng Fang
5165c118c2 AMDGPU: Fix an out of date assert in addressing FrameIndex
Reviewers:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D67574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373404 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 23:07:14 +00:00
Craig Topper
611bf0527c Revert r373172 "[X86] Add custom isel logic to match VPTERNLOG from 2 logic ops."
This seems to be causing some performance regressions that I'm
trying to investigate.

One thing that stands out is that this transform can increase
the live range of the operands of the earlier logic op. This
can be bad for register allocation. If there are two logic
op inputs we should really combine the one that is closest, but
SelectionDAG doesn't have a good way to do that. Maybe we need
to do this as a basic block transform in Machine IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373401 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 22:40:03 +00:00
Craig Topper
e9fa1c7882 [X86] convertToThreeAddress, make sure second operand of SUB32ri is really an immediate before calling getImm().
It might be a symbol instead. We can't fold those since we can't
negate them.

Similar for other SUB with immediates.

Fixes PR43529.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373397 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 21:55:55 +00:00
Sanjay Patel
e5b9a43eaa [BypassSlowDivision][CodeGenPrepare] avoid crashing on unused code (PR43514)
https://bugs.llvm.org/show_bug.cgi?id=43514

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373394 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 21:25:36 +00:00
Jakub Kuderski
51ef81581b Add a missing pass in ARM O3 pipeline
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373382 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 18:53:54 +00:00
Jakub Kuderski
f2bab36046 [Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in MachineLICM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373378 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 18:27:44 +00:00
David Green
92d26cbe33 [ARM] Some MVE shuffle plus extend tests. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373368 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 18:04:02 +00:00
Matt Arsenault
8ec8c66e71 AMDGPU/GlobalISel: Increase max legal size to 1024
There are 1024 bit register classes defined for AGPRs. Additionally
OpenCL defines vectors up to 16 x i64, and this helps those tests
legalize.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373350 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 16:35:06 +00:00
Craig Topper
61fe4ef60e [X86] Add a VBROADCAST_LOAD ISD opcode representing a scalar load broadcasted to a vector.
Summary:
This adds the ISD opcode and a DAG combine to create it. There are
probably some places where we can directly create it, but I'll
leave that for future work.

This updates all of the isel patterns to look for this new node.
I had to add a few additional isel patterns for aligned extloads
which we should probably fix with a DAG combine or something. This
does mean that the broadcast load folding for avx512 can no
longer match a broadcasted aligned extload.

There's still some work to do here for combining a broadcast of
a broadcast_load. We also need to improve extractelement or
demanded vector elements of a broadcast_load. I'll try to get
those done before I submit this patch.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373349 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 16:28:20 +00:00
Jakub Kuderski
dbd92ddd52 [Dominators][CodeGen] Add MachinePostDominatorTree verification
Summary:
This patch implements Machine PostDominator Tree verification and ensures that the verification doesn't fail the in-tree tests.

MPDT verification can be enabled using `verify-machine-dom-info` -- the same flag used by Machine Dominator Tree verification.

Flipping the flag revealed that MachineSink falsely claimed to preserve CFG and MDT/MPDT. This patch fixes that.

Reviewers: arsenm, hliao, rampitec, vpykhtin, grosser

Reviewed By: hliao

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68235

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373341 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 15:23:27 +00:00
Sam Parker
694d0b5821 [NFC][ARM][MVE] More tests
Add some tail predication tests with fast math.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373331 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 13:02:14 +00:00
Dmitri Gribenko
30e86caa76 Revert "GlobalISel: Handle llvm.read_register"
This reverts commit r373294. It broke Clang's
CodeGen/arm64-microsoft-status-reg.cpp:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373310 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 08:24:01 +00:00
Craig Topper
f7e6d9fc31 [X86] Consider isCodeGenOnly in the EVEX2VEX pass to make VMAXPD/PS map to the non-commutable VEX instruction. Use EVEX2VEX override to fix the scalar instructions.
Previously the match was ambiguous and VMAXPS/PD and VMAXCPS/PD
were mapped to the same VEX instruction. But we should keep
the commutability when changing the opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373303 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 07:10:09 +00:00
Heejin Ahn
9aae96d050 [WebAssembly] Make sure EH pads are preferred in sorting
Summary:
In CFGSort, we try to make EH pads have higher priorities as soon as
they are ready to be sorted, to prevent creation of unwind destination
mismatches in CFGStackify. We did that by making priority queues'
comparison function prefer EH pads, but it was possible for an EH pad
to be popped from `Preferred` queue and then not sorted immediately and
enter `Ready` queue instead in a certain condition. This patch makes
sure that special condition does not consider EH pads as its candidates.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373302 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 06:53:28 +00:00
Heejin Ahn
d103507e82 [WebAssembly] Unstackify regs after fixing unwinding mismatches
Summary:
Fixing unwind mismatches for exception handling can result in splitting
existing BBs and moving some instructions to new BBs. In this case
some of the stackified def registers in the original BB can be used in the
split BB. For example, we have this BB and suppose %r0 is a stackified
register.
```
bb.1:
  %r0 = call @foo
  ... use %r0 ...
```

After fixing unwind mismatches in CFGStackify, `bb.1` can be split and
some instructions can be moved to a newly created BB:
```
bb.1:
  %r0 = call @foo

bb.split (new):
  ... use %r0 ...
```

In this case we should make %r0 un-stackified, because its use is now in
another BB.

When splitting a BB, this CL unstackifies all def registers that have
uses in the new split BB.
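
The selection rule can be sketched in Python under a simplified model, where a block is a list of `(defs, uses)` pairs of register names. The function `registers_to_unstackify` and this block representation are assumptions made for illustration, not code from the patch.

```python
def registers_to_unstackify(instructions, split_at, stackified):
    """Return stackified registers defined before `split_at` and used after it.

    `instructions` is a list of (defs, uses) pairs; instructions from index
    `split_at` onward are the ones moved into the new split block.
    """
    defined_before = set()
    for defs, _uses in instructions[:split_at]:
        defined_before.update(defs)

    used_after = set()
    for _defs, uses in instructions[split_at:]:
        used_after.update(uses)

    # Only registers that are currently stackified need to be changed.
    return defined_before & used_after & set(stackified)
```

With the `bb.1` example above modelled as `[(["%r0"], []), ([], ["%r0"])]` and a split at index 1, the result is `{"%r0"}`.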

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373301 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 06:21:53 +00:00
Matt Arsenault
1346fe80b5 AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373298 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 02:23:20 +00:00
Matt Arsenault
cbd0775331 AMDGPU/GlobalISel: Add support for init.exec intrinsics
The existing wave32 behavior seems broken and incomplete, but this
reproduces it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373296 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 02:07:25 +00:00
Matt Arsenault
af5c54f584 GlobalISel: Handle llvm.read_register
SelectionDAG has a bunch of machinery to defer this to selection time
for some reason. Just directly emit a copy during IRTranslator. The
x86 usage does somewhat questionably check hasFP, which could depend
on the whole function being at minimum translated.

This does lose the convergent bit if the callsite had it, which may be
a problem. We also lose that in general for intrinsics, which may also
be a problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373294 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 02:07:16 +00:00
Matt Arsenault
900bd7250e AMDGPU/GlobalISel: Avoid creating shift of 0 in arg lowering
This is sort of papering over the fact that we don't run a combiner
anywhere, but avoiding creating 2 instructions in the first place is
easy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373293 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 01:44:46 +00:00
Craig Topper
d1e95f6529 [X86] Add test case to show missed opportunity to shrink a constant index to a gather in order to avoid splitting.
Also add a test case for an index that could be shrunk, but
would create a narrow type. We can go ahead and do it; we just
need to do so before type legalization.

Similar test cases for scatter as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373290 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 01:27:52 +00:00
Matt Arsenault
1c6b68965a AMDGPU/GlobalISel: Select G_UADDO/G_USUBO
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373288 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 01:23:13 +00:00
Matt Arsenault
fa3f3e76a0 GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources
Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373287 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 01:06:48 +00:00
Matt Arsenault
c58403f341 AMDGPU/GlobalISel: Legalize G_GLOBAL_VALUE
Handle other cases besides LDS. Mostly a straight port of the existing
handling, without the intermediate custom nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373286 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 01:06:43 +00:00
Amaury Sechet
b087c8244d Add partial bswap test to the X86 backend. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373271 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 22:52:28 +00:00
Craig Topper
f187c1e6d6 [X86] Mask off upper bits of splat element in LowerBUILD_VECTORvXi1 when forming a SELECT.
The i1 scalar would have been type legalized to i8, but that
doesn't guarantee anything about the upper bits. If we're going
to use it as condition we need to make sure the upper bits are 0.

I've special cased ISD::SETCC conditions since that should
guarantee zero upper bits. We could go further and use
computeKnownBits, but we have no tests that would need that.
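
A toy Python model of the masking step, assuming the legalized i1 is held in an 8-bit integer. The function `to_select_condition` and its `from_setcc` flag are hypothetical names for this sketch only.

```python
def to_select_condition(legalized_i8, from_setcc=False):
    """Return a condition value whose upper seven bits are zero."""
    if from_setcc:
        # Assumption taken from the message: a SETCC result already has
        # zero upper bits, so no masking is needed.
        return legalized_i8
    # Otherwise only bit 0 is meaningful; clear everything else.
    return legalized_i8 & 1
```

For instance, an i8 holding `0xFE` represents a false i1, and masking yields 0 rather than a nonzero condition.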

Fixes PR43507.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373246 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 18:43:44 +00:00
Craig Topper
80bba055ba [X86] Add ANY_EXTEND to switch in ReplaceNodeResults, but just fall back to default handling.
ANY_EXTEND of v8i8 is marked Custom on AVX512 for handling extends
from v8i8. But the type legalization infrastructure will call
ReplaceNodeResults for v8i8 results. We should just defer it to the
default handling instead of asserting in the default of the switch.

Fixes PR43509.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373234 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 17:14:22 +00:00
Kerry McLaughlin
fe0144dd77 [AArch64][SVE] Implement punpk[hi|lo] intrinsics
Summary:
Adds the following two intrinsics:
  - int_aarch64_sve_punpkhi
  - int_aarch64_sve_punpklo

This patch also contains a fix which allows LLVMHalfElementsVectorType
to forward reference overloadable arguments.

Reviewers: sdesmalen, rovka, rengolin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, greened, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373232 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 17:10:21 +00:00
Jessica Paquette
b40beba15e [AArch64][GlobalISel] Support lowering variadic musttail calls
This adds support for lowering variadic musttail calls. To do this, we have
to...

- Detect a musttail call in a variadic function before attempting to lower the
  call's formal arguments. This is done in the IRTranslator.
- Compute forwarded registers in `lowerFormalArguments`, and add copies for
  those registers.
- Restore the forwarded registers in `lowerTailCall`.

Because there doesn't seem to be any nice way to wrap these up into the outgoing
argument handler, the restore code in `lowerTailCall` is done separately.

Also, irritatingly, you have to make sure that the registers don't overlap with
any passed parameters. Otherwise, the scheduler doesn't know what to do with the
extra copies and asserts.

Add call-translator-variadic-musttail.ll to test this. This is pretty much the
same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to
base this off of, but the idea is the same.

Differential Revision: https://reviews.llvm.org/D68043

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373226 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 16:49:13 +00:00
Amaury Sechet
adac506267 Add tests for rotate with demanded bits. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373223 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 16:26:09 +00:00
Alexander Timofeev
864a620db8 [AMDGPU] SIFoldOperands should not fold register across the EXEC definition
Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D67662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373221 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 15:31:17 +00:00
Paul Robinson
3315a1eeb2 [SSP] [3/3] cmpxchg and addrspacecast instructions can now
trigger stack protectors.  Fixes PR42238.

Add test coverage for llvm.memset, as proxy for all llvm.mem*
intrinsics. There are two issues here: (1) they could be lowered to a
libc call, which could be intercepted, and do Bad Stuff; (2) with a
non-constant size, they could overwrite the current stack frame.

The test was mostly written by Matt Arsenault in r363169, which was
later reverted; I tweaked what he had and added the llvm.memset part.

Differential Revision: https://reviews.llvm.org/D67845

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373220 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 15:11:23 +00:00
Paul Robinson
4d6c2abfdd [SSP] [1/3] Revert "StackProtector: Use PointerMayBeCaptured"
"Captured" and "relevant to Stack Protector" are not the same thing.

This reverts commit f29366b1f594f48465c5a2754bcffac6d70fd0b1.
aka r363169.

Differential Revision: https://reviews.llvm.org/D67842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373216 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 15:01:35 +00:00
Sam Parker
ac0f4f5b2f [NFC][ARM][MVE] More tests
Add some loop tests that cover different float operations and types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373192 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 08:49:42 +00:00
Hans Wennborg
f608f8a7c1 Pre-commit a test case for PR43129.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373190 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 08:47:46 +00:00
Sam Parker
132df1d1cd [ARM][MVE] Change VCTP operand
The VCTP instruction will calculate the predicate mask based upon
the number of elements that need to be processed. I had inserted the
sub before the vctp intrinsic and supplied it as the operand, but
this is incorrect as the phi should directly feed the vctp. The sub
is calculating the value for the next iteration.
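
The ordering can be illustrated with a small Python model of a tail-predicated loop. This is only a sketch: `vctp` here is a stand-in that activates the first `min(remaining, lanes)` lanes, and `tail_predicated_sum` is a hypothetical loop written for this example.

```python
def vctp(remaining, lanes):
    """Predicate mask with the first min(remaining, lanes) lanes active."""
    active = max(0, min(remaining, lanes))
    return [lane < active for lane in range(lanes)]


def tail_predicated_sum(data, lanes=4):
    """Sum `data` in vector-sized steps, masking off lanes past the end."""
    total = 0
    base = 0
    remaining = len(data)  # the value carried by the loop phi
    while remaining > 0:
        mask = vctp(remaining, lanes)  # the phi feeds vctp directly
        for lane, enabled in enumerate(mask):
            if enabled:
                total += data[base + lane]
        remaining -= lanes  # the sub computes the next iteration's value
        base += lanes
    return total
```

If the subtraction were applied before the `vctp` call in this model, the ten-element case would see counts of 6, 2 and -2 instead of 10, 6 and 2, and the last vector of elements would be skipped.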

Differential Revision: https://reviews.llvm.org/D67921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373188 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 08:03:23 +00:00
Roger Ferrer Ibanez
ca6e7338d6 [TargetLowering] Simplify expansion of S{ADD,SUB}O
ISD::SADDO uses the suggested sequence described in §2.4 of
the RISCV Spec v2.2. ISD::SSUBO uses the dual approach but checking for
(non-zero) positive.
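
As a hedged sketch of the two checks in Python: for addition, the sum should be less than one operand exactly when the other operand is negative, and for subtraction the difference should be less than the left operand exactly when the right operand is positive. The helper names `wrap`, `saddo` and `ssubo` and the default 32-bit width are assumptions of this example, not the TargetLowering code.

```python
def wrap(value, bits=32):
    """Wrap an integer to a two's-complement signed value of width `bits`."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def saddo(a, b, bits=32):
    """Signed add: overflow iff (b < 0) differs from (sum < a)."""
    total = wrap(a + b, bits)
    return total, (b < 0) != (total < a)


def ssubo(a, b, bits=32):
    """Signed subtract: overflow iff (b > 0) differs from (diff < a)."""
    diff = wrap(a - b, bits)
    return diff, (b > 0) != (diff < a)
```

For example, `saddo(2**31 - 1, 1)` wraps to the minimum 32-bit value and reports overflow, while `ssubo(5, 3)` returns 2 with no overflow.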

Differential Revision: https://reviews.llvm.org/D47927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373187 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 07:58:50 +00:00
Sam Parker
58649a3897 [ARM][CGP] Allow signext arguments
As we perform a zext on any arguments used in the promoted tree, it
doesn't matter if they're marked as signext. The only permitted
user(s) in the tree which would interpret the sign bits are signed
icmps. For these instructions, their promoted operands are truncated
before the icmp uses them.

Differential Revision: https://reviews.llvm.org/D68019

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373186 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 07:52:10 +00:00
Jonas Paulsson
3adcef5bd4 [SystemZ] Add SystemZPostRewrite in addPostRegAlloc() instead at -O0.
SystemZPostRewrite may emit COPYs, so it needs to run before the Post-RA
pseudo pass, also at -O0. It should therefore be added in addPostRegAlloc().

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373182 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 07:29:54 +00:00
Matt Arsenault
fe518bfae1 AMDGPU/GlobalISel: Fix select for v2s16 and/or/xor
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373180 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 06:31:30 +00:00