590 Commits

Author SHA1 Message Date
Kevin P. Neal
5a8b466d55 [FPEnv] Strict FP tests should use the requisite function attributes.
A set of function attributes is required in any function that uses constrained
floating point intrinsics. None of our tests use these attributes.

This patch fixes this.

These tests have been tested against the IR verifier changes in D68233.

Reviewed by:	andrew.w.kaylor, cameron.mcinally, uweigand
Approved by:	andrew.w.kaylor
Differential Revision:	https://reviews.llvm.org/D67925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373761 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-04 17:03:46 +00:00
Jonas Paulsson
3adcef5bd4 [SystemZ] Add SystemZPostRewrite in addPostRegAlloc() instead at -O0.
SystemZPostRewrite needs to be run before (it may emit COPYs) the Post-RA
pseudo pass also at -O0, so it should be added in addPostRegAlloc().

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373182 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 07:29:54 +00:00
Kai Nacke
12e921eb1b Change -march=systemz to triple and fix test
These two test cases use -march=systemz instead of a triple. In
particular, the used file format is then based on the default host
triple. This leads to different behaviour on different platforms.

The SystemZ implementation uses the integrated assembler for a
long time now. The mature-mc-support test can be fully enabled.

Differential Revision: https://reviews.llvm.org/D68129

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373098 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-27 16:19:15 +00:00
Jonas Paulsson
c0a66ac414 [SystemZ] Recognize mnop-mcount in backend
With -pg -mfentry -mnop-mcount, a nop is emitted instead of the call to
fentry.

Review: Ulrich Weigand
https://reviews.llvm.org/D67765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372950 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-26 08:38:07 +00:00
Jonas Paulsson
e22ce6ac6a [SystemZ] Improve emitSelect()
Merge more Select pseudo instructions in emitSelect() by allowing other
instructions between them as long as they do not clobber CC.

Debug value instructions are now moved down to below the new PHIs instead of
erasing them.

Review: Ulrich Weigand
https://reviews.llvm.org/D67619

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372873 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-25 14:00:33 +00:00
Ulrich Weigand
3e44783bf7 [SystemZ] Support z15 processor name
The recently announced IBM z15 processor implements the architecture
already supported as "arch13" in LLVM.  This patch adds support for
"z15" as an alternate architecture name for arch13.

The patch also uses z15 in a number of places where we used arch13
as long as the official name was not yet announced.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372435 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-20 23:04:45 +00:00
Guillaume Chatelet
75f0bef615 [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing
Summary:
This catches malformed mir files which specify alignment as log2 instead of pow2.
See https://reviews.llvm.org/D65945 for reference,

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371608 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 11:16:48 +00:00
Guozhi Wei
5423bd20ef [BPI] Adjust the probability for floating point unordered comparison
Since NaN is very rare in normal programs, so the probability for floating point unordered comparison should be extremely small. Current probability is 3/8, it is too large, this patch changes it to a tiny number.

Differential Revision: https://reviews.llvm.org/D65303



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371541 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-10 17:25:11 +00:00
Jonas Paulsson
e9b99edbbb [SystemZ] Recognize INLINEASM_BR in backend
Handle the remaining cases also by handling asm goto in
SystemZInstrInfo::getBranchInfo().

Review: Ulrich Weigand
https://reviews.llvm.org/D67151

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371048 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-05 10:20:05 +00:00
Jonas Paulsson
8e4143afe8 [SystemZ] Recognize INLINEASM_BR in backend.
SystemZInstrInfo::analyzeBranch() needs to check for INLINEASM_BR
instructions, or it will crash.

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370753 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-03 13:31:22 +00:00
Jonas Paulsson
e491ffc957 [SystemZ] Add support for fentry.
SystemZAsmPrinter now properly emits function calls to __fentry__.

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370743 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-03 11:21:12 +00:00
Ulrich Weigand
3d6f38ae74 [SystemZ] Support constrained fpto[su]i intrinsics
Now that constrained fpto[su]i intrinsic are available,
add codegen support to the SystemZ backend.

In addition to pure back-end changes, I've also needed
to add the strict_fp_to_[su]int and any_fp_to_[su]int
pattern fragments in the obvious way.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370674 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-02 16:49:29 +00:00
Simon Pilgrim
02b0ff16a7 [SystemZ] Regenerate <8 x i31> store test
To help show the diffs from an upcoming SimplifyDemandedBits patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367216 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-29 09:49:23 +00:00
Simon Pilgrim
c0664972a0 [TargetLowering] Add SimplifyMultipleUseDemandedBits
This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demanded all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured.

The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use.

We do see a couple of regressions that need to be addressed:

    AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better).
	
    X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG.

The code owners have confirmed its ok for these cases to fixed up in future patches.

Differential Revision: https://reviews.llvm.org/D63281

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366799 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-23 12:39:08 +00:00
Ulrich Weigand
a63177906c [Strict FP] Allow more relaxed scheduling
Reimplement scheduling constraints for strict FP instructions in
ScheduleDAGInstrs::buildSchedGraph to allow for more relaxed
scheduling.  Specifially, allow one strict FP instruction to
be scheduled across another, as long as it is not moved across
any global barrier.

Differential Revision: https://reviews.llvm.org/D64412

Reviewed By: cameron.mcinally



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366222 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-16 15:55:45 +00:00
Nikita Popov
bbb1b4ffaf [SystemZ] Fix addcarry of addcarry of const carry (PR42606)
This fixes https://bugs.llvm.org/show_bug.cgi?id=42606 by extending
D64213. Instead of only checking if the carry comes from a matching
operation, we now check the full chain of carries. Otherwise we might
custom lower the outermost addcarry, but then generically legalize
an inner addcarry.

Differential Revision: https://reviews.llvm.org/D64658

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365949 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-12 20:03:34 +00:00
Ulrich Weigand
29e5bc5c35 [SystemZ] Add support for new cpu architecture - arch13
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Assembler/disassembler support for new instructions.
- CodeGen for new instructions, including new LLVM intrinsics.
- Scheduler description for the new processor.
- Detection of arch13 as host processor.

Note: No currently available Z system supports the arch13
architecture.  Once new systems become available, the
official system name will be added as supported -march name.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365932 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-12 18:13:16 +00:00
Quentin Colombet
99e372c260 [RegisterCoalescer] Fix an overzealous assert
Although removeCopyByCommutingDef deals with full copies, it is still
possible to copy undef lanes and thus, we wouldn't have any a value
number for these lanes.

This fixes PR40215.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365256 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-06 00:34:54 +00:00
Nikita Popov
95c68f1139 [SystemZ] Fix addcarry of usubo (PR42512)
Only custom lower uaddo+addcarry or usubo+subcarry chains and leave
mixtures like usubo+addcarry or uaddo+subcarry to the generic
legalizer. Otherwise we run into issues because SystemZ uses
different CC values for carries and borrows.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42512.

Differential Revision: https://reviews.llvm.org/D64213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365242 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-05 20:35:11 +00:00
Ulrich Weigand
3f58e45931 Allow matching extend-from-memory with strict FP nodes
This implements a small enhancement to https://reviews.llvm.org/D55506

Specifically, while we were able to match strict FP nodes for
floating-point extend operations with a register as source, this
did not work for operations with memory as source.

That is because from regular operations, this is represented as
a combined "extload" node (which is a variant of a load SD node);
but there is no equivalent using a strict FP operation.

However, it turns out that even in the absence of an extload
node, we can still just match the operations explicitly, e.g.
   (strict_fpextend (f32 (load node:$ptr))

This patch implements that method to match the LDEB/LXEB/LXDB
SystemZ instructions even when the extend uses a strict-FP node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364450 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 17:19:12 +00:00
Ulrich Weigand
d3bcee4790 [SystemZ] Support vector load/store alignment hints
Vector load/store instructions support an optional alignment field
that the compiler can use to provide known alignment info to the
hardware.  If the field is used (and the information is correct),
the hardware may be able (on some models) to perform faster memory
accesses than otherwise.

This patch adds support for alignment hints in the assembler and
disassembler, and fills in known alignment during codegen.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363806 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 14:20:00 +00:00
Matt Arsenault
5b56cc85b0 Rename ExpandISelPseudo->FinalizeISel, delay register reservation
This allows targets to make more decisions about reserved registers
after isel. For example, now it should be certain there are calls or
stack objects in the frame or not, which could have been introduced by
legalization.

Patch by Matthias Braun

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363757 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 00:25:39 +00:00
Jonas Paulsson
066bfdcdb8 [SystemZ] Fix AHIMuxK pseudo expansion.
Do not emit a copy if the source and destination registers are the same.

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363665 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-18 12:10:02 +00:00
Luis Marques
e36a3e35dd [DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting
Some GEPs were not being split, presumably because that split would just be 
undone by the DAGCombiner. Not performing those splits can prevent important 
optimizations, such as preventing the element indices / member offsets from 
being (partially) folded into load/store instruction immediates. This patch:

- Makes the splits also occur in the cases where the base address and the GEP 
  are in the same BB.
- Ensures that the DAGCombiner doesn't reassociate them back again.

Differential Revision: https://reviews.llvm.org/D60294



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363544 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 10:54:12 +00:00
Fangrui Song
1002960b9d [lit] Delete empty lines at the end of lit.local.cfg NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363538 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 09:51:07 +00:00
Sander de Smalen
f4bff34d4d Describe stack-id as an enum
This patch changes MIR stack-id from an integer to an enum,
and adds printing/parsing support for this in MIR files. The default
stack-id '0' is now renamed to 'default'.

This should make MIR tests that have stack objects with different stack-ids
more descriptive. It also clarifies code operating on StackID.

Reviewers: arsenm, thegameg, qcolombet

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D60137


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363533 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 09:13:29 +00:00
Guozhi Wei
7eae8125c6 [MBP] Move a latch block with conditional exit and multi predecessors to top of loop
Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another common case is:

    * a latch block
    * it has two successors, one is loop header, another is exit
    * it has more than one predecessors

If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch.

Differential Revision: https://reviews.llvm.org/D43256




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363471 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-14 23:08:59 +00:00
Francis Visoiu Mistrih
c030038e73 [FastISel] Skip creating unnecessary vregs for arguments
This behavior was added in r130928 for both FastISel and SD, and then
disabled in r131156 for FastISel.

This re-enables it for FastISel with the corresponding fix.

This is triggered only when FastISel can't lower the arguments and falls
back to SelectionDAG for it.

FastISel contains a map of "register fixups" where at the end of the
selection phase it replaces all uses of a register with another
register that FastISel sometimes pre-assigned. Code at the end of
SelectionDAGISel::runOnMachineFunction is doing the replacement at the
very end of the function, while other pieces that come in before that
look through the MachineFunction and assume everything is done. In this
case, the real issue is that the code emitting COPY instructions for the
liveins (physreg to vreg) (EmitLiveInCopies) is checking if the vreg
assigned to the physreg is used, and if it's not, it will skip the COPY.
If a register wasn't replaced with its assigned fixup yet, the copy will
be skipped and we'll end up with uses of undefined registers.

This fix moves the replacement of registers before the emission of
copies for the live-ins.

The initial motivation for this fix is to enable tail calls for
swiftself functions, which were blocked because we couldn't prove that
the swiftself argument (which is callee-save) comes from a function
argument (live-in), because there was an extra copy (vreg to vreg).

A few tests are affected by this:

* llvm/test/CodeGen/AArch64/swifterror.ll: we used to spill x21
(callee-save) but never reload it because it's attached to the return.
We now don't even spill it anymore.
* llvm/test/CodeGen/*/swiftself.ll: we tail-call now.
* llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll: I believe this
test was not really testing the right thing, but it worked because the
same registers were re-used.
* llvm/test/CodeGen/ARM/cmpxchg-O0.ll: regalloc changes
* llvm/test/CodeGen/ARM/swifterror.ll: get rid of a copy
* llvm/test/CodeGen/Mips/*: get rid of spills and copies
* llvm/test/CodeGen/SystemZ/swift-return.ll: smaller stack
* llvm/test/CodeGen/X86/atomic-unordered.ll: smaller stack
* llvm/test/CodeGen/X86/swifterror.ll: same as AArch64
* llvm/test/DebugInfo/X86/dbg-declare-arg.ll: stack size changed

Differential Revision: https://reviews.llvm.org/D62361

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362963 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-10 16:53:37 +00:00
Jonas Paulsson
ff136c756d [SystemZ, RegAlloc] Favor 3-address instructions during instruction selection.
This patch aims to reduce spilling and register moves by using the 3-address
versions of instructions per default instead of the 2-address equivalent
ones. It seems that both spilling and register moves are improved noticeably
generally.

Regalloc hints are passed to increase conversions to 2-address instructions
which are done in SystemZShortenInst.cpp (after regalloc).

Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are
the same), foldMemoryOperandImpl() can no longer trivially fold a spilled
source register since the reg/reg instruction is now 3-address. In order to
remedy this, new 3-address pseudo memory instructions are used to perform the
folding only when the dst and lhs virtual registers are known to be allocated
to the same physreg. In order to not let MachineCopyPropagation run and
change registers on these transformed instructions (making it 3-address), a
new target pass called SystemZPostRewrite.cpp is run just after
VirtRegRewriter, that immediately lowers the pseudo to a target instruction.

If it would have been possibe to insert a COPY instruction and change a
register operand (convert to 2-address) in foldMemoryOperandImpl() while
trusting that the caller (e.g. InlineSpiller) would update/repair the
involved LiveIntervals, the solution involving pseudo instructions would not
have been needed. This is perhaps a potential improvement (see Phabricator
post).

Common code changes:

* A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a
target pass immediately before MachineCopyPropagation.

* VirtRegMap is passed as an argument to foldMemoryOperand().

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D60888

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362868 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-08 06:19:15 +00:00
Ulrich Weigand
ef54162998 Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point
intrinsics are currently never passed to the target back-end, which makes
it impossible to handle them correctly (e.g. mark instructions are depending
on a floating-point status and control register, or mark instructions as
possibly trapping).

This patch allows the target to use setOperationAction to switch the action
on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code
will stop converting the STRICT nodes to regular floating-point nodes, but
instead pass the STRICT nodes to the target using normal SelectionDAG
matching rules.

To avoid having the back-end duplicate all the floating-point instruction
patterns to handle both strict and non-strict variants, we make the MI
codegen explicitly aware of the floating-point exceptions by introducing
two new concepts:

- A new MCID flag "mayRaiseFPException" that the target should set on any
  instruction that possibly can raise FP exception according to the
  architecture definition.
- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI
  instruction resulting from expansion of any constrained FP intrinsic.

Any MI instruction that is *both* marked as mayRaiseFPException *and*
FPExcept then needs to be considered as raising exceptions by MI-level
codegen (e.g. scheduling).

Setting those two new flags is straightforward. The mayRaiseFPException
flag is simply set via TableGen by marking all relevant instruction
patterns in the .td files.

The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes
in the SelectionDAG, and gets inherited in the MachineSDNode nodes created
from it during instruction selection. The flag is then transfered to an
MIFlag when creating the MI from the MachineSDNode. This is handled just
like fast-math flags like no-nans are handled today.

This patch includes both common code changes required to implement the
new features, and the SystemZ implementation.

Reviewed By: andrew.w.kaylor

Differential Revision: https://reviews.llvm.org/D55506



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362663 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-05 22:33:10 +00:00
QingShan Zhang
117b137213 [NFC] Update the test to check the endianness after the CodeGenPrepare instead of checking the assembly instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362471 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 08:45:07 +00:00
Simon Pilgrim
58c0bf1241 [SystemZ] Remove sitofp(undef) from reduced test case.
Pre-commit for D62807 - which adds DAG [us]itofp(undef) --> 0 constant fold

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362396 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 12:58:36 +00:00
Kevin P. Neal
c1077e6b40 Revert revert of r362112 with minor SystemZ test file corrections.
[FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes

This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully.

Submitted by:	Drew Wock <drew.wock@sas.com>
Reviewed by:	Cameron McInally, Kevin P. Neal
Approved by:	Cameron McInally
Differential Revision:	https://reviews.llvm.org/D62546


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362241 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-31 16:32:12 +00:00
Roman Lebedev
58c76e2b4e [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 3
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362143 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 20:37:18 +00:00
Kevin P. Neal
17b854c7c2 Revert r362112, it broke the bots with the message "Unsupported vector argument or return type"
Differential Revision:	http://reviews.llvm.org/D62546


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362117 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 17:10:21 +00:00
Kevin P. Neal
618cbae770 [FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes
This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully.

Submitted by:	Drew Wock <drew.wock@sas.com>
Reviewed by:	Cameron McInally, Kevin P. Neal
Approved by:	Cameron McInally
Differential Revision:	http://reviews.llvm.org/D62546


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362112 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 16:44:47 +00:00
Roman Lebedev
ed880bf1d6 [DAGCombine] Revert of recommit of "binop-with-const hoisting" patches
I was looking into an endless combine loop the uncommitted follow-up patch
was causing, and it appears even these patches can exibit such an
endless loop. The root cause is that we try to hoist one binop (add/sub) with
constant operand, and if we get two such binops both of which are
eligible for this hoisting, we get stuck.

Some cases may highlight missing constant-folds.

Reverts r361871,r361872,r361873,r361874.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362109 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 16:07:11 +00:00
Roman Lebedev
1504fb4b30 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 2
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

This is a recommit, originally committed in rL361853, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361872 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 20:39:55 +00:00
Roman Lebedev
f8b1164eaa Revert DAGCombine "hoist binop with const" folds
Appear to introduce test-suite compile-time hang.

http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/22825

This reverts r361852,r361853,r361854,r361855,r361856

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361865 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 19:04:21 +00:00
Roman Lebedev
7728af49e5 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361853 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 17:53:54 +00:00
Kees Cook
40fd361936 [TargetLowering] Extend bool args to inline-asm according to getBooleanType
Summary:
This extends Krzysztof Parzyszek's X86-specific solution
(https://reviews.llvm.org/D60208) to the generic code pointed out by
James Y Knight.

Reviewers: kparzysz, craig.topper, nickdesaulniers

Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60224


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361404 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-22 16:16:15 +00:00
Roman Lebedev
928e0daca7 [NFC][SystemZ] Autogenerate alloca-03.ll test to make test changes more visible
The check lines are being affected by an upcoming patch,
regenerate the checklines to visualize the changes better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361380 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-22 13:04:24 +00:00
Jonas Paulsson
71efe3e09f [SystemZ] Bugfix in SystemZTargetLowering::combineIntDIVREM()
Make sure to not unroll a vector division/remainder (with a constant splat
divisor) after type legalization, since the scalar type may then be illegal.

Review: Ulrich Weigand
https://reviews.llvm.org/D62036

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360965 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 00:50:35 +00:00
Ulrich Weigand
12d462cd1b [SystemZ] Model floating-point control register
This adds the FPC (floating-point control register) as a reserved
physical register and models its use by SystemZ instructions.

Note that only the current rounding modes and the IEEE exception
masks are modeled.  *Changes* of the FPC due to exceptions (in
particular the IEEE exception flags and the DXC) are not modeled.

At this point, this patch is mostly NFC, but it will prevent
scheduling of floating-point instructions across SPFC/LFPC etc.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360570 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-13 09:47:26 +00:00
Bjorn Pettersson
57b216445b [DAG] Refactor DAGCombiner::ReassociateOps
Summary:
Extract the logic for doing reassociations
from DAGCombiner::reassociateOps into a helper
function DAGCombiner::reassociateOpsCommutative,
and use that helper to trigger reassociation
on the original operand order, or the commuted
operand order.

Codegen is not identical since the operand order will
be different when doing the reassociations for the
commuted case. That causes some unfortunate churn in
some test cases. Apart from that this should be NFC.

Reviewers: spatel, craig.topper, tstellar

Reviewed By: spatel

Subscribers: dmgreen, dschuff, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359476 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 17:50:10 +00:00
Nick Desaulniers
b3cb8ab451 [AsmPrinter] refactor to support %c w/ GlobalAddress'
Summary:
Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when
printing the address of a MachineOperand::MO_GlobalAddress. Move that
handling into a new overriden method in each base class. A virtual
method was added to the base class for handling the generic case.

Refactors a few subclasses to support the target independent %a, %c, and
%n.

The patch also contains small cleanups for AVRAsmPrinter and
SystemZAsmPrinter.

It seems that NVPTXTargetLowering is possibly missing some logic to
transform GlobalAddressSDNodes for
TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended
inline assembly asm constraints.

Fixes:
- https://bugs.llvm.org/show_bug.cgi?id=41402
- https://github.com/ClangBuiltLinux/linux/issues/449

Reviewers: echristo, void

Reviewed By: void

Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359337 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 18:45:04 +00:00
David Green
45a375eb6b Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg
Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not
seeing through to the constant in other blocks. Revert this patch while we come
up with a better way to handle that.

I will try to follow this up with some better tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358113 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 18:00:41 +00:00
Craig Topper
b77427871e [DAGCombiner][X86][SystemZ] Canonicalize SSUBO with immediate RHS to SADDO by negating the immediate.
This lines up with what we do for regular subtract and it matches up better with X86 assumptions in isel patterns that add with immediate is more canonical than sub with immediate.

Differential Revision: https://reviews.llvm.org/D60020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358027 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-09 18:33:56 +00:00
Piotr Sobczak
959b42493f [SelectionDAG] Compute known bits of CopyFromReg
Summary:
Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if
the virtual reg used has one def only.

This can be particularly useful when calling isBaseWithConstantOffset()
with the ISD::CopyFromReg argument, as more optimizations may get enabled
in the result.

Also add a missing truncation on X86, found by testing of this patch.

Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa

Reviewers: bogner, craig.topper, RKSimon

Reviewed By: RKSimon

Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59535

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357745 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 07:44:09 +00:00
Jonas Paulsson
8d88d919bb [SystemZ] Bugfix in isFusableLoadOpStorePattern()
This function is responsible for checking the legality of fusing an instance
of load -> op -> store into a single operation. In the SystemZ backend the
check was incomplete and a test case emerged with a cycle in the instruction
selection DAG as a result.

Instead of using the NodeIds to determine node relationships,
hasPredecessorHelper() now is used just like in the X86 backend. This handled
the failing tests and as well gave a few additional transformations on
benchmarks.

The SystemZ isFusableLoadOpStorePattern() is now a very near copy of the X86
function, and it seems this could be made a utility function in common code
instead.

Review: Ulrich Weigand
https://reviews.llvm.org/D60255

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357688 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-04 12:12:35 +00:00