Commit Graph

1345 Commits

Author SHA1 Message Date
David Tenty
ab4afead09 [NFC]Fix IR/MC depency issue for function descriptor SDAG implementation
Summary: llvm/IR/GlobalValue.h can't be included in MC, that creates a circular dependency between MC and IR libraries. This circular dependency is causing an issue for build system that enforce layering.

Author: Xiangling_L

Reviewers: sfertile, jasonliu, hubert.reinterpretcast, gribozavr

Reviewed By: gribozavr

Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64445

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365701 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-10 22:13:55 +00:00
Fangrui Song
8048f0070d [PowerPC] Support constraint code "ww"
Summary:
"ww" and "ws" are both constraint codes for VSX vector registers that
hold scalar double data. "ww" is preferred for float while "ws" is
preferred for double.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D64119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365106 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-04 04:44:42 +00:00
Roman Lebedev
7429560931 [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)
Summary:
This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]].
In middle-end, we'd want to prefer the form with two adds - D63992,
but as this diff shows, not every target will prefer that pattern.

Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars,
but only X86 prefer that same pattern for vectors.

Here i'm adding a new TLI hook, always defaulting to the inc-of-add,
but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars.

Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel

Reviewed By: efriedma

Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365010 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-03 09:41:35 +00:00
Jinsong Ji
e946fcb412 [PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others
This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751
llvm-mc aborted when disassembling tabortdc.

This patch try to clean up TM related DAGs.

* Fixes the problem by remove explicit output of cr0, and put it as implicit def.
* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
* Update the TCHECK operand and int_ppc_tcheck accordingly.
* Add some builtin test and disassembly tests.
* Remove unused CRRC0/crrc0

Differential Revision: https://reviews.llvm.org/D61935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364544 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 14:11:31 +00:00
Nemanja Ivanovic
4044bf9b0d [PowerPC] Mark FCOPYSIGN legal for FP vectors
This was just an omission in the back end. We have had the instructions for both
single and double precision for a few HW generations, but never got around to
legalizing these.

Differential revision: https://reviews.llvm.org/D63634


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364373 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 01:48:57 +00:00
Matt Arsenault
a2b05bc24d CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364191 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 15:50:29 +00:00
Justin Hibbits
05b9698f31 PowerPC: Optimize SPE double parameter calling setup
Summary:
SPE passes doubles the same as soft-float, in register pairs as i32
types.  This is all handled by the target-independent layer.  However,
this is not optimal when splitting or reforming the doubles, as it
pushes to the stack and loads from, on either side.

For instance, to pass a double argument to a function, assuming the
double value is in r5, the sequence currently looks like this:

    evstdd      5, X(1)
    lwz         3, X(1)
    lwz         4, X+4(1)

Likewise, to form a double into r5 from args in r3 and r4:

    stw         3, X(1)
    stw         4, X+4(1)
    evldd       5, X(1)

This optimizes the fence to use SPE instructions.  Now, to pass a double
to a function:

    mr          4, 5
    evmergehi   3, 5, 5

And to form a double into r5 from args in r3 and r4:

    evmergelo   5, 3, 4

This is comparable to the way that gcc generates the double splits.

This also fixes a bug with expanding builtins to libcalls, where the
LowerCallTo() code path was generating intermediate illegal type nodes.

Reviewers: nemanjai, hfinkel, joerg

Subscribers: kbarton, jfb, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D54583

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363526 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 03:15:23 +00:00
Kang Zhang
f8a4e52cc9 [PowerPC] Set the innermost hot loop to align 32 bytes
Summary:
If the nested loop is an innermost loop, prefer to a 32-byte alignment, so that
we can decrease cache misses and branch-prediction misses. Actual alignment of
 the loop will depend on the hotness check and other logic in alignBlocks.

The old code will only align hot loop to 32 bytes when the LoopSize larger than
16 bytes and smaller than 32 bytes, this patch will align the innermost hot loop
 to 32 bytes not only for the hot loop whose size is 16~32 bytes.

Reviewed By: steven.zhang, jsji

Differential Revision: https://reviews.llvm.org/D61228


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363495 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-15 15:10:24 +00:00
Simon Pilgrim
9b95d3f22f [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123)
As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space.

This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them.

If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores.

Differential Revision: https://reviews.llvm.org/D63075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363179 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 17:14:03 +00:00
Sam Parker
c313a177b4 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362774 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-07 07:35:30 +00:00
Nemanja Ivanovic
03ae046337 [PowerPC] Exploit the vector min/max instructions
Use the PPC vector min/max instructions for computing the corresponding
operation as these should be faster than the compare/select sequences
we currently emit.

Differential revision: https://reviews.llvm.org/D47332


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362759 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 23:49:01 +00:00
Jason Liu
3d1f1eb045 [AIX] Implement function descriptor on SDAG
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
 * A function descriptor (Name)
 * A function entry point (.Name)

The descriptor structure on AIX is the same as those in the ELF V1 ABI:
 * The address of the entry point of the function.
 * The TOC base address for the function.
 * The environment pointer.

The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
 function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".

Which symbol gets referenced depends on the context:
 * Taking the address of the function references the descriptor symbol.
 * Calling the function references the entry point symbol.

(2) Speaking of implementation on AIX, for direct function call target, we
 create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
 replace original TargetGlobalAddress SDNode. Then down the path, we can
 take advantage of this MCSymbol.

Patch by: Xiangling_L

Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara

Differential Revision: https://reviews.llvm.org/D62532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362735 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 19:13:36 +00:00
Jason Liu
be6d1bcd6a [AIX] Implement call lowering with parameters could pass onto GPRs
Summary:
This patch implements SDAG call lowering on AIX for functions
which only have parameters that could fit into GPRs.

Reviewers: hubert.reinterpretcast, syzaara

Differential Revision: https://reviews.llvm.org/D62823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362708 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 14:36:43 +00:00
Jason Liu
ea8ee651a9 Implement call lowering without parameters on AIX
Summary:dd
This patch implements call lowering for calls without parameters
on AIX as initial support.

Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma

Differential Revision: https://reviews.llvm.org/D61948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361669 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-24 20:54:35 +00:00
Chen Zheng
0b2b4e1922 [PowerPC] [ISEL] select x-form instruction for unaligned offset
Differential Revision: https://reviews.llvm.org/D62173


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361346 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-22 02:57:31 +00:00
Chen Zheng
04c01a17c3 [PowerPC] use more meaningful name - NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361218 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-21 03:54:42 +00:00
Lei Huang
5238a36901 [PowerPC] custom lower v2f64 fpext v2f32
Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32.

eg. For the following IR
  %0 = load <2 x float>, <2 x float>* %Ptr, align 8
  %1 = fpext <2 x float> %0 to <2 x double>
  ret <2 x double> %1

Pre custom lowering:
  ld r3, 0(r3)
  mtvsrd f0, r3
  xxswapd vs34, vs0
  xscvspdpn f0, vs0
  xxsldwi vs1, vs34, vs34, 3
  xscvspdpn f1, vs1
  xxmrghd vs34, vs0, vs1

After custom lowering:
  lfd f0, 0(r3)
  xxmrghw vs0, vs0, vs0
  xvcvspdp vs34, vs0

Differential Revision: https://reviews.llvm.org/D57857

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360429 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-10 14:04:06 +00:00
Nemanja Ivanovic
9817b74a74 [PowerPC] Use the two-constant NR algorithm for refining estimates
The single-constant algorithm produces infinities on a lot of denormal values.
The precision of the two-constant algorithm is actually sufficient across the
range of denormals. We will switch to that algorithm for now to avoid the
infinities on denormals. In the future, we will re-evaluate the algorithm to
find the optimal one for PowerPC.

Differential revision: https://reviews.llvm.org/D60037


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360144 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-07 13:48:03 +00:00
Nemanja Ivanovic
8d416103aa [PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion
A condition for exiting the legalization of v4i32 conversion to v2f64 through
extract/convert/build erroneously checks for the extract having type i32.
This is not adequate as smaller extracts are actually legalized to i32 as well.
Furthermore, an early exit is missing which means that we only check that
both extracts are from the same vector if that check fails.
As a result, both cases in the included test case fail - the first gets a
select error and the second generates incorrect code.

The culprit commit is r274535.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360043 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-06 13:35:49 +00:00
Simon Pilgrim
a5aaefa640 Avoid cppcheck operator precedence warnings. NFCI.
Prefer ((X & Y) ? A : B) to (X & Y ? A : B)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359884 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 13:50:38 +00:00
Kang Zhang
de63356ef3 [NFC][PowerPC] Return early if the element type is not byte-sized in combineBVOfConsecutiveLoads
Summary:
Based on the Eli Friedman's comments in https://reviews.llvm.org/D60811 , we'd better return early if the element type is not byte-sized in `combineBVOfConsecutiveLoads`.

Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D61076


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359764 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-02 08:15:13 +00:00
Sjoerd Meijer
67c2691c8d [TargetLowering] Change getOptimalMemOpType to take a function attribute list
The MachineFunction wasn't used in getOptimalMemOpType, but more importantly,
this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType.

This is the groundwork for the changes in D59766 and D59787, that allows
implementation of TTI::getMemcpyCost.

Differential Revision: https://reviews.llvm.org/D59785


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359537 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-30 08:38:12 +00:00
Roland Froese
076a39af99 [PowerPC] Try harder to avoid load/move-to VSR for partial vector loads
Change the PPCISelLowering.cpp function that decides to avoid update form in
favor of partial vector loads to know about newer load types and to not be
confused by the chain operand.

Differential Revision: https://reviews.llvm.org/D60102



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359504 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 21:08:35 +00:00
Joerg Sonnenberger
4eb4e8f21e [PowerPC] Allow using initial-exec TLS with PIC
Using initial-exec TLS variables is a reasonable performance
optimisation for system libraries. Use the correct PIC mechanism to get
hold of the GOT to avoid text relocations.

Differential Revision: https://reviews.llvm.org/D61026


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359146 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-24 22:12:22 +00:00
Sean Fertile
4d4155f08e Add period at end of comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359144 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-24 21:51:30 +00:00
Kang Zhang
a2c8107c80 [PowerPC] Fix wrong ElemSIze when calling isConsecutiveLS()
Summary:
This issue from the bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177

When the two operands for BUILD_VECTOR are same, we will get assert error.
llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&):
Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) &&
"The loads cannot be both consecutive and reverse consecutive."' failed.

This error caused by the wrong ElemSIze when calling isConsecutiveLS(). We
should use `getScalarType().getStoreSize();` to get the ElemSize instread of
 `getScalarSizeInBits() / 8`.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60811


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358644 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-18 07:24:15 +00:00
Evandro Menezes
d71ea05ab7 [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357731 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-04 22:40:06 +00:00
Kang Zhang
16c4a8836b [PowerPC] Add the support for __builtin_setrnd()
Summary:
PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode.
double __builtin_setrnd(int mode);
The effective values for mode are:
0 - round to nearest
1 - round to zero
2 - round to +infinity
3 - round to -infinity
Note that the mode argument will modulo 4, so if the int argument is greater than 3, it will only use the least significant two bits of the mode. Namely, builtin_setrnd(102)) is equal to builtin_setrnd(2).

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D59405


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357241 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 08:45:24 +00:00
Zi Xuan Wu
fc1bf1b19a [PowerPC] Strength reduction of multiply by a constant by shift and add/sub in place
A shift and add/sub sequence combination is faster in place of a multiply by constant. 
Because the cycle or latency of multiply is not huge, we only consider such following
worthy patterns.

```
(mul x, 2^N + 1) => (add (shl x, N), x)
(mul x, -(2^N + 1)) => -(add (shl x, N), x)
(mul x, 2^N - 1) => (sub (shl x, N), x)
(mul x, -(2^N - 1)) => (sub x, (shl x, N))
```

And the cycles or latency is subtarget-dependent so that we need consider the
subtarget to determine to do or not do such transformation. 
Also data type is considered for different cycles or latency to do multiply.

Differential Revision: https://reviews.llvm.org/D58950


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357233 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 03:08:39 +00:00
Simon Pilgrim
04a982d9fb Fix for ABS legalization on PPC buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356498 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 18:55:46 +00:00
Simon Pilgrim
5425f5c42b [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:

    SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.

    Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.

    Add promoting the integer ABS node in the LegalizeIntegerType.

    Expand-based legalization of integer result for the ABS nodes.

    Expand-based legalization of ABS vector operations.

    Add some integer abs testcases for different typesizes for Thumb arch

    Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
        tmp = (SRA, Hi, 31)
        Lo = (UADDO tmp, Lo)
        Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
        Lo = (XOR tmp, Lo)

    The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
        (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).

    Change integer abs testcases for codegen with the ABS node support for AArch64.
        Indicate that the ABS is legal for the i64 type when the NEON is supported.
        Change the integer abs testcases to show changing of codegen.

    Add combine and legalization of ABS nodes for Thumb arch.

    Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.

For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D49837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356468 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 16:24:55 +00:00
Adhemerval Zanella
0ce3660e40 [TargetLowering] Add code size information on isFPImmLegal. NFC
This allows better code size for aarch64 floating point materialization
in a future patch.

Reviewers: evandro

Differential Revision: https://reviews.llvm.org/D58690



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356389 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-18 18:40:07 +00:00
Roland Froese
0783d48fed [PowerPC] Avoid scalarization of vector truncate
The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead.

Differential Revision: https://reviews.llvm.org/D56507


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353724 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-11 17:29:14 +00:00
Reid Kleckner
732eb58985 [PPC] Include tablegenerated PPCGenCallingConv.inc once
Move the CC analysis implementation to its own .cpp file instead of
duplicating it and artificually using functions in PPCISelLowering.cpp
and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and
AArch64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352444 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-29 00:30:35 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Kang Zhang
55e4ae96b0 [PowerPC] Fix machine verify pass error for PATCHPOINT pseudo instruction that bad machine code
Summary:
For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo.
Then the verifier runs and it seems like we have a use of an undefined register (the register will 
be reserved later, but the verifier doesn't know that).

So this patch call setUsesTOCBasePtr before emit the code for the pseudo, so verifier can know 
X2 is a reserved register.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D56148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350165 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-30 15:13:51 +00:00
Nemanja Ivanovic
4125b366e1 [PowerPC] Complete the custom legalization of vector int to fp conversion
A recent patch has added custom legalization of vector conversions of
v2i16 -> v2f64. This just rounds it out for other types where the input vector
has an illegal (narrower) type than the result vector. Specifically, this will
handle the following conversions:

v2i8 -> v2f64
v4i8 -> v4f32
v4i16 -> v4f32

Differential revision: https://reviews.llvm.org/D54663


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350155 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-29 13:40:48 +00:00
Zi Xuan Wu
99610fde48 [NFC] clang-format functions related to r350113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350114 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-28 02:45:17 +00:00
Zi Xuan Wu
b534db940c [PowerPC] Fix assert from machine verify pass that atomic pseudo expanding causes mismatched register class
For atomic value operand which less than 4 bytes need to be masked. 
And the related operation to calculate the newvalue can be done in 32 bit gprc. 
So just use gprc for mask and value calculation.

Differential Revision: https://reviews.llvm.org/D56077



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350113 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-28 02:12:55 +00:00
Kang Zhang
d2ca398f08 [PowerPC] Fix the bug of ISD::ADDE to set its second return type to glue
Summary:
This patch is to fix the bug imported by rL341634.
In above submit , the the return type of ISD::ADDE is 
14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), 
but in fact, the second return type of ISD::ADDE should be 
MVT::Glue not MVT::i64.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D55977


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350061 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-25 03:29:51 +00:00
Simon Pilgrim
cf4d42cf34 [PPC] Always use the version of computeKnownBits that returns a value. NFCI.
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349903 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-21 14:32:39 +00:00
Kewen Lin
979b87d6b7 [PowerPC]Exploit P9 vabsdu for unsigned vselect patterns
For type v4i32/v8ii16/v16i8, do following transforms:
  (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b)
  (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b)
  (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b)
  (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b)

Differential Revision: https://reviews.llvm.org/D55812


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349599 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-19 03:04:07 +00:00
Kewen Lin
5ebd7af2ed [PowerPC] Improve vec_abs on P9
Improve the current vec_abs support on P9, generate ISD::ABS node for vector types,
combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns,
do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity.

Differential Revision: https://reviews.llvm.org/D54783



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349437 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-18 03:16:43 +00:00
QingShan Zhang
e28eceb434 [NFC] [PowerPC] add an routine in PPCTargetLowering to determine if a global is accessed as got-indirect or not.
In theory, we should let the PPC target to determine how to lower the TOC Entry for globals. 
And the PPCTargetLowering requires this query to do some optimization for TOC_Entry. 

Differential Revision: https://reviews.llvm.org/D54925


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348108 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-03 03:32:16 +00:00
Nemanja Ivanovic
ecd20daf2c [PowerPC] Do not use vectors to codegen bswap with Altivec turned off
We have efficient codegen on P9 for lowering bswap that involves moving
the value into a vector reg and moving it back. However, the check under
which we custom lowered it did not adequately reflect the actual requirements.
It required only that the subtarget be an implementation of ISA 3.0 since all
compliant implementations have to provide the vector instructions.
However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9
(i.e. don't emit vector code, don't have to save vector regs for context
switch). So we should require the correct features for this lowering.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39334


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347376 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 02:53:50 +00:00
Nemanja Ivanovic
86787453a3 [PowerPC] Don't combine to bswap store on 1-byte truncating store
Turns out that there was no check for a store that truncates down
to a single byte when combining a (store (bswap...)) into a byte-swapping
store. This patch just adds that check.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39478.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347288 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 04:42:31 +00:00
Zi Xuan Wu
382877c17c [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding,
which is legalized at type-legalization phase. Use xxsel to match vselect if vsx is open, or use vsel.


Differential Revision: https://reviews.llvm.org/D49531



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346824 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-14 02:34:45 +00:00
Reid Kleckner
b7d45e1d88 Fix clang -Wimplicit-fallthrough warnings across llvm, NFC
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
   of only 'break'.

We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
   doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
   the outer case.

I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.

Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu

Differential Revision: https://reviews.llvm.org/D53950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345882 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 19:54:45 +00:00
Li Jia He
fb14999433 [PowerPC] Support constraint 'wi' in asm
From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D53265


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345810 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 02:35:17 +00:00
Li Jia He
ad84a8b9be [PowerPC] Fix some missed optimization opportunities in combineSetCC
For both operands are bool, short, int, long, long long, add the following optimization.
1. 0-x == y --> x+y ==0
2. 0-x != y --> x+y != 0

Review: nemanjai
Differential Revision: https://reviews.llvm.org/D53360


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345366 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-26 06:48:53 +00:00