1316 Commits

Author SHA1 Message Date
Simon Pilgrim
04a982d9fb Fix for ABS legalization on PPC buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356498 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 18:55:46 +00:00
Simon Pilgrim
5425f5c42b [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:

    SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.

    Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.

    Add promoting the integer ABS node in the LegalizeIntegerType.

    Expand-based legalization of integer result for the ABS nodes.

    Expand-based legalization of ABS vector operations.

    Add some integer abs testcases for different typesizes for Thumb arch

    Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
        tmp = (SRA, Hi, 31)
        Lo = (UADDO tmp, Lo)
        Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
        Lo = (XOR tmp, Lo)

    The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
        (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).

    Change integer abs testcases for codegen with the ABS node support for AArch64.
        Indicate that the ABS is legal for the i64 type when the NEON is supported.
        Change the integer abs testcases to show changing of codegen.

    Add combine and legalization of ABS nodes for Thumb arch.

    Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.

For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D49837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356468 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 16:24:55 +00:00
Adhemerval Zanella
0ce3660e40 [TargetLowering] Add code size information on isFPImmLegal. NFC
This allows better code size for aarch64 floating point materialization
in a future patch.

Reviewers: evandro

Differential Revision: https://reviews.llvm.org/D58690



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356389 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-18 18:40:07 +00:00
Roland Froese
0783d48fed [PowerPC] Avoid scalarization of vector truncate
The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead.

Differential Revision: https://reviews.llvm.org/D56507


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353724 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-11 17:29:14 +00:00
Reid Kleckner
732eb58985 [PPC] Include tablegenerated PPCGenCallingConv.inc once
Move the CC analysis implementation to its own .cpp file instead of
duplicating it and artificually using functions in PPCISelLowering.cpp
and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and
AArch64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352444 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-29 00:30:35 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Kang Zhang
55e4ae96b0 [PowerPC] Fix machine verify pass error for PATCHPOINT pseudo instruction that bad machine code
Summary:
For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo.
Then the verifier runs and it seems like we have a use of an undefined register (the register will 
be reserved later, but the verifier doesn't know that).

So this patch call setUsesTOCBasePtr before emit the code for the pseudo, so verifier can know 
X2 is a reserved register.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D56148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350165 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-30 15:13:51 +00:00
Nemanja Ivanovic
4125b366e1 [PowerPC] Complete the custom legalization of vector int to fp conversion
A recent patch has added custom legalization of vector conversions of
v2i16 -> v2f64. This just rounds it out for other types where the input vector
has an illegal (narrower) type than the result vector. Specifically, this will
handle the following conversions:

v2i8 -> v2f64
v4i8 -> v4f32
v4i16 -> v4f32

Differential revision: https://reviews.llvm.org/D54663


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350155 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-29 13:40:48 +00:00
Zi Xuan Wu
99610fde48 [NFC] clang-format functions related to r350113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350114 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-28 02:45:17 +00:00
Zi Xuan Wu
b534db940c [PowerPC] Fix assert from machine verify pass that atomic pseudo expanding causes mismatched register class
For atomic value operand which less than 4 bytes need to be masked. 
And the related operation to calculate the newvalue can be done in 32 bit gprc. 
So just use gprc for mask and value calculation.

Differential Revision: https://reviews.llvm.org/D56077



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350113 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-28 02:12:55 +00:00
Kang Zhang
d2ca398f08 [PowerPC] Fix the bug of ISD::ADDE to set its second return type to glue
Summary:
This patch is to fix the bug imported by rL341634.
In above submit , the the return type of ISD::ADDE is 
14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), 
but in fact, the second return type of ISD::ADDE should be 
MVT::Glue not MVT::i64.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D55977


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350061 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-25 03:29:51 +00:00
Simon Pilgrim
cf4d42cf34 [PPC] Always use the version of computeKnownBits that returns a value. NFCI.
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349903 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-21 14:32:39 +00:00
Kewen Lin
979b87d6b7 [PowerPC]Exploit P9 vabsdu for unsigned vselect patterns
For type v4i32/v8ii16/v16i8, do following transforms:
  (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b)
  (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b)
  (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b)
  (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b)

Differential Revision: https://reviews.llvm.org/D55812


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349599 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-19 03:04:07 +00:00
Kewen Lin
5ebd7af2ed [PowerPC] Improve vec_abs on P9
Improve the current vec_abs support on P9, generate ISD::ABS node for vector types,
combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns,
do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity.

Differential Revision: https://reviews.llvm.org/D54783



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349437 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-18 03:16:43 +00:00
QingShan Zhang
e28eceb434 [NFC] [PowerPC] add an routine in PPCTargetLowering to determine if a global is accessed as got-indirect or not.
In theory, we should let the PPC target to determine how to lower the TOC Entry for globals. 
And the PPCTargetLowering requires this query to do some optimization for TOC_Entry. 

Differential Revision: https://reviews.llvm.org/D54925


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348108 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-03 03:32:16 +00:00
Nemanja Ivanovic
ecd20daf2c [PowerPC] Do not use vectors to codegen bswap with Altivec turned off
We have efficient codegen on P9 for lowering bswap that involves moving
the value into a vector reg and moving it back. However, the check under
which we custom lowered it did not adequately reflect the actual requirements.
It required only that the subtarget be an implementation of ISA 3.0 since all
compliant implementations have to provide the vector instructions.
However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9
(i.e. don't emit vector code, don't have to save vector regs for context
switch). So we should require the correct features for this lowering.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39334


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347376 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 02:53:50 +00:00
Nemanja Ivanovic
86787453a3 [PowerPC] Don't combine to bswap store on 1-byte truncating store
Turns out that there was no check for a store that truncates down
to a single byte when combining a (store (bswap...)) into a byte-swapping
store. This patch just adds that check.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39478.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347288 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 04:42:31 +00:00
Zi Xuan Wu
382877c17c [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding,
which is legalized at type-legalization phase. Use xxsel to match vselect if vsx is open, or use vsel.


Differential Revision: https://reviews.llvm.org/D49531



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346824 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-14 02:34:45 +00:00
Reid Kleckner
b7d45e1d88 Fix clang -Wimplicit-fallthrough warnings across llvm, NFC
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
   of only 'break'.

We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
   doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
   the outer case.

I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.

Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu

Differential Revision: https://reviews.llvm.org/D53950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345882 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 19:54:45 +00:00
Li Jia He
fb14999433 [PowerPC] Support constraint 'wi' in asm
From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D53265


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345810 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-01 02:35:17 +00:00
Li Jia He
ad84a8b9be [PowerPC] Fix some missed optimization opportunities in combineSetCC
For both operands are bool, short, int, long, long long, add the following optimization.
1. 0-x == y --> x+y ==0
2. 0-x != y --> x+y != 0

Review: nemanjai
Differential Revision: https://reviews.llvm.org/D53360


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345366 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-26 06:48:53 +00:00
Nemanja Ivanovic
699414a493 [PowerPC] Keep vector int to fp conversions in vector domain
At present a v2i16 -> v2f64 convert is implemented by extracts to scalar,
scalar converts, and merge back into a vector. Use vector converts instead,
with the int data permuted into the proper position and extended if necessary.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D53346


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345361 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-26 03:19:13 +00:00
Stefan Pintilie
c0de197df0 [Power9] Add __float128 support in the backend for bitcast to a i128
Add support to allow bit-casting from f128 to i128 and then
extracting 64 bits from the result.

Differential Revision: https://reviews.llvm.org/D49507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345053 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-23 17:11:36 +00:00
QingShan Zhang
64c3a57bec [PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8
For ISD::SIGN_EXTEND_INREG operation of v2i16 and v2i8 types will cause assert because they are registered as custom operation. 
So that the type legalization phase will enter the custom hook, which do not handle ISD::SIGN_EXTEND_INREG operation and fall throw into unreachable assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52449


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344109 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-10 02:33:48 +00:00
Nemanja Ivanovic
7df2b65a85 [PowerPC] Implement hasBitPreservingFPLogic for types that can be supported
This is the PPC-specific non-controversial part of
https://reviews.llvm.org/D44548 that simply enables this combine for PPC
since PPC has these instructions.
This commit will allow the target-independent portion to be truly target
independent.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344077 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-09 20:35:15 +00:00
QingShan Zhang
5dd1dd3856 [PowerPC] Fix the assert of combineBVOfConsecutiveLoads when element num is 1
Building a vector out of multiple loads can be converted to a load of the vector type if the loads are consecutive.
But the special condition is that the element number is 1, such as <1 x i128>. So just early exit to fix the assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52072


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342611 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-20 03:09:15 +00:00
Strahinja Petrovic
bec58f4635 [PowerPC] Fix label address calculation for ppc64
This patch fixes calculating address of label for non-pic ppc64.

Differential Revision: https://reviews.llvm.org/D50965


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342368 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-17 11:03:40 +00:00
Lion Yang
bc154d861e [PowerPC] Fix the calling convention for i1 arguments on PPC32
Summary:
Integer types smaller than i32 must be extended to i32 by default.
The feature "crbits" introduced at r202451 handles i1 as a special case,
but it did not extend properly.
The caller was, therefore, passing i1 stack arguments by writing 0/1 to
the first byte of the 4-byte stack object and callee was
reading the first byte for the value.

"crbits" is enabled if the optimization level is greater than 1,
which is very common in "release builds".
Such discrepancies with ABI specification also introduces
potential incompatibility with programs or libraries
built with other compilers e.g. GCC.

Fixes PR38661

Reviewers: hfinkel, cuviper

Subscribers: sylvestre.ledru, glaubitz, nagisa, nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D51108


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342288 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-14 21:26:05 +00:00
QingShan Zhang
635af94d89 [PowerPC] Combine ADD to ADDZE
On the ppc64le platform, if ir has the following form,

define i64 @addze1(i64 %x, i64 %z) local_unnamed_addr #0 {
entry:
  %cmp = icmp ne i64 %z, CONSTANT      (-32767 <= CONSTANT <= 32768)
  %conv1 = zext i1 %cmp to i64
  %add = add nsw i64 %conv1, %x
  ret i64 %add
}
we can optimize it to the form below.

                                when C == 0
                            --> addze X, (addic Z, -1))
                           /
add X, (zext(setne Z, C))--
                           \    when -32768 <= -C <= 32767 && C != 0
                            --> addze X, (addic (addi Z, -C), -1)

Patch By: HLJ2009 (Li Jia He)
Differential Revision: https://reviews.llvm.org/D51403
Reviewed By: Nemanjai 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341634 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-07 07:56:05 +00:00
Nemanja Ivanovic
24535f7c19 [PowerPC] Revert commit r339779
This commit has caused failures in some internal benchmarks. Temporarily
reverting this patch until the issue can be diagnosed and fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340740 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-27 13:20:42 +00:00
Nemanja Ivanovic
b4aa55042a [PowerPC] Recommit r340016 after fixing the reported issue
The internal benchmark failure reported by Google was due to a missing
check for the result type for the sign-extend and shift DAG. This commit
adds the check and re-commits the patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340734 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-27 11:20:27 +00:00
Eric Christopher
70801a9fac Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction" due to it causing a compiler crash on valid.
This reverts commit r340016, testcase forthcoming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340315 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-21 18:35:08 +00:00
Stefan Pintilie
81e04e9e9a [PowerPC] Generate lxsd instead of the ld->mtvsrd sequence for vector loads
This patch addresses:

- Implementation within PPCISelLowering.cpp to check if we should use direct
load into vector instructions (such as lxsd/lfd ) when the scalar_to_vector
function is used; which will allow us to catch as many cases of the
scalar_to_vector uses as possible to translate the ld->mtvsrd sequence into
lxsd.

- Test cases to exhibit the behaviour of emitting lxsd/lfd.

Patch by amyk

Differential revision: https://reviews.llvm.org/D49698

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340037 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-17 15:15:26 +00:00
Nemanja Ivanovic
ac879930f8 [PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction
Add a DAG combine for the PowerPC code generator to generate the Power9 extswsli
extend sign and shift immediate instruction.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D49879


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340016 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-17 12:35:44 +00:00
Chandler Carruth
2a752bfdae [MI] Change the array of MachineMemOperand pointers to be
a generically extensible collection of extra info attached to
a `MachineInstr`.

The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.

Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.

I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).

Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.

This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.

The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.

Differential Revision: https://reviews.llvm.org/D50701

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339940 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-16 21:30:05 +00:00
Nemanja Ivanovic
4df6ca02ff [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction,
otherwise it's default behavior is expanding.
Use xxsel to match vselect if vsx is open, or use vsel.

In order to do not write many patterns in td file, promote (for vector it's
bitcast) all other type into v4i32 and only pattern match vselect of v4i32 into
vsel or xxsel.

Patch by wuzish
Differential revision: https://reviews.llvm.org/D49531


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339779 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-15 15:30:36 +00:00
Nemanja Ivanovic
7c82b970fb [PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types
When trying to combine a DAG that builds a vector out of sign-extensions of
vector extracts, the code assumes legal input types. Due to that, we have to
disable this combine prior to legalization.
In some cases, the DAG will look slightly different after legalization so
account for that in the matching code.

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087

Differential Revision: https://reviews.llvm.org/D49080


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339769 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-15 12:58:13 +00:00
Zaara Syeda
15de990f32 [PowerPC] Improve codegen for vector loads using scalar_to_vector
This patch aims to improve the codegen for vector loads involving the
scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used
for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X)
to utilize:

LXSD and LXSDX for i64 and f64
LXSIWAX for i32 (sign extension to i64)
LXSIWZX for i32 and f64

Committing on behalf of Amy Kwan.
Differential Revision: https://reviews.llvm.org/D48950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339260 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-08 15:20:43 +00:00
Nemanja Ivanovic
121bd57dbe [PowerPC] Do not round values prior to converting to integer
Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements
feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector
loses precision. This patch removes the code that adds these nodes
to true f64 operands. It also adds patterns required to ensure
the code is still vectorized rather than converting individual
elements and inserting into a vector.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38342

Differential Revision: https://reviews.llvm.org/D50121


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338658 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-02 00:03:22 +00:00
Craig Topper
4a5776cc93 [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338329 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 23:22:00 +00:00
Craig Topper
f264341a83 [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338303 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 21:04:34 +00:00
Matt Arsenault
f02d879e99 DAG: Add calling convention argument to calling convention funcs
This seems like a pretty glaring omission, and AMDGPU
wants to treat kernels differently from other calling
conventions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338194 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-28 13:25:19 +00:00
Justin Hibbits
e282c3c35e Fix build failures from r337347, found by clang
* Delete a no-longer-used override, and mark the other
getRegisterTypeForCallingConv() as override.
* SPE only supports i32, not i64, as the internal type, so simply remove
the type check, so that DestReg and Opc are provably always set.

GCC 6.4 did not warn about either of the above.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337350 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 05:19:25 +00:00
Justin Hibbits
c486a43e86 Introduce codegen for the Signal Processing Engine
Summary:
The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1,
e500v2, and several e200 cores.  This adds support targeting the e500v2,
as this is more common than the e500v1, and is in SoCs still on the
market.

This patch is very intrusive because the SPE is binary incompatible with
the traditional FPU.  After discussing with others, the cleanest
solution was to make both SPE and FPU features on top of a base PowerPC
subset, so all FPU instructions are now wrapped with HasFPU predicates.

Supported by this are:
* Code generation following the SPE ABI at the LLVM IR level (calling
conventions)
* Single- and Double-precision math at the level supported by the APU.

Still to do:
* Vector operations
* SPE intrinsics

As this changes the Callee-saved register list order, one test, which
tests the precise generated code, was updated to account for the new
register order.

Reviewed by: nemanjai
Differential Revision: https://reviews.llvm.org/D44830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337347 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 04:25:10 +00:00
Justin Hibbits
34ada1b523 Add PowerPC e500(v2) core scheduler and directives.
Differential Revision: https://reviews.llvm.org/D44828

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337345 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 04:24:49 +00:00
Stefan Pintilie
7ee7123814 [Power9] Add __float128 builtins for Rounding Operations
Added __float128 support for a number of rounding operations:

trunc
rint
nearbyint
round
floor
ceil

Differential Revision: https://reviews.llvm.org/D48415

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336601 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 20:38:40 +00:00
Stefan Pintilie
0b08bea28b [Power9] Add __float128 support for compare operations
Added handling for the select f128.

Differential Revision: https://reviews.llvm.org/D48294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336548 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 13:36:14 +00:00
Stefan Pintilie
0b9754cd9a [Power9] Add __float128 library call for frem
Power 9 does not have a hardware instruction for frem but we can call fmodf128.

Differential Revision: https://reviews.llvm.org/D48552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336406 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 02:47:02 +00:00
Lei Huang
37aa5135e5 [Power9] Add lib calls for float128 operations with no equivalent PPC instructions
Map the following instructions to the proper float128 lib calls:
  pow[i], exp[2], log[2|10], sin, cos, fmin, fmax

Differential Revision: https://reviews.llvm.org/D48544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336361 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 15:21:37 +00:00
Lei Huang
ddc48fa71f [Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg
Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.

This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336310 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 06:21:37 +00:00