archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Jinsong Ji	e946fcb412	[PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751 llvm-mc aborted when disassembling tabortdc. This patch tries to clean up TM related DAGs. * Fixes the problem by removing the explicit output of cr0 and making it an implicit def. * Update int_ppc_tbegin pattern to accommodate the implicit def of cr0. * Update the TCHECK operand and int_ppc_tcheck accordingly. * Add some builtin test and disassembly tests. * Remove unused CRRC0/crrc0 Differential Revision: https://reviews.llvm.org/D61935 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364544 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-27 14:11:31 +00:00
Nemanja Ivanovic	4044bf9b0d	[PowerPC] Mark FCOPYSIGN legal for FP vectors This was just an omission in the back end. We have had the instructions for both single and double precision for a few HW generations, but never got around to legalizing these. Differential revision: https://reviews.llvm.org/D63634 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364373 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-26 01:48:57 +00:00
Matt Arsenault	a2b05bc24d	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364191 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-24 15:50:29 +00:00
Justin Hibbits	05b9698f31	PowerPC: Optimize SPE double parameter calling setup Summary: SPE passes doubles the same as soft-float, in register pairs as i32 types. This is all handled by the target-independent layer. However, this is not optimal when splitting or reforming the doubles, as it pushes to the stack and loads from, on either side. For instance, to pass a double argument to a function, assuming the double value is in r5, the sequence currently looks like this: evstdd 5, X(1) lwz 3, X(1) lwz 4, X+4(1) Likewise, to form a double into r5 from args in r3 and r4: stw 3, X(1) stw 4, X+4(1) evldd 5, X(1) This optimizes the fence to use SPE instructions. Now, to pass a double to a function: mr 4, 5 evmergehi 3, 5, 5 And to form a double into r5 from args in r3 and r4: evmergelo 5, 3, 4 This is comparable to the way that gcc generates the double splits. This also fixes a bug with expanding builtins to libcalls, where the LowerCallTo() code path was generating intermediate illegal type nodes. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54583 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363526 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 03:15:23 +00:00
Kang Zhang	f8a4e52cc9	[PowerPC] Set the innermost hot loop to align 32 bytes Summary: If the nested loop is an innermost loop, prefer to a 32-byte alignment, so that we can decrease cache misses and branch-prediction misses. Actual alignment of the loop will depend on the hotness check and other logic in alignBlocks. The old code will only align hot loop to 32 bytes when the LoopSize larger than 16 bytes and smaller than 32 bytes, this patch will align the innermost hot loop to 32 bytes not only for the hot loop whose size is 16~32 bytes. Reviewed By: steven.zhang, jsji Differential Revision: https://reviews.llvm.org/D61228 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363495 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-15 15:10:24 +00:00
Simon Pilgrim	9b95d3f22f	[TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363179 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-12 17:14:03 +00:00
Sam Parker	c313a177b4	[CodeGen] Generic Hardware Loop Support Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum number of elements processed in an iteration of the loop body. Returns the updated number of elements remaining. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362774 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-07 07:35:30 +00:00
Nemanja Ivanovic	03ae046337	[PowerPC] Exploit the vector min/max instructions Use the PPC vector min/max instructions for computing the corresponding operation as these should be faster than the compare/select sequences we currently emit. Differential revision: https://reviews.llvm.org/D47332 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362759 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-06 23:49:01 +00:00
Jason Liu	3d1f1eb045	[AIX] Implement function descriptor on SDAG Summary: (1) Function descriptor on AIX On AIX, a called routine may have 2 distinct symbols associated with it: * A function descriptor (Name) * A function entry point (.Name) The descriptor structure on AIX is the same as those in the ELF V1 ABI: * The address of the entry point of the function. * The TOC base address for the function. * The environment pointer. The descriptor symbol uses the same name as the source level function in C. The function entry point is analogous to the symbol we would generate for a function in a non-descriptor-based ABI, except that it is renamed by prepending a ".". Which symbol gets referenced depends on the context: * Taking the address of the function references the descriptor symbol. * Calling the function references the entry point symbol. (2) Speaking of implementation on AIX, for direct function call target, we create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to replace original TargetGlobalAddress SDNode. Then down the path, we can take advantage of this MCSymbol. Patch by: Xiangling_L Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara Differential Revision: https://reviews.llvm.org/D62532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362735 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-06 19:13:36 +00:00
Jason Liu	be6d1bcd6a	[AIX] Implement call lowering with parameters could pass onto GPRs Summary: This patch implements SDAG call lowering on AIX for functions which only have parameters that could fit into GPRs. Reviewers: hubert.reinterpretcast, syzaara Differential Revision: https://reviews.llvm.org/D62823 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362708 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-06 14:36:43 +00:00
Jason Liu	ea8ee651a9	Implement call lowering without parameters on AIX Summary: This patch implements call lowering for calls without parameters on AIX as initial support. Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma Differential Revision: https://reviews.llvm.org/D61948 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361669 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 20:54:35 +00:00
Chen Zheng	0b2b4e1922	[PowerPC] [ISEL] select x-form instruction for unaligned offset Differential Revision: https://reviews.llvm.org/D62173 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361346 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-22 02:57:31 +00:00
Chen Zheng	04c01a17c3	[PowerPC] use more meaningful name - NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361218 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-21 03:54:42 +00:00
Lei Huang	5238a36901	[PowerPC] custom lower `v2f64 fpext v2f32` Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32. eg. For the following IR %0 = load <2 x float>, <2 x float>* %Ptr, align 8 %1 = fpext <2 x float> %0 to <2 x double> ret <2 x double> %1 Pre custom lowering: ld r3, 0(r3) mtvsrd f0, r3 xxswapd vs34, vs0 xscvspdpn f0, vs0 xxsldwi vs1, vs34, vs34, 3 xscvspdpn f1, vs1 xxmrghd vs34, vs0, vs1 After custom lowering: lfd f0, 0(r3) xxmrghw vs0, vs0, vs0 xvcvspdp vs34, vs0 Differential Revision: https://reviews.llvm.org/D57857 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360429 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-10 14:04:06 +00:00
Nemanja Ivanovic	9817b74a74	[PowerPC] Use the two-constant NR algorithm for refining estimates The single-constant algorithm produces infinities on a lot of denormal values. The precision of the two-constant algorithm is actually sufficient across the range of denormals. We will switch to that algorithm for now to avoid the infinities on denormals. In the future, we will re-evaluate the algorithm to find the optimal one for PowerPC. Differential revision: https://reviews.llvm.org/D60037 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360144 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-07 13:48:03 +00:00
Nemanja Ivanovic	8d416103aa	[PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion A condition for exiting the legalization of v4i32 conversion to v2f64 through extract/convert/build erroneously checks for the extract having type i32. This is not adequate as smaller extracts are actually legalized to i32 as well. Furthermore, an early exit is missing which means that we only check that both extracts are from the same vector if that check fails. As a result, both cases in the included test case fail - the first gets a select error and the second generates incorrect code. The culprit commit is r274535. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360043 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-06 13:35:49 +00:00
Simon Pilgrim	a5aaefa640	Avoid cppcheck operator precedence warnings. NFCI. Prefer ((X & Y) ? A : B) to (X & Y ? A : B) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359884 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-03 13:50:38 +00:00
Kang Zhang	de63356ef3	[NFC][PowerPC] Return early if the element type is not byte-sized in combineBVOfConsecutiveLoads Summary: Based on the Eli Friedman's comments in https://reviews.llvm.org/D60811 , we'd better return early if the element type is not byte-sized in `combineBVOfConsecutiveLoads`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61076 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359764 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-02 08:15:13 +00:00
Sjoerd Meijer	67c2691c8d	[TargetLowering] Change getOptimalMemOpType to take a function attribute list The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359537 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-30 08:38:12 +00:00
Roland Froese	076a39af99	[PowerPC] Try harder to avoid load/move-to VSR for partial vector loads Change the PPCISelLowering.cpp function that decides to avoid update form in favor of partial vector loads to know about newer load types and to not be confused by the chain operand. Differential Revision: https://reviews.llvm.org/D60102 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359504 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-29 21:08:35 +00:00
Joerg Sonnenberger	4eb4e8f21e	[PowerPC] Allow using initial-exec TLS with PIC Using initial-exec TLS variables is a reasonable performance optimisation for system libraries. Use the correct PIC mechanism to get hold of the GOT to avoid text relocations. Differential Revision: https://reviews.llvm.org/D61026 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359146 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-24 22:12:22 +00:00
Sean Fertile	4d4155f08e	Add period at end of comment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359144 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-24 21:51:30 +00:00
Kang Zhang	a2c8107c80	[PowerPC] Fix wrong ElemSize when calling isConsecutiveLS() Summary: This issue is from Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177 When the two operands of BUILD_VECTOR are the same, we get an assertion failure. llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&): Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) && "The loads cannot be both consecutive and reverse consecutive."' failed. This error is caused by the wrong ElemSize when calling isConsecutiveLS(). We should use `getScalarType().getStoreSize();` to get the ElemSize instead of `getScalarSizeInBits() / 8`. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60811 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358644 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-18 07:24:15 +00:00
Evandro Menezes	d71ea05ab7	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357731 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-04 22:40:06 +00:00
Kang Zhang	16c4a8836b	[PowerPC] Add the support for __builtin_setrnd() Summary: PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode. double __builtin_setrnd(int mode); The effective values for mode are: 0 - round to nearest 1 - round to zero 2 - round to +infinity 3 - round to -infinity Note that the mode argument will modulo 4, so if the int argument is greater than 3, it will only use the least significant two bits of the mode. Namely, builtin_setrnd(102)) is equal to builtin_setrnd(2). Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D59405 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357241 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 08:45:24 +00:00
Zi Xuan Wu	fc1bf1b19a	[PowerPC] Strength reduction of multiply by a constant by shift and add/sub in place A shift and add/sub sequence combination is faster in place of a multiply by constant. Because the cycle or latency of multiply is not huge, we only consider such following worthy patterns. ``` (mul x, 2^N + 1) => (add (shl x, N), x) (mul x, -(2^N + 1)) => -(add (shl x, N), x) (mul x, 2^N - 1) => (sub (shl x, N), x) (mul x, -(2^N - 1)) => (sub x, (shl x, N)) ``` And the cycles or latency is subtarget-dependent so that we need consider the subtarget to determine to do or not do such transformation. Also data type is considered for different cycles or latency to do multiply. Differential Revision: https://reviews.llvm.org/D58950 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357233 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 03:08:39 +00:00
Simon Pilgrim	04a982d9fb	Fix for ABS legalization on PPC buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356498 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-19 18:55:46 +00:00
Simon Pilgrim	5425f5c42b	[SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356468 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-19 16:24:55 +00:00
Adhemerval Zanella	0ce3660e40	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356389 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-18 18:40:07 +00:00
Roland Froese	0783d48fed	[PowerPC] Avoid scalarization of vector truncate The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead. Differential Revision: https://reviews.llvm.org/D56507 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353724 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-11 17:29:14 +00:00
Reid Kleckner	732eb58985	[PPC] Include tablegenerated PPCGenCallingConv.inc once Move the CC analysis implementation to its own .cpp file instead of duplicating it and artificially using functions in PPCISelLowering.cpp and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and AArch64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352444 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-29 00:30:35 +00:00
Chandler Carruth	6b547686c5	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-19 08:50:56 +00:00
Kang Zhang	55e4ae96b0	[PowerPC] Fix machine verify pass error for PATCHPOINT pseudo instruction that bad machine code Summary: For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo. Then the verifier runs and it seems like we have a use of an undefined register (the register will be reserved later, but the verifier doesn't know that). So this patch call setUsesTOCBasePtr before emit the code for the pseudo, so verifier can know X2 is a reserved register. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D56148 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350165 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-30 15:13:51 +00:00
Nemanja Ivanovic	4125b366e1	[PowerPC] Complete the custom legalization of vector int to fp conversion A recent patch has added custom legalization of vector conversions of v2i16 -> v2f64. This just rounds it out for other types where the input vector has an illegal (narrower) type than the result vector. Specifically, this will handle the following conversions: v2i8 -> v2f64 v4i8 -> v4f32 v4i16 -> v4f32 Differential revision: https://reviews.llvm.org/D54663 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350155 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-29 13:40:48 +00:00
Zi Xuan Wu	99610fde48	[NFC] clang-format functions related to r350113 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350114 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-28 02:45:17 +00:00
Zi Xuan Wu	b534db940c	[PowerPC] Fix assert from machine verify pass that atomic pseudo expanding causes mismatched register class For atomic value operand which less than 4 bytes need to be masked. And the related operation to calculate the newvalue can be done in 32 bit gprc. So just use gprc for mask and value calculation. Differential Revision: https://reviews.llvm.org/D56077 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350113 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-28 02:12:55 +00:00
Kang Zhang	d2ca398f08	[PowerPC] Fix the bug of ISD::ADDE to set its second return type to glue Summary: This patch fixes the bug introduced by rL341634. In that commit, the return type of ISD::ADDE is 14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), but in fact, the second return type of ISD::ADDE should be MVT::Glue not MVT::i64. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D55977 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350061 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-25 03:29:51 +00:00
Simon Pilgrim	cf4d42cf34	[PPC] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349903 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-21 14:32:39 +00:00
Kewen Lin	979b87d6b7	[PowerPC] Exploit P9 vabsdu for unsigned vselect patterns For type v4i32/v8i16/v16i8, do the following transforms: (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b) (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b) Differential Revision: https://reviews.llvm.org/D55812 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349599 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-19 03:04:07 +00:00
Kewen Lin	5ebd7af2ed	[PowerPC] Improve vec_abs on P9 Improve the current vec_abs support on P9, generate ISD::ABS node for vector types, combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns, do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity. Differential Revision: https://reviews.llvm.org/D54783 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349437 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-18 03:16:43 +00:00
QingShan Zhang	e28eceb434	[NFC] [PowerPC] add an routine in PPCTargetLowering to determine if a global is accessed as got-indirect or not. In theory, we should let the PPC target to determine how to lower the TOC Entry for globals. And the PPCTargetLowering requires this query to do some optimization for TOC_Entry. Differential Revision: https://reviews.llvm.org/D54925 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348108 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-03 03:32:16 +00:00
Nemanja Ivanovic	ecd20daf2c	[PowerPC] Do not use vectors to codegen bswap with Altivec turned off We have efficient codegen on P9 for lowering bswap that involves moving the value into a vector reg and moving it back. However, the check under which we custom lowered it did not adequately reflect the actual requirements. It required only that the subtarget be an implementation of ISA 3.0 since all compliant implementations have to provide the vector instructions. However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9 (i.e. don't emit vector code, don't have to save vector regs for context switch). So we should require the correct features for this lowering. Fixes https://bugs.llvm.org/show_bug.cgi?id=39334 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347376 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-21 02:53:50 +00:00
Nemanja Ivanovic	86787453a3	[PowerPC] Don't combine to bswap store on 1-byte truncating store Turns out that there was no check for a store that truncates down to a single byte when combining a (store (bswap...)) into a byte-swapping store. This patch just adds that check. Fixes https://bugs.llvm.org/show_bug.cgi?id=39478. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347288 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-20 04:42:31 +00:00
Zi Xuan Wu	382877c17c	[PowerPC] Enhance the selection(ISD::VSELECT) of vector type To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding, which is legalized at type-legalization phase. Use xxsel to match vselect if vsx is open, or use vsel. Differential Revision: https://reviews.llvm.org/D49531 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346824 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-14 02:34:45 +00:00
Reid Kleckner	b7d45e1d88	Fix clang -Wimplicit-fallthrough warnings across llvm, NFC This patch should not introduce any behavior changes. It consists of mostly one of two changes: 1. Replacing fall through comments with the LLVM_FALLTHROUGH macro 2. Inserting 'break' before falling through into a case block consisting of only 'break'. We were already using this warning with GCC, but its warning behaves slightly differently. In this patch, the following differences are relevant: 1. GCC recognizes comments that say "fall through" as annotations, clang doesn't 2. GCC doesn't warn on "case N: foo(); default: break;", clang does 3. GCC doesn't warn when the case contains a switch, but falls through the outer case. I will enable the warning separately in a follow-up patch so that it can be cleanly reverted if necessary. Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu Differential Revision: https://reviews.llvm.org/D53950 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345882 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-01 19:54:45 +00:00
Li Jia He	fb14999433	[PowerPC] Support constraint 'wi' in asm From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D53265 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345810 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-01 02:35:17 +00:00
Li Jia He	ad84a8b9be	[PowerPC] Fix some missed optimization opportunities in combineSetCC For both operands are bool, short, int, long, long long, add the following optimization. 1. 0-x == y --> x+y ==0 2. 0-x != y --> x+y != 0 Review: nemanjai Differential Revision: https://reviews.llvm.org/D53360 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345366 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-26 06:48:53 +00:00
Nemanja Ivanovic	699414a493	[PowerPC] Keep vector int to fp conversions in vector domain At present a v2i16 -> v2f64 convert is implemented by extracts to scalar, scalar converts, and merge back into a vector. Use vector converts instead, with the int data permuted into the proper position and extended if necessary. Patch by RolandF. Differential revision: https://reviews.llvm.org/D53346 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345361 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-26 03:19:13 +00:00
Stefan Pintilie	c0de197df0	[Power9] Add __float128 support in the backend for bitcast to a i128 Add support to allow bit-casting from f128 to i128 and then extracting 64 bits from the result. Differential Revision: https://reviews.llvm.org/D49507 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345053 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-23 17:11:36 +00:00
QingShan Zhang	64c3a57bec	[PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8 The ISD::SIGN_EXTEND_INREG operation on v2i16 and v2i8 types causes an assert because they are registered as custom operations. As a result, the type legalization phase enters the custom hook, which does not handle the ISD::SIGN_EXTEND_INREG operation and falls through into an unreachable assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52449 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344109 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-10 02:33:48 +00:00
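
The four multiply-by-constant patterns listed in commit fc1bf1b19a can be checked with a small sketch. This is Python, not LLVM code: the names MASK, shl, is_pow2 and strength_reduce are hypothetical, and 64-bit wraparound arithmetic is an assumption made for illustration. The subtarget-dependent cost check that the commit describes is not modelled here.

```python
MASK = (1 << 64) - 1  # assumption: model 64-bit wraparound arithmetic


def shl(x, n):
    """Logical shift left, truncated to 64 bits."""
    return (x << n) & MASK


def is_pow2(v):
    """True if v is a positive power of two."""
    return v > 0 and (v & (v - 1)) == 0


def strength_reduce(x, c):
    """Compute x * c with one shift plus one add/sub when c matches one of
    the four patterns from the commit message; return None otherwise."""
    neg = c < 0
    m = abs(c)
    if m >= 3 and is_pow2(m - 1):
        # c = 2^N + 1 or -(2^N + 1): (add (shl x, N), x), negated if c < 0
        n = (m - 1).bit_length() - 1
        r = shl(x, n) + x
        return (-r if neg else r) & MASK
    if m >= 3 and is_pow2(m + 1):
        # c = 2^N - 1: (sub (shl x, N), x); c = -(2^N - 1): (sub x, (shl x, N))
        n = (m + 1).bit_length() - 1
        r = (x - shl(x, n)) if neg else (shl(x, n) - x)
        return r & MASK
    return None  # constant does not fit any of the four patterns


if __name__ == "__main__":
    for c in (9, -9, 7, -7, 10):
        print(c, strength_reduce(5, c))
```

Constants such as 10 fall outside the four patterns, so the sketch returns None for them, and a plain multiply would be kept in that case.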


1342 Commits
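
Commit 16c4a8836b notes that __builtin_setrnd uses only the least significant two bits of its integer argument. A minimal Python sketch of that mapping follows. The names ROUNDING_MODES, effective_mode and describe_mode are hypothetical, and the code only mirrors the arithmetic described in the commit message, not the builtin's actual implementation.

```python
# Mode table as listed in the commit message for __builtin_setrnd.
ROUNDING_MODES = {
    0: "round to nearest",
    1: "round to zero",
    2: "round to +infinity",
    3: "round to -infinity",
}


def effective_mode(mode):
    """Keep only the least significant two bits of the argument."""
    return mode & 3


def describe_mode(mode):
    """Human-readable name of the rounding mode selected by `mode`."""
    return ROUNDING_MODES[effective_mode(mode)]


if __name__ == "__main__":
    # 102 is 0b1100110, whose low two bits are 0b10, so it selects mode 2.
    print(effective_mode(102), describe_mode(102))
```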