RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-23 22:06:19 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	07898901df	[DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299201 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 11:24:16 +00:00
Eric Christopher	193628a590	Kill some trailing whitespace to make some new changes a bit easier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298637 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 19:41:10 +00:00
Nirav Dave	11fdc7845a	Make library calls sensitive to regparm module flag (Fixes PR3997). Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298179 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-18 00:44:07 +00:00
Reid Kleckner	1c19be8a98	Remove getArgumentList() in favor of arg_begin(), args(), etc Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time. In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()). Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D31052 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298010 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-16 22:59:15 +00:00
Hiroshi Inoue	a2d20c4bca	Test commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297959 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-16 16:30:06 +00:00
Tim Shen	2aec349a25	Revert "Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size"" After inspection, it's UB in our code base. Someone cast a var-arg function pointer to a non-var-arg one. :/ Re-commit r296771 to continue testing on the patch. Sorry for the trouble! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297256 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 02:41:35 +00:00
Tim Shen	107e5edeba	Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size" This reverts commit r296771. We found some widespread test failures internally. I'm working on a testcase. Politely reverting the patch in the meantime. :) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297124 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 07:40:10 +00:00
Nemanja Ivanovic	521df0adae	[PowerPC] Fix failure with STBRX when store is narrower than the bswap Fixes a crash caused by r296811 by truncating the input of the STBRX node when the bswap is wider than i32. Fixes https://bugs.llvm.org/show_bug.cgi?id=32140 Differential Revision: https://reviews.llvm.org/D30615 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297001 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-06 07:32:13 +00:00
Guozhi Wei	e225b39cd4	[PPC] Fix code generation for bswap(int32) followed by store16 This patch fixes pr32063. The current code in PPCTargetLowering::PerformDAGCombine can transform a bswap followed by a store into a single PPCISD::STBRX instruction, but it doesn't consider the case where the operand size of the bswap is larger than the store size. When that occurs, we need two modifications: 1. For the last operand of PPCISD::STBRX, we should not use DAG.getValueType(N->getOperand(1).getValueType()); instead we should use cast<StoreSDNode>(N)->getMemoryVT(). 2. Before PPCISD::STBRX, we need to shift the original operand of the bswap to the right. Differential Revision: https://reviews.llvm.org/D30362 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296811 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-02 21:07:59 +00:00
Nemanja Ivanovic	6981f9a951	[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size This patch reduces the stack frame size by not allocating the parameter area if it is not required. In the current implementation LowerFormalArguments_64SVR4 already handles the parameter area, but LowerCall_64SVR4 does not (when calculating the stack frame size). What this patch does is make LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4. Committing on behalf of Hiroshi Inoue. Differential Revision: https://reviews.llvm.org/D29881 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296771 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-02 17:38:59 +00:00
Tony Jiang	758e067345	[PowerPC] Expand ISEL instruction into if-then-else sequence. Generally, the ISEL is expanded into an if-then-else sequence; in some cases (such as when the destination register is the same as the true or false value register), it may be expanded into just the if or else sequence. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292154 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-16 20:12:26 +00:00
Tony Jiang	748e859f36	Revert "[PowerPC] Expand ISEL instruction into if-then-else sequence." This reverts commit 1d0e0374438ca6e153844c683826ba9b82486bb1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292131 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-16 15:01:07 +00:00
Tony Jiang	541103a1c6	[PowerPC] Expand ISEL instruction into if-then-else sequence. Generally, the ISEL is expanded into an if-then-else sequence; in some cases (such as when the destination register is the same as the true or false value register), it may be expanded into just the if or else sequence. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292128 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-16 14:43:12 +00:00
Eugene Zelenko	4a9353b234	[PowerPC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291872 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-13 00:58:58 +00:00
Hal Finkel	68c84942ec	[PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts: 1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration. 2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction). First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places. Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections). 
There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return. Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. Unfortunately, because of the way that the ABI works, we need to generate code as if we supported interposition whenever the linker might insert stubs for the purpose of supporting it. Differential Revision: https://reviews.llvm.org/D27231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291003 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-04 21:05:13 +00:00
Chandler Carruth	e87e7067ed	Revert r289638: [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) This patch appears to result in trampolines in vtables being miscompiled when they in turn tail call a method. I've posted some preliminary details about the failure on the thread for this commit and talked to Hal. He was comfortable going ahead and reverting until we sort out what is wrong. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289928 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-16 07:31:20 +00:00
Hal Finkel	e573748031	[PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts: 1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration. 2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction). First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places. Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections). 
There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return. Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. This will yield subtle bugs if interposition is attempted. As a result, regardless of whether we're in PIC mode, we don't assume that we need to add the nop to support the possibility of ELF interposition. However, the necessary check is in place (i.e. calling GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for which interposition is allowed at the IR level, we'll add the nop as necessary. In the mean time, we'll generate more tail calls and fewer nops when compiling position-independent code. Differential Revision: https://reviews.llvm.org/D27231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289638 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-14 07:24:50 +00:00
Guozhi Wei	9def2dd316	[PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSR Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so moving 1 or 2 bytes to a VSR through MTVSRWZ is much faster than storing the extended value to the stack and loading it with LXSIWZX. This patch fixes pr31144. Differential Revision: https://reviews.llvm.org/D27287 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289473 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-12 22:09:02 +00:00
Nemanja Ivanovic	502534bc2a	[PowerPC] Improvements for BUILD_VECTOR Vol. 2 This patch corresponds to review: https://reviews.llvm.org/D26023 This patch adds support for converting a vector of loads into a single load if the loads are consecutive (in either direction). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288219 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-29 23:57:54 +00:00
Nemanja Ivanovic	17f24fa4e1	[PowerPC] Improvements for BUILD_VECTOR Vol. 2 This patch corresponds to review: https://reviews.llvm.org/D25980 This is the 2nd patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector of fp-to-int conversions into an fp-to-int conversion of a build vector of fp values. For example: Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...) Into (fp_to_[su]i (build_vector $A, $B, ...))). Which is a natural match for much cleaner code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288218 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-29 23:36:03 +00:00
Nemanja Ivanovic	0b07a5ae00	Revert https://reviews.llvm.org/rL287679 This commit caused some miscompiles that did not show up on any of the bots. Reverting until we can investigate the cause of those failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288214 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-29 23:00:33 +00:00
Nemanja Ivanovic	1ec5b2fb75	[PowerPC] Improvements for BUILD_VECTOR Vol. 1 This patch corresponds to review: https://reviews.llvm.org/D25912 This is the first patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288152 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-29 16:11:34 +00:00
Nemanja Ivanovic	aa687a6ca9	[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE This patch corresponds to review: https://reviews.llvm.org/D26861 It also fixes PR30730. Committing on behalf of Lei Huang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287679 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-22 19:02:07 +00:00
Ehsan Amiri	072e86da0c	[PPC][DAGCombine] Convert SETCC to subtract when the result is zero extended When we see a SETCC whose only users are zero extend operations, we can replace it with a subtraction. This results in doing all calculations in GPRs and avoids CR use. Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are ways that this can be extended. For example for signed condition codes. In that case we will be introducing additional sign extend instructions, so more careful profitability analysis may be required. Another direction to extend this is for equal, not equal conditions. Also when users of SETCC are any_ext or sign_ext, we might be able to do something similar. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287329 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-18 10:41:44 +00:00
Joerg Sonnenberger	8833323011	Always use relative jump table encodings on PowerPC64. For the default, small and medium code models, use the existing difference from the jump table towards the label. For all other code models, set up the picbase and use the difference between the picbase and the block address. Overall, this results in smaller data tables at the expense of one or two more arithmetic operations at the jump site. Given that we only create jump tables with a lot more than two entries, it is a net win in size. For larger code models the assumption remains that individual functions are no larger than 2GB. Differential Revision: https://reviews.llvm.org/D26336 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287059 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-16 00:37:30 +00:00
Tony Jiang	6ad6c513ee	[PowerPC] Implement BE VSX load/store builtins - llvm portion. This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE, they behave exactly the same as vec_xl and vec_xst, so they are simply implemented by defining a matching macro. On LE, they are implemented by defining new builtins and intrinsics. For int/float/long long/double, it is just a load (lxvw4x/lxvd2x) or store (stxvw4x/stxvd2x). For char/char/short, we also need some extra shuffling before or after calling the builtins to get the desired BE order. For int128, simply call vec_xl or vec_xst. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286967 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-15 14:25:56 +00:00
Evandro Menezes	3f647d62d1	[DAG Combiner] Fix the native computation of the Newton series for reciprocals The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own. Differential revision: https://reviews.llvm.org/D22975 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286523 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-10 23:31:06 +00:00
Ehsan Amiri	300e976507	[PPC] Generate positive FP zero using xor insn instead of loading from constant area https://reviews.llvm.org/D23614 Currently we load +0.0 from constant area. That can change to be generated using XOR instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284995 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-24 17:31:09 +00:00
Sanjay Patel	928f047b68	[Target] remove TargetRecip class; 2nd try This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs caused by faulty usage of StringRef. This version also renames a pair of functions: getRecipEstimateDivEnabled() getRecipEstimateSqrtEnabled() as suggested by Eric Christopher. original commit msg: [Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284746 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-20 16:55:45 +00:00
Sanjay Patel	bbcb21daf0	revert r284495: [Target] remove TargetRecip class There's something wrong with the StringRef usage while parsing the attribute string. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284513 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-18 18:36:49 +00:00
Sanjay Patel	5800d6e9a7	[Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284495 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-18 17:05:05 +00:00
Sanjay Patel	b60ab5d110	[Target] move reciprocal estimate settings from TargetOptions to TargetLowering The motivation for the change is that we can't have pseudo-global settings for codegen living in TargetOptions because that doesn't work with LTO. Ideally, these reciprocal attributes will be moved to the instruction-level via FMF, metadata, or something else. But making them function attributes is at least an improvement over the current state. The ingredients of this patch are: Remove the reciprocal estimate command-line debug option. Add TargetRecip to TargetLowering. Remove TargetRecip from TargetOptions. Clean up the TargetRecip implementation to work with this new scheme. Set the default reciprocal settings in TargetLoweringBase (everything is off). Update the PowerPC defaults, users, and tests. Update the x86 defaults, users, and tests. Note that if this patch needs to be reverted, the related clang patch checked in at r283251 should be reverted too. Differential Revision: https://reviews.llvm.org/D24816 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283252 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-04 20:46:43 +00:00
Nemanja Ivanovic	d0e875cdad	[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions This patch corresponds to review: https://reviews.llvm.org/D23155 This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes: Int to Fp conversions of 1 or 2-byte values loaded from memory Building vectors of 1 or 2-byte integers with values loaded from memory Storing individual 1 or 2-byte elements from integer vectors This patch implements all of those uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283190 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-04 06:59:23 +00:00
Hal Finkel	4c305bebf0	[PowerPC] Refactor soft-float support, and enable PPC64 soft float This change enables soft-float for PowerPC64, and also makes soft-float disable all vector instruction sets for both 32-bit and 64-bit modes. This latter part is necessary because the PPC backend canonicalizes many Altivec vector types to floating-point types, and so soft-float breaks scalarization support for many operations. Both for embedded targets and for operating-system kernels desiring soft-float support, it seems reasonable that disabling hardware floating-point also disables vector instructions (embedded targets without hardware floating point support are unlikely to have Altivec, etc. and operating system kernels desiring not to use floating-point registers to lower syscall cost are unlikely to want to use vector registers either). If someone needs this to work, we'll need to change the fact that we promote many Altivec operations to act on v4f32. To make it possible to disable Altivec when soft-float is enabled, hardware floating-point support needs to be expressed as a positive feature, like the others, and not a negative feature, because target features cannot have dependencies on the disabling of some other feature. So +soft-float has now become -hard-float. Fixes PR26970. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283060 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-02 02:10:20 +00:00
Nemanja Ivanovic	7a5ffa3882	[Power9] Builtins for ELF v.2 API conformance - back end portion This patch corresponds to review: https://reviews.llvm.org/D24396 This patch adds support for the "vector count trailing zeroes", "vector compare not equal" and "vector compare not equal or zero instructions" as well as "scalar count trailing zeroes" instructions. It also changes the vector negation to use XXLNOR (when VSX is enabled) so as not to increase register pressure (previously this was done with a splat immediate of all ones followed by an XXLXOR). This was done because the altivec.h builtins (patch to follow) use vector negation and the use of an additional register for the splat immediate is not optimal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282478 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-27 08:42:12 +00:00
Nemanja Ivanovic	a04f9019ef	[Power9] Exploit move and splat instructions for build_vector improvement This patch corresponds to review: https://reviews.llvm.org/D21135 This patch exploits the following instructions: mtvsrws lxvwsx mtvsrdd mfvsrld In order to improve some build_vector and extractelement patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282246 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-23 13:25:31 +00:00
Nemanja Ivanovic	f2f9e2bcc5	[PowerPC] Sign extend sub-word values for atomic comparisons Atomic comparison instructions use the sub-word load instruction on Power8 and up but the value is not sign extended prior to the signed word compare instruction. This patch adds that sign extension. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282182 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-22 19:06:38 +00:00
Nemanja Ivanovic	a941fe247e	[Power9] Add exploitation of non-permuting memory ops This patch corresponds to review: https://reviews.llvm.org/D19825 The new lxvx/stxvx instructions do not require the swaps to line the elements up correctly. In order to select them over the lxvd2x/lxvw4x instructions which require swaps, the patterns for the old instruction have a predicate that ensures they won't be selected on Power9 and newer CPUs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282143 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-22 09:52:19 +00:00
Sanjay Patel	a7c48ccd3f	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281493 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 16:05:51 +00:00
Sanjay Patel	04e0167eac	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281490 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:43:44 +00:00
Sanjay Patel	f4559b5e2c	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281489 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:21:00 +00:00
Nemanja Ivanovic	7328eb7558	Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189) This patch corresponds to review: https://reviews.llvm.org/D24021 In the initial implementation of this instruction, I forgot to account for variable indices. This patch fixes PR30189 and should probably be merged into 3.9.1 (I'll open a bug according to the new instructions). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281479 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 14:19:09 +00:00
Justin Lebar	c71d5b41ef	[CodeGen] Split out the notions of MI invariance and MI dereferenceability. Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281151 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-11 01:38:58 +00:00
Hal Finkel	77579f5cd8	Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- we can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but it is better to directly lower the CFA construct itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces an ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and PowerPC is updated to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280350 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-01 10:28:47 +00:00
Hal Finkel	557221a0e1	[PowerPC] Add support for -mlongcall The "long call" option forces the use of the indirect calling sequence for all calls (even those that don't really need it). GCC provides this option; This is helpful, under certain circumstances, for building very-large binaries, and some other specialized use cases. Fixes PR19098. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280040 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-30 00:59:23 +00:00
Hal Finkel	e060ffb4b2	[PowerPC] Fix i8/i16 atomics for little-Endian targets without partword atomics For little-Endian PowerPC, we generally target only P8 and later by default. However, generic (older) 64-bit configurations are still an option, and in that case, partword atomics are not available (e.g. stbcx.). To lower i8/i16 atomics without true i8/i16 atomic operations, we emulate using i32 atomics in combination with a bunch of shifting and masking, etc. The amount by which to shift in little-Endian mode is different from the amount in big-Endian mode (it is inverted -- meaning we can leave off the xor when computing the amount). Fixes PR22923. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280022 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-29 22:25:36 +00:00
Hal Finkel	afa0d1049b	[PowerPC] Implement lowering for atomicrmw min/max/umin/umax Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279933 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-28 16:17:58 +00:00
Justin Bogner	6673ea81f6	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278902 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-17 05:10:15 +00:00
Chuang-Yu Cheng	7177ff558c	[ppc64] Don't apply sibling call optimization if callee has any byval arg This is a quick workaround, because in some cases, e.g. when the caller's stack size > the callee's stack size, we would still be able to apply sibling call optimization even if the callee has a byval arg. This patch fixes: https://llvm.org/bugs/show_bug.cgi?id=28328 Reviewers: hfinkel kbarton nemanjai amehsan Subscribers: hans, tjablin https://reviews.llvm.org/D23441 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278900 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-17 03:17:44 +00:00
Pierre Gousseau	349838b560	[x86] Refactor a PowerPC specific ctlz/srl transformation (NFC). Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets. Differential Revision: https://reviews.llvm.org/D23445 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278799 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-16 13:53:53 +00:00
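
The setcc -> srl(ctlz) transformation named in the last entry above (349838b560) can be illustrated outside the compiler. This is a minimal sketch, assuming the rewritten pattern is the 32-bit equality test against zero: the leading-zero count of a 32-bit value is 32 only when the value is 0, so shifting the count right by 5 gives the same 0/1 result as the zero-extended comparison. All function names here are hypothetical and are not LLVM APIs; the commit itself operates on SelectionDAG nodes, not on C++ integers.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros for a 32-bit value.
// Returns 32 for x == 0, which is the case the shift trick relies on.
unsigned ctlz32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Reference form: the zero-extended result of the comparison (x == 0).
std::uint32_t isZeroSetcc(std::uint32_t x) { return x == 0 ? 1u : 0u; }

// Rewritten form: srl(ctlz(x), log2(32)). The count is in [0, 32], so
// bit 5 is set only when the count is exactly 32, i.e. when x == 0.
std::uint32_t isZeroSrlCtlz(std::uint32_t x) { return ctlz32(x) >> 5; }

// Hypothetical helper: asserts that both forms agree on a few samples.
void checkFormsAgree() {
  const std::uint32_t samples[] = {0u, 1u, 2u, 0x8000u, 0x80000000u,
                                   0xFFFFFFFFu};
  for (std::uint32_t v : samples)
    assert(isZeroSetcc(v) == isZeroSrlCtlz(v));
}
```

The rewritten form avoids materializing a condition and then converting it to an integer, which is presumably why the commit makes it available to targets other than PowerPC.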

1171 Commits