archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Tom Stellard	33c352b3ed	Merging r247435: ------------------------------------------------------------------------ r247435 \| david.majnemer \| 2015-09-11 13:34:34 -0400 (Fri, 11 Sep 2015) \| 8 lines [X86] Make sure startproc/endproc are paired We used different conditions to determine if we should emit startproc vs endproc. Use the same condition to ensure that they will always be paired. This fixes PR24374. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@253742 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-21 01:08:13 +00:00
Tom Stellard	50dd59a89a	Merging r244728: ------------------------------------------------------------------------ r244728 \| Matthew.Arsenault \| 2015-08-12 05:04:44 -0400 (Wed, 12 Aug 2015) \| 2 lines AMDGPU: Fix assert on dbg_value instructions ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@253234 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-16 17:23:30 +00:00
Tom Stellard	ad24102800	Merging r243731: ------------------------------------------------------------------------ r243731 \| Matthew.Arsenault \| 2015-07-31 00:12:04 -0400 (Fri, 31 Jul 2015) \| 2 lines AMDGPU: Fix v16i32 to v16i8 truncstore ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@253231 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-16 17:23:25 +00:00
Tom Stellard	074105b0db	Merging r243462: ------------------------------------------------------------------------ r243462 \| Matthew.Arsenault \| 2015-07-28 14:47:00 -0400 (Tue, 28 Jul 2015) \| 5 lines AMDGPU: Don't try to use LDS/vector for private if pointer value stored If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@253228 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-16 17:23:20 +00:00
Tom Stellard	6a6b791e80	Merging r243461: ------------------------------------------------------------------------ r243461 \| Matthew.Arsenault \| 2015-07-28 14:29:14 -0400 (Tue, 28 Jul 2015) \| 5 lines AMDGPU: Fix crash if called function is a bitcast getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@253227 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-16 17:23:17 +00:00
Tom Stellard	aac32f3441	Merging r251582: ------------------------------------------------------------------------ r251582 \| hfinkel \| 2015-10-28 19:43:00 -0400 (Wed, 28 Oct 2015) \| 11 lines [PowerPC] Recurse through constants when looking for TLS globals We cannot form ctr-based loops around function calls, including calls to __tls_get_addr used for PIC TLS variables. References to such TLS variables, however, might be buried within constant expressions, and so we need to search the entire constant expression to be sure that no references to such TLS variables exist. Fixes PR25256, reported by Eric Schweitz. This is a slightly-modified version of the patch suggested by Eric in the bug report, and a test case I created. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@253120 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-14 02:26:39 +00:00
Tom Stellard	e092157a81	Merging r245862: ------------------------------------------------------------------------ r245862 \| wschmidt \| 2015-08-24 15:27:27 -0400 (Mon, 24 Aug 2015) \| 8 lines [PPC64LE] Fix PR24546 - Swap optimization and debug values This patch fixes PR24546, which demonstrates a segfault during the VSX swap removal pass. The problem is that debug value instructions were not excluded from the list of instructions to be analyzed for webs of related computation. I've added the test case from the PR as a crash test in test/CodeGen/PowerPC. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252850 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-12 03:05:51 +00:00
Tom Stellard	aab02dd674	Merging r247083: ------------------------------------------------------------------------ r247083 \| echristo \| 2015-09-08 18:14:58 -0400 (Tue, 08 Sep 2015) \| 3 lines Fix the PPC CTR Loop pass to look for calls to the intrinsics that read CTR and count them as reading the CTR. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252511 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 21:01:15 +00:00
Tom Stellard	703d9535b6	Merging r250085: ------------------------------------------------------------------------ r250085 \| Andrea_DiBiagio \| 2015-10-12 15:22:30 -0400 (Mon, 12 Oct 2015) \| 60 lines [x86] Fix wrong lowering of vsetcc nodes (PR25080). Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong assumption that for non-AVX512 targets, the source type and destination type of a type-legalized setcc node were always the same type. This assumption was unfortunately incorrect; the type legalizer is not always able to promote the return type of a setcc to the same type as the first operand of a setcc. In the case of a vsetcc node, the legalizer firstly checks if the first input operand has a legal type. If so, then it promotes the return type of the vsetcc to that same type. Otherwise, the return type is promoted to the 'next legal type', which, for vectors of MVT::i1 is always a 128-bit integer vector type. Example (-mattr=+avx): %0 = trunc <8 x i32> %a to <8 x i23> %1 = icmp eq <8 x i23> %0, zeroinitializer The initial selection dag for the code above is: v8i1 = setcc t5, t7, seteq:ch t5: v8i23 = truncate t2 t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1 t7: v8i32 = build_vector of all zeroes. The type legalizer would firstly check if 't5' has a legal type. If so, then it would reuse that same type to promote the return type of the setcc node. Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to promote the return type of the setcc node. Consequently, the setcc return type is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to the following dag node: v8i16 = setcc t32, t25, seteq:ch where t32 and t25 are now values of type v8i32. Before this patch, function LowerVSETCC would have wrongly expanded the setcc to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an instruction. 
In our case, ISel would have matched a VPCMPEQWrr: t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25 However, t36 and t25 are both VR256, while the result type is instead of class VR128. This inconsistency ended up causing the insertion of COPY instructions like this: %vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3 Which is an invalid full copy (not a sub register copy). Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy instruction" in the attempt to expand the malformed pseudo COPY instructions. This patch fixes the problem adding the missing logic in LowerVSETCC to handle the corner case of a setcc with 128-bit return type and 256-bit operand type. This problem was originally reported by Dimitry as PR25080. It has been latent for a very long time. I have added the minimal reproducible from that bugzilla as test setcc-lowering.ll. Differential Revision: http://reviews.llvm.org/D13660 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252484 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:25:17 +00:00
Tom Stellard	63abb571f8	Merging r250324: ------------------------------------------------------------------------ r250324 \| wschmidt \| 2015-10-14 16:45:00 -0400 (Wed, 14 Oct 2015) \| 10 lines [PowerPC] Fix invalid lxvdsx optimization (PR25157) PR25157 identifies a bug where a load plus a vector shuffle is incorrectly converted into an LXVDSX instruction. That optimization is only valid if the load is of a doubleword, and in the noted case, it was not. This corrects that problem. Joint patch with Eric Schweitz, who provided the bugpoint-reduced test case. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252483 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:25:14 +00:00
Tom Stellard	f7e48cb79e	Merging r246937: ------------------------------------------------------------------------ r246937 \| hfinkel \| 2015-09-06 00:17:30 -0400 (Sun, 06 Sep 2015) \| 13 lines [PowerPC] Don't commute trivial rlwimi instructions To commute a trivial rlwimi instruction (meaning one with a full mask and zero shift), we'd need the ability to form an all-zero mask (instead of an all-one mask) using rlwimi. We can't represent this, however, and we'll miscompile code if we try. The code quality problem that this highlights (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS) will be fixed as a follow-up. Fixes PR24719. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252481 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:25:11 +00:00
Tom Stellard	f7ff467559	Merging r246900: ------------------------------------------------------------------------ r246900 \| hfinkel \| 2015-09-04 20:02:59 -0400 (Fri, 04 Sep 2015) \| 14 lines [PowerPC] Fix and(or(x, c1), c2) -> rlwimi generation PPCISelDAGToDAG has a transformation that generates a rlwimi instruction from an input pattern that looks like this: and(or(x, c1), c2) but the associated logic does not work if there are bits that are 1 in c1 but 0 in c2 (these are normally canonicalized away, but that can't happen if the 'or' has other users). Make sure we abort the transformation if such bits are discovered. Fixes PR24704. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252480 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:25:09 +00:00
Tom Stellard	1cd4501afe	Merging r246675: ------------------------------------------------------------------------ r246675 \| hfinkel \| 2015-09-02 12:52:37 -0400 (Wed, 02 Sep 2015) \| 9 lines [PowerPC] Don't always consider P8Altivec-only masks in LowerVECTOR_SHUFFLE LowerVECTOR_SHUFFLE needs to decide whether to pass a vector shuffle off to the TableGen-generated matching code, and it does this by testing the same predicates used by the TableGen files. Unfortunately, when we added new P8Altivec-only predicates, we started universally testing them in LowerVECTOR_SHUFFLE, and if they then matched when targeting a system prior to a P8, we'd end up with a selection failure. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252479 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:25:06 +00:00
Tom Stellard	6d2e5165a4	Merging r246400: ------------------------------------------------------------------------ r246400 \| hfinkel \| 2015-08-30 18:12:50 -0400 (Sun, 30 Aug 2015) \| 20 lines [PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands There were really two problems here. The first was that we had the truth tables for signed i1 comparisons backward. I imagine these are not very common, but if you have: setcc i1 x, y, LT this has the '0 1' and the '1 0' results flipped compared to: setcc i1 x, y, ULT because, in the signed case, '1 0' is really '-1 0', and the answer is not the same as in the unsigned case. The second problem was that we did not have patterns (at all) for the unsigned comparisons select_cc nodes for i1 comparison operands. This was the specific cause of PR24552. These had to be added (and a missing Altivec promotion added as well) to make sure these function for all types. I've added a bunch more test cases for these patterns, and there are a few FIXMEs in the test case regarding code-quality. Fixes PR24552. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252478 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:25:03 +00:00
Tom Stellard	4ab6754d0a	Merging r245907: ------------------------------------------------------------------------ r245907 \| hfinkel \| 2015-08-24 19:48:28 -0400 (Mon, 24 Aug 2015) \| 6 lines [PowerPC] PPCVSXFMAMutate should ignore trivial-copy addends We might end up with a trivial copy as the addend, and if so, we should ignore the corresponding FMA instruction. The trivial copy can be coalesced away later, so there's nothing to do here. We should not, however, assert. Fixes PR24544. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252476 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 16:24:59 +00:00
Daniel Sanders	40c245c775	Merging r251622: ------------------------------------------------------------------------ r251622 \| vkalintiris \| 2015-10-29 10:17:16 +0000 (Thu, 29 Oct 2015) \| 17 lines [mips] Check the register class before replacing materializations of zero with $zero in microMIPS. Summary: The microMIPS register class GPRMM16 does not contain the $zero register. However, MipsSEDAGToDAGISel::replaceUsesWithZeroReg() would replace uses of the $dst register: [d]addiu, $dst, $zero, 0 with the $zero register, without checking for membership in the register class of the target machine operand. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13984 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252158 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-05 13:30:33 +00:00
Tom Stellard	9364ca1727	Merging r245741: ------------------------------------------------------------------------ r245741 \| hfinkel \| 2015-08-21 17:34:24 -0400 (Fri, 21 Aug 2015) \| 8 lines [PowerPC] PPCVSXFMAMutate should not segfault on undef input registers When PPCVSXFMAMutate would look at the input addend register, it would get its input value number. This would fail, however, if the register was undef, causing a segfault. Don't segfault (just skip such FMA instructions). Fixes the test case from PR24542 (although that may have been over-reduced). ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252132 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-05 02:05:38 +00:00
Alexei Starovoitov	0776facd2e	Merging r249718: ------------------------------------------------------------------------ r249718 \| ast \| 2015-10-08 11:52:40 -0700 (Thu, 08 Oct 2015) \| 16 lines [bpf] Do not expand UNDEF SDNode during insn selection lowering o Before this patch, BPF backend will expand UNDEF node to i64 constant 0. o For second pass of dag combiner, legalizer will run through each to-be-processed dag node. o If any new SDNode is generated and has an undef operand, dag combiner will put undef node, newly-generated constant-0 node, and any node which uses these nodes in the working list. o During this process, it is possible undef operand is generated again, and this will form an infinite loop for dag combiner pass2. o This patch allows UNDEF to be a legal type. Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@251177 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-24 02:02:55 +00:00
Alexei Starovoitov	71b1a84518	Merging r249371: ------------------------------------------------------------------------ r249371 \| ast \| 2015-10-05 21:00:53 -0700 (Mon, 05 Oct 2015) \| 25 lines [bpf] Avoid extra pointer arithmetic for stack access For the program like below struct key_t { int pid; char name[16]; }; extern void test1(char *); int test() { struct key_t key = {}; test1(key.name); return 0; } For key.name, the llc/bpf may generate the below code: R1 = R10 // R10 is the frame pointer R1 += -24 // framepointer adjustment R1 \|= 4 // R1 is then used as the first parameter of test1 OR operation is not recognized by in-kernel verifier. This patch introduces an intermediate FI_ri instruction and generates the following code that can be properly verified: R1 = R10 R1 += -20 Patch by Yonghong Song <yhs@plumgrid.com> ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@251175 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-24 01:58:14 +00:00
Daniel Sanders	17611b180b	Merging r247128: ------------------------------------------------------------------------ r247128 \| dsanders \| 2015-09-09 10:53:20 +0100 (Wed, 09 Sep 2015) \| 31 lines Fix vector splitting for extract_vector_elt and vector elements of <8-bits. Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressable. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@247539 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-14 10:40:55 +00:00
Daniel Sanders	9b5d11e278	Merging r246990: ------------------------------------------------------------------------ r246990 \| dsanders \| 2015-09-08 10:07:03 +0100 (Tue, 08 Sep 2015) \| 9 lines [mips] Reserve address spaces 1-255 for software use. Summary: And define them to have noop casts with address spaces 0-255. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12678 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@247538 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-14 10:12:30 +00:00
Hans Wennborg	d2ccc92b44	Merging r245535: ------------------------------------------------------------------------ r245535 \| hfinkel \| 2015-08-19 20:02:02 -0700 (Wed, 19 Aug 2015) \| 6 lines [PowerPC] Fix value type on XVCMPEQDP for v2f64 comparisons XVCMPEQDP is used for VSX v2f64 equality comparisons, but the value type needs to be v2i64 (as that's the corresponding SETCC type). Fixes PR24225. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@245574 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-20 16:05:36 +00:00
Hans Wennborg	c0c1909172	Merging r245530: ------------------------------------------------------------------------ r245530 \| hfinkel \| 2015-08-19 18:18:20 -0700 (Wed, 19 Aug 2015) \| 5 lines [PowerPC] Fix the int2fp(fp2int(x)) DAGCombine to ignore ppc_fp128 This DAGCombine was creating custom SDAG nodes with an illegal ppc_fp128 operand type because it was triggering on f64/f32 int2fp(fp2int(ppc_fp128 x)), but shouldn't (it should only apply to f32/f64 types). The result was a crash. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@245573 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-20 16:03:44 +00:00
Renato Golin	eb7e3dfe33	Reapply "[SimplifyCFG] Be more aggressive" on branch_37 I have underestimated the importance of this patch, and James has got a fix for it in the making. Sorry for the noise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@245570 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-20 15:49:34 +00:00
Renato Golin	e7e7c38dce	Revert "[SimplifyCFG] Be more aggressive" on branch_37 This reverts commit r229099 in branch 37 only, because it caused PR24292. I'll continue investigating and will fix on trunk, but being an optimization change, we can let the rest of the release go without this one. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@245568 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-20 15:05:48 +00:00
Hans Wennborg	f07217161a	Merging r244889: ------------------------------------------------------------------------ r244889 \| uweigand \| 2015-08-13 06:37:06 -0700 (Thu, 13 Aug 2015) \| 22 lines [SystemZ] Support large LLVM IR struct return values Recent mesa/llvmpipe crashes on SystemZ due to a failed assertion when attempting to compile a routine with a return type of { <4 x float>, <4 x float>, <4 x float>, <4 x float> } on a system without vector instruction support. This is because after legalizing the vector type, we get a return value consisting of 16 floats, which cannot all be returned in registers. Usually, what should happen in this case is that the target's CanLowerReturn routine rejects the return type, in which case SelectionDAG falls back to implementing a structure return in memory via implicit reference. However, the SystemZ target never actually implemented any CanLowerReturn routine, and thus would accept any struct return type. This patch fixes the crash by implementing CanLowerReturn. As a side effect, this also handles fp128 return values, fixing a todo that was noted in SystemZCallingConv.td. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@244909 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 16:38:56 +00:00
Hans Wennborg	55a6a4c555	Merging r243984: ------------------------------------------------------------------------ r243984 \| vkalintiris \| 2015-08-04 07:26:35 -0700 (Tue, 04 Aug 2015) \| 11 lines Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions. It introduced two regressions on 64-bit big-endian targets running under N32 (MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets comparisons such as BEQ compare the whole GPR64 but incorrectly tell the instruction selector that they operate on GPR32's. This leads to the elimination of i32->i64 extensions that are actually required by comparisons to work correctly. There's currently a patch under review that fixes this problem. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@244096 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-05 18:46:46 +00:00
Hans Wennborg	ce540c1084	Merging r243057: ------------------------------------------------------------------------ r243057 \| spatel \| 2015-07-23 15:56:53 -0700 (Thu, 23 Jul 2015) \| 16 lines fix crash in machine trace metrics due to processing dbg_value instructions (PR24199) The test in PR24199 ( https://llvm.org/bugs/show_bug.cgi?id=24199 ) crashes because machine trace metrics was not ignoring dbg_value instructions when calculating data dependencies. The machine-combiner pass asks machine trace metrics to calculate an instruction trace, does some reassociations, and calls MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() along with MachineTraceMetrics::invalidate(). The dbg_value instructions have their operands invalidated, but the instructions are not expected to be deleted. On a subsequent loop iteration of the machine-combiner pass, machine trace metrics would be called again and die while accessing the invalid debug instructions. Differential Revision: http://reviews.llvm.org/D11423 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243662 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 17:17:47 +00:00
Hans Wennborg	d0702afadf	Merging r243638 and r243640: ------------------------------------------------------------------------ r243638 \| vkalintiris \| 2015-07-30 05:39:33 -0700 (Thu, 30 Jul 2015) \| 12 lines [mips][FastISel] Remove hidden mips-fast-isel option. Summary: This hidden option would disable code generation through FastISel by default. It was removed from the available options and from the Fast-ISel tests that required it in order to run the tests. Reviewers: dsanders Subscribers: qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D11610 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r243640 \| vkalintiris \| 2015-07-30 06:13:09 -0700 (Thu, 30 Jul 2015) \| 5 lines [mips] Fix out-of-date debug information in test file. Update the debug info in the check-lines because the change in r243638 introduced a constant initialization before the prologue's end as part of a register spill. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243650 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:18:53 +00:00
Hans Wennborg	6055b680c7	Merging r243636: ------------------------------------------------------------------------ r243636 \| vkalintiris \| 2015-07-30 04:51:44 -0700 (Thu, 30 Jul 2015) \| 34 lines [mips][FastISel] Apply only zero-extension to constants prior to their materialization. Summary: Previously, we would sign-extend non-boolean negative constants and zero-extend otherwise. This was problematic for PHI instructions with negative values that had a type with bitwidth less than that of the register used for materialization. More specifically, ComputePHILiveOutRegInfo() assumes the constants present in a PHI node are zero extended in their container and afterwards deduces the known bits. For example, previously we would materialize an i16 -4 with the following instruction: addiu $r, $zero, -4 The register would end up with the 32-bit 2's complement representation of -4. However, ComputePHILiveOutRegInfo() would generate a constant with the upper 16-bits set to zero. The SelectionDAG builder would use that information to generate an AssertZero node that would remove any subsequent trunc & zero_extend nodes. In theory, we should modify ComputePHILiveOutRegInfo() to consult target-specific hooks about the way they prefer to materialize the given constants. However, git-blame reports that this specific code has not been touched since 2011 and it seems to be working well for every target so far. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11592 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243648 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:16:42 +00:00
Hans Wennborg	dc99a4f513	Merging r243485: ------------------------------------------------------------------------ r243485 \| vkalintiris \| 2015-07-28 14:43:31 -0700 (Tue, 28 Jul 2015) \| 12 lines [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls. Summary: Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convention for lowering calls marked as using the fast calling convention. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11515 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243647 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:15:22 +00:00
Hans Wennborg	53a63a9874	Merging r243469: ------------------------------------------------------------------------ r243469 \| vkalintiris \| 2015-07-28 12:57:25 -0700 (Tue, 28 Jul 2015) \| 12 lines [mips][FastISel] Fix generated code for IR's select instruction. Summary: Generate correct code for the select instruction by zero-extending its boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11506 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243646 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:14:05 +00:00
Hans Wennborg	61c6f9e581	Merging r243519: ------------------------------------------------------------------------ r243519 \| wschmidt \| 2015-07-29 07:31:57 -0700 (Wed, 29 Jul 2015) \| 14 lines [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243528 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-29 15:58:34 +00:00
Hans Wennborg	1008467523	Merging r243500: (conflicts resolved manually since the branch doesn't have r243293) ------------------------------------------------------------------------ r243500 \| spatel \| 2015-07-28 16:28:22 -0700 (Tue, 28 Jul 2015) \| 16 lines ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141) PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list. After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list. The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor. Differential Revision: http://reviews.llvm.org/D11345 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243524 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-29 15:38:37 +00:00
Hans Wennborg	d6482ace16	Merging r243361: ------------------------------------------------------------------------ r243361 \| spatel \| 2015-07-27 17:48:32 -0700 (Mon, 27 Jul 2015) \| 17 lines fix invalid load folding with SSE/AVX FP logical instructions (PR22371) This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ). I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine: https://bugs.winehq.org/show_bug.cgi?id=38826 https://llvm.org/bugs/show_bug.cgi?id=22371#c25 The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use vector FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally. Differential Revision: http://reviews.llvm.org/D11477 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243435 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-28 16:20:00 +00:00
Hans Wennborg	f09b9c933a	Merging r243294: ------------------------------------------------------------------------ r243294 \| mareko \| 2015-07-27 11:16:08 -0700 (Mon, 27 Jul 2015) \| 9 lines AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler only SGPRs. this should be applicable for llvm 3.7 as well. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243317 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-27 20:19:04 +00:00
Hans Wennborg	f4c16a9237	Merging r243263: ------------------------------------------------------------------------ r243263 \| mareko \| 2015-07-27 04:37:42 -0700 (Mon, 27 Jul 2015) \| 3 lines AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243316 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-27 20:17:19 +00:00
Hans Wennborg	ab8b2f1f64	Merge r242733, r242734, r242735 and r242742 (r242742 is the interesting patch here, but I picked the others too to get a clean merge since there's been some back-and-forth on this file.) ------------------------------------------------------------------------ r242733 \| matze \| 2015-07-20 16:17:14 -0700 (Mon, 20 Jul 2015) \| 3 lines Revert "ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920" This reverts commit r241951. It caused http://llvm.org/PR24190 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r242734 \| matze \| 2015-07-20 16:17:16 -0700 (Mon, 20 Jul 2015) \| 3 lines Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code" This reverts commit r241928. This caused http://llvm.org/PR24190 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r242735 \| matze \| 2015-07-20 16:17:20 -0700 (Mon, 20 Jul 2015) \| 3 lines Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2" This reverts commit r241926. This caused http://llvm.org/PR24190 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r242742 \| matze \| 2015-07-20 17:18:59 -0700 (Mon, 20 Jul 2015) \| 7 lines ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242907 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-22 16:13:29 +00:00
Hans Wennborg	db5f467332	Merging r242680: ------------------------------------------------------------------------ r242680 \| wschmidt \| 2015-07-20 08:43:21 -0700 (Mon, 20 Jul 2015) \| 1 line Add missing test for r242296 (vec_sld) ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242686 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 16:47:36 +00:00
Hans Wennborg	b26a5f5b1e	Merging r242673: ------------------------------------------------------------------------ r242673 \| tstellar \| 2015-07-20 07:28:41 -0700 (Mon, 20 Jul 2015) \| 11 lines AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242685 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 16:46:01 +00:00
Hans Wennborg	dffc572cbe	Merging r242434: ------------------------------------------------------------------------ r242434 \| tstellar \| 2015-07-16 12:40:09 -0700 (Thu, 16 Jul 2015) \| 7 lines AMDGPU/SI: Negative offsets aren't allowed in MUBUF's vaddr operand Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11226 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242684 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 16:43:33 +00:00
Hans Wennborg	bd53caa737	Merging r242442: ------------------------------------------------------------------------ r242442 \| wschmidt \| 2015-07-16 14:14:07 -0700 (Thu, 16 Jul 2015) \| 14 lines [PowerPC] v4i32 is a VSRCRegClass I was looking at some vector code generation and kept seeing unnecessary vector copies into the Altivec half of the VSX registers. I discovered that we overlooked v4i32 when adding the register classes for VSX; we only added v4f32 and v2f64. This means that anything that canonicalizes into v4i32 (which is a LOT of stuff) ends up being forced into VRRC on its way to VSRC. The fix is one line. The rest of the patch is fixing up some test cases whose code generation has changed as a result. This seems like it would be a good candidate for backport to 3.7. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242447 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-16 21:41:21 +00:00
Hans Wennborg	445ea38ee9	Merging r242239: ------------------------------------------------------------------------ r242239 \| hfinkel \| 2015-07-14 15:53:11 -0700 (Tue, 14 Jul 2015) \| 4 lines [PowerPC] Support symbolic targets in patchpoints Follow-up r235483, with the corresponding support in PPC. We use a regular call for symbolic targets (because they're much cheaper than indirect calls). ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242325 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-15 20:27:43 +00:00
Hal Finkel	a8eaf29f90	[PowerPC] Use the ABI indirect-call protocol for patchpoints We used to take the address specified as the direct target of the patchpoint and did no TOC-pointer handling. This, however, was not all that useful, because MCJIT tends to create a lot of modules, and they have their own TOC sections. Thus, to call from the generated code to other generated code, you really need to switch TOC pointers. Make this work as expected, and under ELFv1, treat the address as the function descriptor address so that the correct TOC pointer can be loaded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242217 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 22:26:06 +00:00
Alex Lorenz	6e50c921d0	MIR Serialization: Serialize the machine basic block live in registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242204 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 21:24:41 +00:00
Hal Finkel	13141f04d3	[PowerPC] Fix the PPCInstrInfo::getInstrLatency implementation PowerPC uses itineraries to describe processor pipelines (and dispatch-group restrictions for P7/P8 cores). Unfortunately, the target-independent implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that looks for the largest cycle count in the pipeline for any given instruction. This, however, yields the wrong answer for the PPC itineraries, because we don't encode the full pipeline. Because the functional units are fully pipelined, we only model the initial stages (there are no relevant hazards in the later stages to model), and so the technique employed by getStageLatency does not really work. Instead, we should take the maximum output operand latency, and that's what PPCInstrInfo::getInstrLatency now does. This caused some test-case churn, including two unfortunate side effects. First, the new arrangement of copies we get from function parameters now sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the test cases), and we have one significant test-suite regression: SingleSource/Benchmarks/BenchmarkGame/spectral-norm 56.4185% +/- 18.9398% In this benchmark we have a loop with a vectorized FP divide, and with the new scheduling both divides end up in the same dispatch group (which in this case seems to cause a problem, although why is not exactly clear). The grouping structure is hard to predict from the bottom of the loop, and there may not be much we can do to fix this. Very few other test-suite performance effects were really significant, but almost all weakly favor this change. However, in light of the issues highlighted above, I've left the old behavior available via a command-line flag. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242188 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 20:02:02 +00:00
Krzysztof Parzyszek	d496e176f0	[Hexagon] Generate instructions for operations on predicate registers Convert logical operations on general-purpose registers to the corresponding operations on predicate registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242186 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 19:30:21 +00:00
Keno Fischer	890c16626f	[CodeGen] Force emission of personality directive if explicitly specified Summary: Before this change, personality directives were not emitted if there was no invoke left in the function (of course until recently this also meant that we couldn't know what the personality actually was). This patch forces personality directives to still be emitted, unless it is known to be a noop in the absence of invokes, or the user explicitly specified `nounwind` (and not `uwtable`) on the function. Reviewers: majnemer, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D10884 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242185 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 19:22:51 +00:00
Matt Arsenault	ba38e6c2ae	AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32) This can be done only with moves which theoretically will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242177 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 18:20:33 +00:00
Matt Arsenault	3aa0d7cb53	AMDGPU/SI: Fix read2 merging into a super register. If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242174 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:57:36 +00:00

14033 Commits