archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Matthias Braun	0cc137e051	RegisterScavenging: Followup to r305625 This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitset and just need 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305817 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:43:14 +00:00
Sanjay Patel	7a0e66cc56	[CGP, PowerPC] try to constant fold before creating loads for memcmp expansion This is the last step needed to avoid regressions for x86 before we flip the switch to allow expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, so we need to do that here too. FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 constant tests because its alignment requirements are too strict (shouldn't require alignment for a constant operand). Differential Revision: https://reviews.llvm.org/D34071 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305734 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 19:48:35 +00:00
Matthias Braun	cd03942492	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse to place spills as the very first instruction of a basic block and thus artificially increase pressure (test in test/CodeGen/PowerPC/scavenging.mir:spill_at_begin) This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305625 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-17 02:08:18 +00:00
Matthias Braun	bd1a266898	Revert "RegScavenging: Add scavengeRegisterBackwards()" Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305566 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-16 17:48:08 +00:00
Matthias Braun	02688b00ef	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305516 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-15 22:14:55 +00:00
Lei Huang	edd54f82b6	[MachineLICM] Hoist TOC-based address instructions Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305490 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-15 18:29:59 +00:00
Hiroshi Inoue	1a33599c52	[PowerPC] fix potential verification errors on CFENCE8 This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs. Differential Revision: https://reviews.llvm.org/D34208 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305479 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-15 16:51:28 +00:00
Nemanja Ivanovic	4d07d46bb1	Revert r304907 as it is causing some failures that I cannot reproduce. Reverting this until a test case can be provided to aid the investigation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305372 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-14 07:05:42 +00:00
Tony Jiang	759cec2ab3	[PowerPC] Match vec_revb builtins to P9 instructions. Power9 has instructions that will reverse the bytes within an element for all sizes (half-word, word, double-word and quad-word). These can be used for the vec_revb builtins in altivec.h. However, we implement these to match vector shuffle nodes as that will cover both the builtins and vector shuffles that occur in the SDAG through other means. Differential Revision: https://reviews.llvm.org/D33690 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305214 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 18:24:36 +00:00
Tony Jiang	e3ae196e24	[Power9] Added support for the modsw, moduw, modsd, modud hardware instructions. Note that if we need the result of both the divide and the modulo then we compute the modulo based on the result of the divide and not using the new hardware instruction. Commit on behalf of STEFAN PINTILIE. Differential Revision: https://reviews.llvm.org/D33940 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305210 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 17:58:42 +00:00
Sanjay Patel	a1e0378901	[PowerPC] add memcmp test with one constant operand and equality cmp; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305131 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-09 23:15:14 +00:00
Sanjay Patel	4c04c2d072	[CGP] don't expand a memcmp with nobuiltin attribute This matches the behavior used in the SDAG when expanding memcmp. For reference, we're intentionally treating the earlier fortified call transforms differently after: https://bugs.llvm.org/show_bug.cgi?id=23093 https://reviews.llvm.org/rL233776 One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers: https://reviews.llvm.org/D19781 https://reviews.llvm.org/D19801 Differential Revision: https://reviews.llvm.org/D34043 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305007 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 19:47:25 +00:00
Guozhi Wei	f222586866	[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64 In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixed PR32442. Differential Revision: https://reviews.llvm.org/D31407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305001 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 18:27:24 +00:00
Zaara Syeda	32a3852f3c	[Power9] Exploit vector integer extend instructions This patch adds build vector patterns to exploit the vector integer extend instructions: vextsb2w - Vector Extend Sign Byte To Word vextsb2d - Vector Extend Sign Byte To Doubleword vextsh2w - Vector Extend Sign Halfword To Word vextsh2d - Vector Extend Sign Halfword To Doubleword vextsw2d - Vector Extend Sign Word To Doubleword Differential Revision: https://reviews.llvm.org/D33510 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304992 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 17:14:36 +00:00
Sanjay Patel	06abfee96e	[PowerPC] add memcmp test with nobuiltin attr; NFC In SDAG, we don't expand libcalls with a nobuiltin attribute. It's not clear if that's correct from the existing code comment: "Don't do the check if marked as nobuiltin for some reason." ...adding a test here either way to show that there is currently a different behavior implemented in the CGP-based expansion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304991 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 17:09:18 +00:00
Sanjay Patel	94a001edde	[CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks. Instead of this: _cmp_eq8: movq (%rdi), %rax cmpq (%rsi), %rax je LBB23_1 ## BB#2: ## %res_block movl $1, %ecx jmp LBB23_3 LBB23_1: xorl %ecx, %ecx LBB23_3: ## %endblock xorl %eax, %eax testl %ecx, %ecx sete %al retq We get this: cmp_eq8: movq (%rdi), %rcx xorl %eax, %eax cmpq (%rsi), %rcx sete %al retq And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally. Differential Revision: https://reviews.llvm.org/D34005 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304987 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 16:53:18 +00:00
Sanjay Patel	57caaecac3	[CGP] avoid zext/trunc of a memcmp expansion compare This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG: _cmp_eq4_zexted_to_i64: movl (%rdi), %ecx movl (%rsi), %edx xorl %eax, %eax cmpq %rdx, %rcx sete %al _cmp_eq4_better: movl (%rdi), %ecx xorl %eax, %eax cmpl (%rsi), %ecx sete %al git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304923 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-07 16:16:45 +00:00
Nemanja Ivanovic	aa74d107f6	[PowerPC] Eliminate integer compare instructions - vol. 5 Adds handling for i64 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33720 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304907 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-07 13:18:06 +00:00
Nemanja Ivanovic	3204344f22	[PowerPC] Eliminate integer compare instructions - vol. 3 Adds handling for i32 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33718 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304901 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-07 12:23:41 +00:00
Sanjay Patel	756819c52a	[CGP / PowerPC] use direct compares if there's only one load per block in memcmp() expansion I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle. I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time limitation in the DAG. We might have to just avoid those special-cases here in CGP. But regardless of that, I think this is a win for the more general cases. http://rise4fun.com/Alive/cbQ Differential Revision: https://reviews.llvm.org/D33963 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304849 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-07 00:17:08 +00:00
Sanjay Patel	e11fbd18cc	[PowerPC] auto-generate full checks and increase test coverage 3 of the tests were testing exactly the same thing: memcmp(x, y, 16) != 0. I changed that to test 4, 7, and 16 bytes, so we can see how those differ. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304838 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-06 22:06:07 +00:00
Matthias Braun	3310b59ffc	RegisterScavenging: Add ScavengerTest pass This pass allows to run the register scavenging independently of PrologEpilogInserter to allow targeted testing. Also adds some basic register scavenging tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304606 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-02 23:01:42 +00:00
Sean Fertile	bae3d869e5	[PowerPC] Correctly specify the cache line size for Power 7, 8 and 9. Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for newer ppc processors. Committing on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D33656 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304317 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 18:20:17 +00:00
Zaara Syeda	682f92f568	[PPC] Inline expansion of memcmp This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304313 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 17:12:38 +00:00
Tony Jiang	a688c8eaae	[PowerPC] Fix a performance bug for PPC::XXPERMDI. There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI Instruction, this patch recognizes them and does the selection to improve the PPC performance. Differential Revision: https://reviews.llvm.org/D33404 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304298 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 13:09:57 +00:00
Nemanja Ivanovic	573099d4c3	[PowerPC] Eliminate integer compare instructions - vol. 3 This patch builds upon https://reviews.llvm.org/rL302810 to add handling for the 64-bit SETEQ patterns. Differential Revision: https://reviews.llvm.org/D33369 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304286 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 08:04:07 +00:00
Nemanja Ivanovic	1c7bf566f8	[PowerPC] Eliminate integer compare instructions - vol. 2 This patch builds upon https://reviews.llvm.org/rL302810 to add handling for bitwise logical operations in general purpose registers. The idea is to keep the values in GPRs as long as possible - only extracting them to a condition register bit when no further operations are to be done. Differential Revision: https://reviews.llvm.org/D31851 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304282 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 05:40:25 +00:00
Tim Shen	400ba83237	[AntiDepBreaker] Revert r299124 and add a test. Summary: AntiDepBreaker intends to add all live-outs, including the implicit CSRs, in StartBlock. r299124 was done without understanding that intention. Now with the live-ins propagated correctly (D32464), we can revert this change. Reviewers: MatzeB, qcolombet Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D33697 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304251 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 22:26:52 +00:00
Matthias Braun	bfcbf6ad00	LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal addPristines() function must be called on an empty set or it may accidentally reset saved registers. - addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine). - Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines(). This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304001 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-26 16:23:08 +00:00
Matthias Braun	55d0a522c4	Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI" Tentatively revert this to see if it fixes the buildbot stage2 breakages. This reverts commit r303938. This reverts commit r303954. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303960 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-26 02:25:20 +00:00
Matthias Braun	e4bd195e02	Test for r303938 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303954 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-26 01:29:25 +00:00
Tim Shen	7a45750233	[PPC] Fix atomics lowering in DAG lowering. I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch. Inspired by the test case, D33519 also tries to remove the extra sync. Differential Revision: https://reviews.llvm.org/D33573 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303931 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-25 22:58:35 +00:00
Tony Jiang	133fa95ca7	[PowerPC] Fix a performance bug for PPC::XXSLDWI. There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI instruction, this patch recognizes them and does the selection to improve the PPC performance. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303822 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-24 23:48:29 +00:00
Zaara Syeda	8abe596788	P9: D-form vector load/store. Differential Revision: https://reviews.llvm.org/D33248 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303780 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-24 17:50:37 +00:00
Hiroshi Inoue	4d0494d621	Summary PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from peephole optimization pass. This patch improves this optimization to eliminate more compare instructions in two types of common case. - comparison against a constant 1 or -1 The record-form instructions set CR bit based on signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant. This patch enables record-form optimization for constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0". With this patch, compare can be eliminated in the following code sequence, as an example. uint64_t a, b; if ((a | b) & 0x8000000000000000ull) { ... } else { ... } - andi for 32-bit comparison on PPC64 Since record-form instructions execute a 64-bit signed comparison, we are limited in eliminating a 32-bit comparison, i.e. with cmplwi, using the record-form. The original implementation already has such checks but andi. is not recognized as an instruction which executes implicit zero extension and hence safe to convert into record-form if used for equality check. %1 = and i32 %a, 10 %2 = icmp ne i32 %1, 0 br i1 %2, label %foo, label %bar In this simple example, LLVM generates andi. + cmplwi + beq on PPC64. This patch makes it possible to eliminate the cmplwi for this case. I added andi. for optimization targets if it is safe to do so. Differential Revision: https://reviews.llvm.org/D30081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303500 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-21 06:00:05 +00:00
Kyle Butt	011a826e4f	CodeGen: Power: Add lowering for shifts of v1i128. When legalizing vector operations on vNi128, they will be split to v1i128 because that is a legal type on ppc64, but then the compiler will crash in selection dag because it fails to select for these operations. This patch fixes shift operations. Logical shift right and left shift can be performed in the vector unit, but algebraic shift right requires being split. Differential Revision: https://reviews.llvm.org/D32774 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303307 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-17 21:54:41 +00:00
Krzysztof Parzyszek	98d41caf26	[PPC] Properly update register save area offsets The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase lead to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303257 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-17 13:25:09 +00:00
Tim Shen	50ecf9b407	[PPC] Add -ppc-asm-full-reg-names to atomic-2.ll. NFC. Differential Revisions: https://reviews.llvm.org/D32763 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303209 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 20:58:55 +00:00
Tim Shen	1a2e7acb99	[PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync. Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303205 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 20:18:06 +00:00
Nirav Dave	acc2c1d71d	Elide stores which are overwritten without being observed. Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This causes small improvements when dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303198 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 19:43:56 +00:00
Kyle Butt	e6202480d9	CodeGen: BlockPlacement: Increase tail duplication size for O3. At O3 we are more willing to increase size if we believe it will improve performance. The current threshold for tail-duplication of 2 instructions is conservative, and can be relaxed at O3. Benchmark results: llvm test-suite: 6% improvement in aha, due to duplication of loop latch 3% improvement in hexxagon 2% slowdown in lpbench. Seems related, but couldn't completely diagnose. Internal google benchmark: Produces 4% improvement on internal google protocol buffer serialization benchmarks. Differential-Revision: https://reviews.llvm.org/D32324 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303084 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 17:30:47 +00:00
Guozhi Wei	d3fe6038ab	[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302834 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 22:17:35 +00:00
Nemanja Ivanovic	0470a16690	[PowerPC] Eliminate integer compare instructions - vol. 1 This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302810 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 16:54:23 +00:00
Serge Pavlov	1f4a80fdc1	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302527 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-09 13:35:13 +00:00
Krzysztof Parzyszek	8175a1d583	[PPC] When restoring R30 (PIC base pointer), mark it as <def> This happened on the PPC32/SVR4 path and was discovered when building FreeBSD on PPC32. It was a typo-class error in the frame lowering code. This fixes PR26519. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302183 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-04 19:14:54 +00:00
Tim Shen	85fd68bf82	[PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction Summary: This is the corresponding llvm change to D28037 to ensure no performance regression. Reviewers: bogner, kbarton, hfinkel, iteratee, echristo Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28329 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301990 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-03 00:07:02 +00:00
Nemanja Ivanovic	ff5dd4527f	[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE Fixes PR30730. This is a re-commit of a pulled commit. The commit was pulled because some software projects contained uses of Altivec vectors that violated alignment requirements. Known issues have now been fixed. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D26861 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301892 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-02 01:47:34 +00:00
Sanjoy Das	edb3c90b17	[StackMaps] Increase the size of the "location size" field Summary: In some cases LLVM (especially the SLP vectorizer) will create vectors that are 256 bytes (or larger). Given that this is intentional[0] and is likely to get more common, this patch updates the StackMap binary format to deal with the spill locations for said vectors. This change also bumps the stack map version from 2 to 3. [0]: https://reviews.llvm.org/D32533#738350 Reviewers: reames, kavon, skatkov, javed.absar Subscribers: mcrosier, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D32629 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301615 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-28 04:48:42 +00:00
Adrian Prantl	83092adef9	Don't emit CFI instructions at the end of a function When functions are terminated by unreachable instructions, the last instruction might trigger a CFI instruction to be generated. However, emitting it would be illegal since the function (and thus the FDE the CFI is in) has already ended with the previous instruction. Darwin's dwarfdump --verify --eh-frame complains about this and the specification supports this. Relevant bits from the DWARF 5 standard (6.4 Call Frame Information): "[The] address_range [field in an FDE]: The number of bytes of program instructions described by this entry." "Row creation instructions: [...] The new location value is always greater than the current one." The first quotation implies that a CFI cannot describe a target address outside of the enclosing FDE's range. rdar://problem/26244988 Differential Revision: https://reviews.llvm.org/D32246 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301219 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 18:45:59 +00:00
Sanjay Patel	27b613382c	[DAG] add splat vector support for 'xor' in SimplifyDemandedBits This allows forming more 'not' ops, so we get improvements for ISAs that have and-not. Follow-up to: https://reviews.llvm.org/rL300725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300763 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-19 21:23:09 +00:00
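
Note on commit 682f92f568 ([PPC] Inline expansion of memcmp): the message describes loading both sources in chunks of a target-specified maximum load size, comparing, and exiting early on the first difference, with a special case when the result is only compared against zero. The following is a minimal C++ sketch of that idea for a fixed 16-byte size with an assumed 8-byte maximum load. It is not the code CodeGenPrepare emits; the names load_be64, load_raw64, memcmp16_expanded and memcmp16_is_equal are hypothetical, and the short loop stands in for what would be straight-line code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read 8 bytes as a big-endian integer, so that an
// unsigned integer comparison orders the chunks the same way memcmp does.
static std::uint64_t load_be64(const unsigned char *p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}

// Hypothetical helper: read 8 bytes in native order (enough for equality).
static std::uint64_t load_raw64(const unsigned char *p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// General case: returns <0, 0 or >0 like memcmp(a, b, 16).
// Compares one 8-byte chunk at a time and exits on the first difference.
int memcmp16_expanded(const void *a, const void *b) {
    const auto *pa = static_cast<const unsigned char *>(a);
    const auto *pb = static_cast<const unsigned char *>(b);
    for (std::size_t off = 0; off < 16; off += 8) {
        std::uint64_t x = load_be64(pa + off);
        std::uint64_t y = load_be64(pb + off);
        if (x != y)
            return x < y ? -1 : 1;
    }
    return 0;
}

// Special case: the caller only tests memcmp(a, b, 16) == 0, so no
// ordering is needed and the chunk differences can simply be combined.
bool memcmp16_is_equal(const void *a, const void *b) {
    const auto *pa = static_cast<const unsigned char *>(a);
    const auto *pb = static_cast<const unsigned char *>(b);
    std::uint64_t d0 = load_raw64(pa) ^ load_raw64(pb);
    std::uint64_t d1 = load_raw64(pa + 8) ^ load_raw64(pb + 8);
    return (d0 | d1) == 0;
}
```

The equality-only variant avoids both the byte-order handling and the branches, which is in the spirit of the later commits in this log that try to keep the simple memcmp(x, y, N) != 0 cases in a single block.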


1655 Commits