RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-22 13:26:03 +00:00

Author	SHA1	Message	Date
Renato Golin	99ec266022	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262738 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-04 19:19:36 +00:00
Simon Pilgrim	946e6cb363	[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262599 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-03 09:43:28 +00:00
Renato Golin	ff17c53224	Revert "[ARM] Merging 64-bit divmod lib calls into one" This reverts commit r262507, which broke some ARM buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262594 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-03 08:57:44 +00:00
Renato Golin	4d7de4fa50	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262507 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-02 19:35:45 +00:00
Matt Arsenault	543afc9d41	DAGCombiner: Make sure an integer is being truncated git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262446 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-02 01:36:51 +00:00
Matt Arsenault	d06f393d79	DAGCombiner: Turn truncate of a bitcasted vector to an extract On AMDGPU where operations i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262397 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-01 21:31:53 +00:00
Vasileios Kalintiris	6e09ce7e5f	Revert "[mips] Promote the result of SETCC nodes to GPR width." This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262387 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-01 20:25:43 +00:00
Matt Arsenault	8ba165a405	DAGCombiner: Turn extract of bitcasted integer into truncate This reduces the number of bitcast nodes and generally cleans up the DAG when bitcasting between integers and vectors everywhere. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262358 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-01 18:01:37 +00:00
Vasileios Kalintiris	ef35f84f09	[mips] Promote the result of SETCC nodes to GPR width. Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262316 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-01 10:08:01 +00:00
Matt Arsenault	2bc40a1dbd	DAGCombiner: Don't unnecessarily swap operands in ReassociateOps In the case where op = add, y = base_ptr, and x = offset, this transform: (op y, (op x, c1)) -> (op (op x, y), c1) breaks the canonical form of add by putting the base pointer in the second operand and the offset in the first. This fix is important for the R600 target, because for some address spaces the base pointer and the offset are stored in separate register classes. The old pattern caused the ISel code for matching addressing modes to put the base pointer and offset in the wrong register classes, which required no-trivial code transformations to fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262148 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 19:57:45 +00:00
Matt Arsenault	f51a2196a5	DAGCombiner: Relax sqrt NaN folding check This is OK for +0 since compares to +/-0 give the same result. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262125 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 09:38:05 +00:00
Simon Pilgrim	01ad432bbe	[DAGCombiner] Use getBitcast helper when possible. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261437 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-20 15:05:29 +00:00
Ahmed Bougacha	606927f7bb	[CodeGen] Document and use getConstant's splat-building feature. NFC. Differential Revision: http://reviews.llvm.org/D17229 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260901 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-15 18:07:29 +00:00
Pirama Arumuga Nainar	d313df118a	Don't combine fp_round (fp_round x) if f80 to f16 is generated Summary: This patch skips DAG combine of fp_round (fp_round x) if it results in an fp_round from f80 to f16. fp_round from f80 to f16 always generates an expensive (and as yet, unimplemented) libcall to __truncxfhf2. This prevents selection of native f16 conversion instructions from f32 or f64. Moreover, the first (value-preserving) fp_round from f80 to either f32 or f64 may become a NOP in platforms like x86. Reviewers: ab Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D17221 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260769 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-13 00:08:05 +00:00
Ahmed Bougacha	3248c624fa	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260316 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-09 22:54:12 +00:00
Tim Shen	33bf0bd3ea	[SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior This patch consists of two parts: a performance fix in DAGCombiner.cpp and a correctness fix in SelectionDAG.cpp. The test case tests the bug that's uncovered by the performance fix, and fixed by the correctness fix. The performance fix keeps the containers required by the hasPredecessorHelper (which is a lazy DFS) and reuse them. Since hasPredecessorHelper is called in a loop, the overall efficiency reduced from O(n^2) to O(n), where n is the number of SDNodes. The correctness fix keeps iterating the neighbor list even if it's time to early return. It will return after finishing adding all neighbors to Worklist, so that no neighbors are discarded due to the original early return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259691 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-03 20:58:55 +00:00
Balaram Makam	60101204f1	AArch64: Implement missed conditional compare sequences. Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionaly handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259387 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-01 19:13:07 +00:00
Matthias Braun	5e08bd340a	Avoid overly large SmallPtrSet/SmallSet These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259283 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-30 01:24:31 +00:00
Junmo Park	91de9d1201	[DAGCombiner] Don't add volatile or indexed stores to ChainedStores Summary: findBetterNeighborChains does not handle volatile or indexed stores. However, it did not check when adding stores to ChainedStores. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D16463 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259024 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 06:23:33 +00:00
Simon Pilgrim	2a554ce9c0	Tidied up TRUNC combine code. NFC. Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258644 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-23 21:50:40 +00:00
Dan Gohman	f2cde91200	[SelectionDAG] Fold more offsets into GlobalAddresses This reapplies r258296 and r258366, and also fixes an existing bug in SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the offset in a GlobalAddressSDNode, which is uncovered by those patches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258482 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-22 03:57:34 +00:00
Reid Kleckner	99fdb962e1	Revert "[SelectionDAG] Fold more offsets into GlobalAddresses" This reverts r258296 and the follow up r258366. With this change, we miscompiled the following program on Windows: #include <string> #include <iostream> static const char kData[] = "asdf jkl;"; int main() { std::string s(kData + 3, sizeof(kData) - 3); std::cout << s << '\n'; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258465 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-22 01:09:29 +00:00
Dan Gohman	4d7ffe9779	[SelectionDAG] Fold more offsets into GlobalAddresses SelectionDAG previously missed opportunities to fold constants into GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to create `(add GA+c, y)`. This isn't often visible on targets such as X86 which effectively reassociate adds in their complex address-mode folding logic, however it is currently visible on WebAssembly since it currently has very simple address mode folding code that doesn't reassociate anything. This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses at the same times that it folds constants together, so that it doesn't miss any opportunities to perform such folding. Differential Revision: http://reviews.llvm.org/D16090 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258296 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-20 07:03:08 +00:00
Sanjay Patel	47436fa2d1	[DAGCombiner] don't dereference an operand that doesn't exist (PR26070) The bug was introduced with changes for x86-64 fp128: http://reviews.llvm.org/rL254653 I don't know why an x86 change is here, so I'll follow up in: http://reviews.llvm.org/D15134 Should fix: https://llvm.org/bugs/show_bug.cgi?id=26070 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257200 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-08 19:53:24 +00:00
Tim Shen	d6cee943e5	Test commit access - add a blank line in comment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257192 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-08 19:20:23 +00:00
Dan Gohman	0fcad92ee3	[SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets. In an inbounds getelementptr, when an index produces a constant non-negative offset to add to the base, the add can be assumed to not have unsigned overflow. This relies on the assumption that addresses can't occupy more than half the address space, which isn't possible in C because it wouldn't be possible to represent the difference between the start of the object and one-past-the-end in a ptrdiff_t. Setting the NoUnsignedWrap flag is theoretically useful in general, and is specifically useful to the WebAssembly backend, since it permits stronger constant offset folding. Differential Revision: http://reviews.llvm.org/D15544 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256890 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-06 00:43:06 +00:00
Eric Christopher	54f1a90d45	Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst, x)) combines for ppc_fp128, since signbit computation is more complicated. Discussion thread: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html Patch by Tim Shen! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255305 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-10 22:09:06 +00:00
Sanjay Patel	afd3f07154	fix return values to match bool return type; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254968 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-07 23:34:30 +00:00
Chih-Hung Hsieh	9f51f8f7e7	[X86] Part 1 to fix x86-64 fp128 calling convention. Almost all these changes are conditioned and only apply to the new x86-64 f128 type configuration, which will be enabled in a follow up patch. They are required together to make new f128 work. If there is any error, we should fix or revert them as a whole. These changes should have no impact to current configurations. * Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true. * Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node. * Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. * Enhance related Lower* functions to handle f128 type. * Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls. * Fix Combiner, Emitter, Legalizer routines that did not handle f128 type. * Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node. * Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp. * Fix infinite loop in getTypeLegalizationCost when f128 is the value type. * Fix printOperand to handle null operand. * Enhance ISD::BITCAST node to handle f128 constant. * Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes. * Enhance X86AsmPrinter to emit f128 values in comments. Differential Revision: http://reviews.llvm.org/D15134 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254653 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-03 22:02:40 +00:00
Craig Topper	96be2c60e3	Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254260 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-29 18:05:22 +00:00
Craig Topper	ce14a216e2	[SelectionDAG] Use std::any_of instead of a manually coded loop. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254242 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-29 04:37:11 +00:00
Artyom Skrobov	824e14ddab	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC) Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254085 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-25 19:41:11 +00:00
Simon Pilgrim	f4a7279ca5	Remove duplicate getValueType() calls. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253823 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-22 16:49:38 +00:00
Jonas Paulsson	546611e398	[DAGCombiner] Bugfix for lost chain depenedency. When MergeConsecutiveStores() combines two loads and two stores into wider loads and stores, the chain users of both of the original loads must be transfered to the new load, because it may be that a chain user only depends on one of the loads. New test case: test/CodeGen/SystemZ/dag-combine-01.ll Reviewed by James Y Knight. Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253779 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-21 13:25:07 +00:00
Hans Wennborg	086b179985	X86: More efficient legalization of wide integer compares In particular, this makes the code for 64-bit compares on 32-bit targets much more efficient. Example: define i32 @test_slt(i64 %a, i64 %b) { entry: %cmp = icmp slt i64 %a, %b br i1 %cmp, label %bb1, label %bb2 bb1: ret i32 1 bb2: ret i32 2 } Before this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax setae %al cmpl 16(%esp), %ecx setge %cl je .LBB2_2 movb %cl, %al .LBB2_2: testb %al, %al jne .LBB2_4 movl $1, %eax retl .LBB2_4: movl $2, %eax retl After this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax sbbl 16(%esp), %ecx jge .LBB1_2 movl $1, %eax retl .LBB1_2: movl $2, %eax retl Differential Revision: http://reviews.llvm.org/D14496 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253572 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-19 16:35:08 +00:00
Geoff Berry	e2eaa9712d	[DAGCombiner] Improve zextload optimization. Summary: Don't fold (zext (and (load x), cst)) -> (and (zextload x), (zext cst)) if (and (load x) cst) will match as a zextload already and has additional users. For example, the following IR: %load = load i32, i32* %ptr, align 8 %load16 = and i32 %load, 65535 %load64 = zext i32 %load16 to i64 store i32 %load16, i32* %dst1, align 4 store i64 %load64, i64* %dst2, align 8 used to produce the following aarch64 code: ldr w8, [x0] and w9, w8, #0xffff and x8, x8, #0xffff str w9, [x1] str x8, [x2] but with this change produces the following aarch64 code: ldrh w8, [x0] str w8, [x1] str x8, [x2] Reviewers: resistor, mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14340 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252789 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-11 19:42:52 +00:00
Matt Arsenault	b8f7aeb218	Add target preference for GatherAllAliases max depth git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252775 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-11 18:44:33 +00:00
Sanjay Patel	512052a88e	add a SelectionDAG method to check if no common bits are set in two nodes; NFCI This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252539 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-09 23:31:38 +00:00
Tom Stellard	136bd632b6	DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252349 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-06 21:58:37 +00:00
James Y Knight	74615f5548	Fix two issues in MergeConsecutiveStores: 1) PR25154. This is basically a repeat of PR18102, which was fixed in r200201, and broken again by r234430. The latter changed which of the store nodes was merged into from the first to the last. Thus, we now also need to prefer merging a later store at a given address into the target node, instead of an earlier one. 2) While investigating that, I also realized I'd introduced a bug in r236850. There, I removed a check for alignment -- not realizing that nothing except the alignment check was ensuring that none of the stores were overlapping! This is a really bogus way to ensure there's no aliased stores. A better solution to both of these issues is likely to always use the code added in the 'if (UseAA)' branches which rearrange the chain based on a more principled analysis. I'll look into whether that can be used always, but in the interest of getting things back to working, I think a minimal change makes sense. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251816 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-02 18:48:08 +00:00
Sanjay Patel	6f6bce9783	Use the 'arcp' fast-math-flag when combining repeated FP divisors This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251450 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-27 20:27:25 +00:00
Steve King	6f257342a1	Fix llc crash processing S/UREM for -Oz builds caused by rL250825. When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251373 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-27 00:14:06 +00:00
Simon Pilgrim	2bc87a6f42	[DAGCombiner] Generalize masking of constant rotates. We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251197 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-24 18:44:52 +00:00
Simon Pilgrim	d0ca754540	[X86][XOP] Add support for lowering vector rotations This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251188 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-24 13:17:26 +00:00
Zia Ansari	02da4e7721	[X86] - Catch extra combine opportunities for redundant imuls. When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251028 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-22 16:14:45 +00:00
Artyom Skrobov	42c2a81d51	Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250825 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-20 13:06:02 +00:00
Artyom Skrobov	d27f5c0eb0	A doccomment for CombineTo, and some NFC refactorings Summary: Caching SDLoc(N), instead of recreating it in every single function call, keeps the code denser, and allows to unwrap long lines. Reviewers: sunfish, atrick, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13726 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250305 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-14 17:18:35 +00:00
Artyom Skrobov	d86f867e9e	Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC) Summary: The two implementations had more code in common than not. Reviewers: sunfish, MatzeB, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13724 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250302 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-14 16:54:14 +00:00
Matt Arsenault	1349eb7659	DAGCombiner: Don't stop finding better chain on 2 aliases The comment says this was stopped because it was unlikely to be profitable. This is not true if you want to combine vector loads with multiple components. For a simple case that looks like t0 = load t0 ... t1 = load t0 ... t2 = load t0 ... t3 = load t0 ... t4 = store t0:1, t0:1 t5 = store t4, t1:0 t6 = store t5, t2:0 t7 = store t6, t3:0 We want to get all of these stores onto a chain that is a TokenFactor of these N loads. This mostly solves the AMDGPU merge-stores.ll regressions with -combiner-alias-analysis for merging vector stores of vector loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250138 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-13 00:49:00 +00:00
Matt Arsenault	c08fe15c4f	DAGCombiner: Combine extract_vector_elt from build_vector This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250129 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-12 23:59:50 +00:00

... 13 14 15 16 17 ...

2217 Commits