archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Elena Demikhovsky	2fd63302fe	Fixed FMA + FNEG combine. Masked form of FMA should be omitted in this optimization. Differential Revision: https://reviews.llvm.org/D25984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285492 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-29 08:44:46 +00:00
Matt Arsenault	ac5efca3f0	AMDGPU: Use 1/2pi inline imm on VI I'm guessing at how it is supposed to be printed git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285490 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-29 04:05:06 +00:00
Davide Italiano	c5763946b3	[DAGCombiner] Fix a crash visiting `AND` nodes. Instead of asserting that the shift count is != 0 we just bail out as it's not profitable trying to optimize a node which will be removed anyway. Differential Revision: https://reviews.llvm.org/D26098 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285480 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 23:55:32 +00:00
Tom Stellard	b15bbca10c	AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions Summary: Flat instructions can return out of order, so we always need to wait for all the outstanding flat operations. Reviewers: tony-tye, arsenm Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D25998 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285479 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 23:53:48 +00:00
Matt Arsenault	d6028cdcc7	AMDGPU: Add definitions for scalar store instructions Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285463 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 21:55:15 +00:00
Justin Lebar	30c499dfda	[NVPTX] Compute 'rem' using the result of 'div', if possible. Summary: In isel, transform Num % Den into Num - (Num / Den) * Den if the result of Num / Den is already available. Reviewers: tra Subscribers: hfinkel, llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D26090 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285461 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 21:44:00 +00:00
Matt Arsenault	0e18bbf16a	AMDGPU: Change check prefix in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285449 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 20:33:01 +00:00
Matt Arsenault	6cabc8f486	AMDGPU: Diagnose using too many SGPRs This is possible when using inline asm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285447 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 20:31:47 +00:00
Krzysztof Parzyszek	1adbd7e4f5	Handle non-~0 lane masks on live-in registers in LivePhysRegs When LivePhysRegs adds live-in registers, it recognizes ~0 as a special lane mask indicating the entire register. If the lane mask is not ~0, it will only add the subregisters that overlap the specified lane mask. The problem is that if a live-in register does not have subregisters, and the lane mask is not ~0, it will not be added to the live set. (The given lane mask may simply be the lane mask of its register class.) If a register does not have subregisters, add it to the live set if the lane mask is non-zero. Differential Revision: https://reviews.llvm.org/D26094 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285440 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 20:06:37 +00:00
Matt Arsenault	2d7bc6b1e1	AMDGPU: Fix using incorrect private resource with no allocation It's possible to have a use of the private resource descriptor or scratch wave offset registers even though there are no allocated stack objects. This would result in continuing to use the maximum number reserved registers. This could go over the number of SGPRs available on VI, or violate the SGPR limit requested by the function attributes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285435 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 19:43:31 +00:00
Nemanja Ivanovic	0b61b12b8c	Implement vector count leading/trailing bytes with zero lsb and vector parity builtins - llvm portion This patch corresponds to review https://reviews.llvm.org/D26003. Committing on behalf of Zaara Syeda. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285434 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 19:38:24 +00:00
Arnold Schwaighofer	9943293184	Make swift calling convention test specific to armv7 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285431 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 19:18:09 +00:00
Sanjay Patel	3dc20a0272	[x86] add tests for missed umin/umax This is actually a deficiency in ValueTracking's matchSelectPattern(), but a codegen test is the simplest way to expose the bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285429 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 19:08:20 +00:00
Arnold Schwaighofer	05af2b25b3	More swift calling convention tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285417 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 17:21:05 +00:00
Krzysztof Parzyszek	fe1e3ecadd	[Hexagon] Maintain kill flags through splitting in expand-condsets Do not use LiveIntervals to recalculate kills, because that cannot be done accurately without implicit uses on predicated instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285409 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 15:50:22 +00:00
Juergen Ributzka	67d80b9ced	Revert "[DAGCombiner] Add vector demanded elements support to computeKnownBits" This seems to have increased LTO compile time beyond 2x of previous builds. See http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285381 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 04:01:12 +00:00
Tom Stellard	a911f5ff01	AMDGPU/SI: Handle hazard with s_rfe_b64 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25638 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285368 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 23:50:21 +00:00
Tom Stellard	8434132101	AMDGPU/SI: Handle hazard with sgpr lane selects for v_{read,write}lane Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25637 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285367 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 23:42:29 +00:00
Tom Stellard	5480a2423d	AMDGPU/SI: Handle hazard with > 8 byte VMEM stores Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25577 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285359 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 23:05:31 +00:00
Tom Stellard	79758d450e	AMDGPU/SI: Handle s_setreg hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25528 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285338 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 20:39:09 +00:00
Ehsan Amiri	34a73b3124	[PPC] Adding the removed testcase again This testcase was originally part of r284995, but I put it in a wrong directory. So I removed it. Before adding it back I did some small enhancements. Also I changed the assertions a little bit, to take into account the impact of some changes performed since code review is done. This is similar to changes done for another testcase in the original commit. See: https://reviews.llvm.org/D23614#577749 Basically, instead of vxor we now generate xxlxor in some cases, which is better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285333 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 19:10:09 +00:00
Saleem Abdulrasool	b5143b06a1	ARM: ensure that the Windows DBZ check is in range The Windows ARM target expects the compiler to emit a division-by-zero check. The check would use the form of: cmp r?, #0 cbz .Ltrap b .Lbody .Lbody: ... .Ltrap: udf #249 @ __brkdiv0 This works great most of the time. However, if the body of the function is greater than 127 bytes, the branch target limitation of cbz becomes an issue. This occurs in the unoptimized code generation cases sometimes (like in compiler-rt). Since this is a matter of correctness, possibly pay a small penalty instead. We now form this slightly differently: cbnz .Lbody udf #249 @ __brkdiv0 .Lbody: ... The positive case is through the branch instead of being the next instruction. However, because of the basic block layout, the negated branch is going to be a short distance always (2 bytes away, after the inserted __brkdiv0). The new t__brkdiv0 instruction is required to explicitly mark the instruction as a terminator as the generic UDF instruction is not a terminator. Addresses PR30532! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285312 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 16:59:22 +00:00
Vasileios Kalintiris	1de03247dc	[mips] Do not allow -opt-bisect-limit to skip the PIC call optimization pass. r282428 added the MipsOptimizePICCall as an opt-in pass that can be skipped when using the -opt-bisect-limit option. However, this pass is needed because it generates code that conforms to the o32 ABI specification by using the $t9 register for PIC calls with JALR instructions. This bug was exposed by the fact that skipFunction() also checks for the "optnone" attribute. This caused functions with that attribute to break the requirements of the o32 ABI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285305 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 15:50:36 +00:00
Simon Pilgrim	0de3e81c28	[X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64 With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq). Updated cost table accordingly. Differential Revision: https://reviews.llvm.org/D26011 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285304 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 15:27:00 +00:00
Krzysztof Parzyszek	444277c658	[Hexagon] Do not expand ISD::SELECT for HVX vectors git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285297 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 14:30:16 +00:00
Simon Pilgrim	5579104d09	[DAGCombiner] Add vector demanded elements support to computeKnownBits Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used. I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course. DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit. Differential Revision: https://reviews.llvm.org/D25691 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285296 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 14:29:28 +00:00
Sam Parker	1341f74f93	[ARM] Add newline char to test. Missed a newline in the previous commit. Differential Revision: https://reviews.llvm.org/D26027 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285280 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 10:43:02 +00:00
Sam Parker	a6ec572d31	[ARM] Predicate UMAAL selection on hasDSP. UMAAL is a DSP instruction and it is not available on thumbv7m (Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong CHECK prefix in longMAC.ll test. Patch by Vadzim Dambrouski. Differential Revision: https://reviews.llvm.org/D25890 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285278 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 09:47:10 +00:00
Nicolai Haehnle	ce0a5230f5	AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due to register dependencies Summary: When finding a match for a merge and collecting the instructions that must be moved, keep in mind that the instruction we merge might actually use one of the defs that are being moved. Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect]. The fact that the ds_read in the test case is not eliminated suggests that there might be another problem related to alias analysis, but that's a separate problem: this pass should still work correctly even when earlier optimization passes missed something or were disabled. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25829 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285273 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 08:15:07 +00:00
Nemanja Ivanovic	e9fdaa1bbb	[PowerPC] - No SExt/ZExt needed for count trailing zeros This patch corresponds to review: https://reviews.llvm.org/D25896 It just eliminates the redundant ZExt after a count trailing zeros instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285267 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 05:17:58 +00:00
Tim Northover	1ff4f02fe6	ARM: don't rely on push/pop reglists being in order when folding SP adjust. It would be a very nice invariant to rely on, but unfortunately it doesn't necessarily hold (and the causes of mis-sorted reglists appear to be quite varied) so to be robust the frame lowering code can't assume that the first register in the list is also the first one that actually gets pushed. Should fix an issue where we were turning something like: push {r8, r4, r7, lr} sub sp, #24 into nonsense like: push {r2, r3, r4, r5, r6, r7, r8, r4, r7, lr} git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285232 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 20:01:00 +00:00
Nemanja Ivanovic	73235dc8dc	Do not assume that FP vector operands are never legalized by expanding This patch ensures that if a floating point vector operand is legalized by expanding, it is legalized through the stack rather than by calling DAGTypeLegalizer::IntegerToVector which will cause a failure since the operand is a non-integer type. This fixes PR 30715. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285231 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 19:51:35 +00:00
Nemanja Ivanovic	4e7356cfaf	[PowerPC] Implement vec_insert_exp builtins - llvm portion This revision corresponds to review: https://reviews.llvm.org/D25957. Committing on behalf of Zaara Syeda. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285225 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 19:03:40 +00:00
Chad Rosier	39d39677e6	Fix test from r285217. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285222 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 18:49:16 +00:00
Chad Rosier	1d0dc0fb0b	[AArch64] Avoid materializing constant 1 when generating cneg instructions. Instead of cmp w0, #1 orr w8, wzr, #0x1 cneg w0, w8, ne we now generate cmp w0, #1 csinv w0, w0, wzr, eq PR28965 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285217 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 18:15:32 +00:00
Yaxun Liu	c93e472923	AMDGPU: Refactor processor definition to use ISA version features Add missing ISA versions 7.0.2/8.0.4/8.1.0 to the backend. Refactor processor definition to use ISA version features. Fixed ISA version for stoney. Based on Laurent Morichetti's patch. Differential Revision: https://reviews.llvm.org/D25919 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285210 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 16:37:56 +00:00
Matt Arsenault	f63894ba9e	Reapply "AMDGPU: Don't use offen if it is 0" This reverts r283003 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285203 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 15:08:16 +00:00
Tom Stellard	1a633d108a	AMDGPU/SI: Don't emit multi-dword flat memory ops when they might access scratch Summary: A single flat memory operation that might access the scratch buffer can only access MaxPrivateElementSize bytes. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25788 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285198 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 14:38:47 +00:00
Tom Stellard	e2f7559049	AMDGPU/SI: Remove unnecessary run lines from test Summary: This test had run lines disabling/enabling the promote alloca pass, but enabling/disabling promote alloca had no impact on the output. Reviewers: arsenm Subscribers: mgrang, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25787 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285197 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 14:21:09 +00:00
Zvi Rackover	a1f05a248d	[X86] AVX512 fallback for floating-point scalar selects Summary: In the case of 'select i1, f32, f32' or 'select i1, f64, f64', prefer lowering to masked-moves over branches. Fixes pr30561 Reviewers: igorb, aymanmus, delena Differential Revision: https://reviews.llvm.org/D25310 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285196 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 14:12:46 +00:00
Craig Topper	9fc96e5646	[AVX-512] Add scalar vfmsub/vfnmsub mask3 intrinsics Summary: Clang's intrinsic header currently tries to negate the third operand of a vfmadd mask3 in order to create vfmsub, but this fails isel. This patch adds scalar vfmsub and vfnmsub mask3 that we can use instead to avoid the negate. This is consistent with the packed instructions. Reviewers: igorb, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25933 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285173 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 04:59:58 +00:00
James Y Knight	ed5107d663	[Sparc] Don't overlap variable-sized allocas with other stack variables. On SparcV8, it was previously the case that a variable-sized alloca might overlap by 4-bytes the last fixed stack variable, effectively because 92 (the number of bytes reserved for the register spill area) != 96 (the offset added to SP for where to start a DYNAMIC_STACKALLOC). It's not as simple as changing 96 to 92, because variables that should be 8-byte aligned would then be misaligned. For now, simply increase the allocation size by 8 bytes for each dynamic allocation -- wastes space, but at least doesn't overlap. As the large comment says, doing this more efficiently will require larger changes in llvm. Also adds some test cases showing that we continue to not support dynamic stack allocation and over-alignment in the same function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285131 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 22:13:28 +00:00
Simon Pilgrim	d9bc309e9c	[DAGCombiner] Enable (urem x, (shl pow2, y)) -> (and x, (add (shl pow2, y), -1)) combine for splatted vectors git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285129 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 22:01:09 +00:00
Simon Pilgrim	908c768e1c	[X86][SSE] Regenerated known-bits test with srem->urem fix git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285124 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 21:24:33 +00:00
Simon Pilgrim	b03ba30cbf	[DAGCombiner] Enable srem(x,y) -> urem(x,y) combine for vectors SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285123 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 21:20:18 +00:00
Simon Pilgrim	4864d89aff	[X86][SSE] Added vector srem combine tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285121 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 21:14:11 +00:00
Simon Pilgrim	b4f0b6e6d3	[X86][SSE] Added vector urem combine tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 21:10:12 +00:00
Simon Pilgrim	baad275225	[DAGCombiner] Enable sdiv(x,y) -> udiv(x,y) combine for vectors SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285118 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 20:56:42 +00:00
Simon Pilgrim	09845fdead	[X86][SSE] Added vector sdiv combine tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285112 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 20:25:47 +00:00
Evandro Menezes	9d6f1231e9	Add option to specify minimum number of entries for jump tables Add an option to allow easier experimentation by target maintainers with the minimum number of entries to create jump tables. Also clarify the name of the other existing option governing the creation of jump tables. Differential revision: https://reviews.llvm.org/D25883 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285104 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-25 19:53:51 +00:00
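
Commit 30c499dfda in the log above rewrites Num % Den as Num - (Num / Den) * Den when the quotient is already available. The sketch below only illustrates that arithmetic identity in Python. It is not the NVPTX isel code, and the helper names `trunc_div` and `rem_from_div` are hypothetical. It assumes LLVM-style signed division that rounds toward zero, which differs from Python's flooring `//` operator.

```python
def trunc_div(num: int, den: int) -> int:
    """Signed division rounding toward zero (assumed sdiv semantics)."""
    quotient = abs(num) // abs(den)
    # The quotient is negative only when the operand signs differ.
    return quotient if (num >= 0) == (den >= 0) else -quotient


def rem_from_div(num: int, den: int) -> int:
    """Compute num % den by reusing the quotient: num - (num / den) * den."""
    quotient = trunc_div(num, den)  # in the commit, this value already exists
    return num - quotient * den
```

With truncating division the remainder takes the sign of the dividend, so this helper and Python's built-in `%` disagree when the operand signs differ.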

18693 Commits
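
Commit d9bc309e9c in the log above enables the combine (urem x, (shl pow2, y)) -> (and x, (add (shl pow2, y), -1)). The sketch below checks that identity on 32-bit unsigned scalars in Python. It is not DAGCombiner code, and the names `MASK32`, `urem_shl_pow2` and `and_shl_pow2_minus_one` are hypothetical. It assumes the shifted value stays a nonzero power of two, meaning the shift does not overflow 32 bits.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit unsigned wraparound


def urem_shl_pow2(x: int, pow2: int, y: int) -> int:
    """Original form: x urem (pow2 << y)."""
    divisor = (pow2 << y) & MASK32
    return (x & MASK32) % divisor


def and_shl_pow2_minus_one(x: int, pow2: int, y: int) -> int:
    """Combined form: x & ((pow2 << y) + -1)."""
    divisor = (pow2 << y) & MASK32
    # Adding -1 to a power of two sets every bit below it, giving a low-bit mask.
    return (x & MASK32) & ((divisor - 1) & MASK32)
```

The two forms agree because a power-of-two divisor keeps only the low bits of x, and subtracting one from that divisor yields exactly the mask for those bits.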