archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Craig Topper	180beb4eef	[AVX-512] Add shuffle comments for vbroadcast instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284305 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-15 16:26:07 +00:00
Tom Stellard	6b339ba8ac	AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25526 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284298 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-15 00:58:14 +00:00
Tim Northover	ee325b9e96	GlobalISel: rename legalizer components to match others. The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284287 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 22:18:18 +00:00
Tim Northover	31164bc2c9	PowerPC: specify full triple to avoid different Darwin asm syntax. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284281 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 21:25:29 +00:00
Sanjay Patel	6262732897	[ARM] add tests for PR30660 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284280 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 20:52:43 +00:00
Sanjay Patel	65a7ee43be	[PowerPC] add tests for PR30661 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284279 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 20:51:41 +00:00
Guozhi Wei	8bb12b9f5e	[PPC] Shorter sequence to load 64bit constant with same hi/lo words This is a patch to implement pr30640. When a 64bit constant has the same hi/lo words, we can use rldimi to copy the low word into high word of the same register. This optimization caused failure of test case bperm.ll because of not optimal heuristic in function SelectAndParts64. It chooses AND or ROTATE to extract bit groups from a register, and OR them together. This optimization lowers the cost of loading 64bit constant mask used in AND method, and causes different code sequence. But actually ROTATE method is better in this test case. The reason is in ROTATE method the final OR operation can be avoided since rldimi can insert the rotated bits into target register directly. So this patch also enhances SelectAndParts64 to prefer ROTATE method when the two methods have same cost and there are multiple bit groups need to be ORed together. Differential Revision: https://reviews.llvm.org/D25521 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284276 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 20:41:50 +00:00
Tom Stellard	18cea770a0	AMDGPU/SI: Use new SimplifyDemandedBits helper for multi-use operations Summary: We are using this helper for our 24-bit arithmetic combines, so we are now able to eliminate multi-use operations that mask the high-bits of 24-bit inputs (e.g. and x, 0xffffff) Reviewers: arsenm, nhaehnle Subscribers: tony-tye, arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24672 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284267 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 19:14:29 +00:00
David L Kreitzer	4475acba12	Add a pass to optimize patterns of vectorized interleaved memory accesses for X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen Differential revision: http://reviews.llvm.org/D24681 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284260 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 18:20:41 +00:00
Tom Stellard	a9c6165732	AMDGPU/SI: Don't allow unaligned scratch access Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284257 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 18:10:39 +00:00
Pierre Gousseau	4594395329	[X86] Take advantage of the lzcnt instruction on btver2 architectures when ORing comparisons to zero. This change adds transformations such as: zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0)))) To: srl(or(ctlz(x), ctlz(y)), log2(bitsize(x)) This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput. Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it. For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar. Differential Revision: https://reviews.llvm.org/D23446 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284248 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 16:41:38 +00:00
Sanjay Patel	66c0dee698	[DAG] add folds for negated shifted sign bit The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284239 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 14:26:47 +00:00
Sanjay Patel	7f74decc60	[x86] add tests to show missing folds for negated shifted sign bit git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284238 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 14:14:40 +00:00
Nicolai Haehnle	877e3beed6	AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822). Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25289 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284224 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 10:30:00 +00:00
Diana Picus	d1a990ddbb	[GlobalISel] Get the AArch64 tests to work on Linux Mostly this just means changing the triple from aarch64-apple-ios to the generic aarch64--. Only one test needs more significant changes, but GlobalISel already does the right thing so it's ok to just change the checks. Differential Revision: https://reviews.llvm.org/D25532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284223 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 10:19:40 +00:00
Craig Topper	cfa4f53d33	[DAGCombiner] Teach createBuildVecShuffle to handle cases where input vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284204 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 06:00:42 +00:00
Konstantin Zhuravlyov	a91924b28a	[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables Differential Revision: https://reviews.llvm.org/D25562 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284196 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 04:37:34 +00:00
Saleem Abdulrasool	6ec5391920	CodeGen: use MSVC division on windows itanium Windows itanium is identical to MSVC when dealing with everything but C++. Lower the math routines into msvcrt rather than compiler-rt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284175 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 23:00:11 +00:00
Saleem Abdulrasool	b0c1779a79	CodeGen: adjust floating point operations in Windows itanium Windows itanium is equivalent to MSVC except in C++ mode. Ensure that the promote the 32-bit floating point operations to their 64-bit equivalences. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284173 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 22:38:15 +00:00
Sriraman Tallam	fcfb682297	New llc option pie-copy-relocations to optimize access to extern globals. This option indicates copy relocations support is available from the linker when building as PIE and allows accesses to extern globals to avoid the GOT. Differential Revision: https://reviews.llvm.org/D24849 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284160 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 20:54:39 +00:00
Nirav Dave	080559c6d3	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r284151 which appears to be triggering a LTO failures on Hexagon git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284157 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 20:23:25 +00:00
Nirav Dave	19dc709f4b	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284151 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 19:20:16 +00:00
Igor Breger	d942282a2f	[X86][AVX512] Fix sext v32i1 -> v32i8 lowering. Fix PR30600. Differential Revision: https://reviews.llvm.org/D25554 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284134 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 17:20:38 +00:00
Reid Kleckner	dc99d00be5	Fix for PR30687. Avoid dereferencing MBB.end(). We don't need to return a MachineInstr* from these stack probe insertion calls anyway. If we ever need to add it back, we can return an iterator instead. Based on a patch by David Kreitzer This bug is a consequence of r279314 \| dexonsmith \| 2016-08-19 13:40:12 -0700 (Fri, 19 Aug 2016) \| 110 lines We hit the "Assertion `!NodePtr->isKnownSentinel()' failed" assertion, but only when inserting a stack probe call at the end of an MBB, which isn't necessarily a common situation. Differential Revision: https://reviews.llvm.org/D25566 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284130 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 15:48:48 +00:00
Javed Absar	172d6c078b	[ARM]: Assign cost of scaling used in addressing mode for ARM cores This patch assigns cost of the scaling used in addressing. On many ARM cores, a negated register offset takes longer than a non-negated register offset, in a register-offset addressing mode. For instance: LDR R0, [R1, R2 LSL #2] LDR R0, [R1, -R2 LSL #2] Above, (1) takes less cycles than (2). By assigning appropriate scaling factor cost, we enable the LLVM to make the right trade-offs in the optimization and code-selection phase. Differential Revision: http://reviews.llvm.org/D24857 Reviewers: jmolloy, rengolin git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284127 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 14:57:43 +00:00
Sanjay Patel	2061c51e0e	[x86] add negate-i1 run for 32-bit target git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284124 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 14:27:08 +00:00
Simon Pilgrim	97ca021c6f	[DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), Y) style combines git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284122 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 14:04:35 +00:00
Matt Arsenault	c31d80dbf5	AMDGPU: Assume spilling will occur at -O0 Because everything live is spilled at the end of a block by fast regalloc, assume this will happen and avoid the copies of the resource descriptor. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 13:10:00 +00:00
Simon Pilgrim	8517cfc997	Copy+pasts typo in comment describing combine test Repeated the "fold (mul x, 0) -> 0" instead of "fold (mul x, 1) -> x" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284118 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 12:54:32 +00:00
Simon Pilgrim	9dbdf67986	[DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284117 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 12:49:31 +00:00
Simon Pilgrim	00d9dddfae	[DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization Improves commutation potential git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284113 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 12:05:20 +00:00
Oren Ben Simhon	4b6c339e4a	[X86] Basic additions to support RegCall Calling Convention. The Register Calling Convention (RegCall) was introduced by Intel to optimize parameter transfer on function call. This calling convention ensures that as many values as possible are passed or returned in registers. This commit presents the basic additions to LLVM CodeGen in order to support RegCall in X86. Differential Revision: http://reviews.llvm.org/D25022 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284108 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 07:53:43 +00:00
Craig Topper	1c62059ce7	[AVX-512] Fix v16i32 zero extending shuffle test case so it's really zero extend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284106 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 05:41:01 +00:00
Craig Topper	6c06855a1a	[AVX-512] Teach shuffle lowering to recognize 512-bit zero extends. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284105 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 05:29:41 +00:00
Craig Topper	c42ce5a0ba	[AVX-512] Add tests for basic 512-bit zero extending shuffle patterns. Code will be improved in a future commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284104 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 05:29:37 +00:00
Quentin Colombet	10c508497f	[AArch64][RegisterBankInfo] Provide alternative mappings for 64-bit load This allows RegBankSelect in greedy mode to get rid some of the cross register bank copies when loads are involved in the chain of computation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284097 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 01:01:23 +00:00
Reid Kleckner	db2e3264a9	Correct PrivateLinkage for COFF - Use storage class C_STAT for 'PrivateLinkage' The storage class for PrivateLinkage should equal to the Internal Linkage. - Set 'PrivateGlobalPrefix' from "L" to ".L" for MM_WinCOFF (includes x86_64) MM_WinCOFF has empty GlobalPrefix '\0' so PrivateGlobalPrefix "L" may conflict to the normal symbol name starting with 'L'. Based on a patch by Han Sangjin! Manually updated test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284096 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 00:55:24 +00:00
Quentin Colombet	6837af0c59	[AArch64][RegisterBankInfo] Provide alternative mappings for G_BITCASTs. Thanks to this patch, RegBankSelect is able to get rid of some register bank copies as demonstrated in the test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284094 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 00:34:48 +00:00
Quentin Colombet	d8bc7a16c7	[AArch64][RegisterBankInfo] Use static mapping for same bank G_BITCAST. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284090 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 00:12:04 +00:00
Quentin Colombet	df3941c1ca	[AArch64][MachineLegalizer] Mark more G_BITCAST as legal. Basically any vector types that fits in a 32-bit register is also valid as far as copies are concerned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284089 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 00:12:01 +00:00
Albert Gutowski	c0fa9afe60	fix function label name in addressofreturnaddress test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284085 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 23:58:45 +00:00
Krzysztof Parzyszek	6dcccf4d12	Handle lane masks in LivePhysRegs when adding live-ins Differential Revision: https://reviews.llvm.org/D25533 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284076 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:53:41 +00:00
Tim Northover	5c4187e750	GlobalISel: support G_TRUNC selection on AArch64. Ahmed's patch again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284075 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:49:15 +00:00
Tim Northover	8394d5db03	GlobalISel: support int <-> float conversions on AArch64. More of Ahmed's work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284074 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:49:11 +00:00
Tim Northover	e205ee4f2c	GlobalISel: select G_FCMP instructions on AArch64. Another of Ahmed's patches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284073 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:49:07 +00:00
Tim Northover	7bd256df7f	GlobalISel: support selection of G_ICMP on AArch64. Patch from Ahmed Bougaca again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284072 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:49:04 +00:00
Tim Northover	8e4c0619c7	GlobalISel: select G_BRCOND instructions on AArch64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284071 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:49:01 +00:00
Tim Northover	110db83eac	GlobalISel: mark G_BRCOND on s1 as legal. It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284070 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:48:36 +00:00
Albert Gutowski	16bf208ba8	Create llvm.addressofreturnaddress intrinsic Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25293 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284061 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 22:13:19 +00:00
Krzysztof Parzyszek	b15c25855e	[MIRParser] Parse lane masks for register live-ins Differential Revision: https://reviews.llvm.org/D25530 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284052 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 21:06:45 +00:00

1 2 3 4 5 ...

18539 Commits