RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-02-14 07:29:01 +00:00

Author	SHA1	Message	Date
Jason Liu	ea8ee651a9	Implement call lowering without parameters on AIX Summary: This patch implements call lowering for calls without parameters on AIX as initial support. Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma Differential Revision: https://reviews.llvm.org/D61948 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361669 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 20:54:35 +00:00
Jessica Paquette	161c305315	[GlobalISel][AArch64] Improve register bank mappings for G_SELECT The fcsel and csel instructions differ in only the register banks they work on. So, they're entirely interchangeable otherwise. With this in mind, this does two things: - Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as the outputs. - Teach it to choose the best register bank mapping based off the constraints of the inputs and outputs. The "best" in this case means the one that requires the smallest number of copies to properly emit a fcsel/csel. For example, if the inputs are all already going to be on FPRs, we should emit a fcsel, even if the output is a GPR. This costs one copy to produce the result, but saves us from copying the inputs into GPRs. Also update the regbank-select.mir to check that we end up with the right select instruction. Differential Revision: https://reviews.llvm.org/D62267 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361665 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 19:35:25 +00:00
Nick Desaulniers	5d5f1f607e	[AArch64] check for INLINEASM_BR along w/ INLINEASM Summary: It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Reviewers: t.p.northover, peter.smith Reviewed By: peter.smith Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62402 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361661 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 19:00:13 +00:00
Nick Desaulniers	d965de356d	[ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM Summary: We were observing failures for arm32 allyesconfigs of the Linux kernel with the asm goto Clang patch, where ldr's were being generated to offsets too far away to encode in imm12. It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Link: https://github.com/ClangBuiltLinux/linux/issues/490 Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover Reviewed By: peter.smith Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62400 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361659 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 18:58:21 +00:00
Matt Arsenault	332260473f	AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spills If some lanes weren't active on entry to the function, this could clobber their VGPR values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361655 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 18:18:51 +00:00
Matt Arsenault	bbaa274fa9	AMDGPU: Boost inline threshold with addrspacecasted alloca arguments This was skipping GetUnderlyingObject for nonprivate addresses, but an alloca could also be found through an addrspacecast if it's flat. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361649 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 16:52:35 +00:00
Alexander Timofeev	6a29119c95	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361644 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 15:32:18 +00:00
Stefan Pintilie	c60409f121	[PowerPC] Remove CRBits Copy Of Unset/set CBit For the situation where we generate the following code: crxor 8, 8, 8 < Some instructions> .LBB0_1: < Some instructions> cror 1, 8, 8 the cror (COPY of CRbit) depends on the result of the crxor instruction. CR8 is known to be zero as crxor is equivalent to CRUNSET. We can simply use crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on any previous instruction. This patch will optimize it to: < Some instructions> .LBB0_1: < Some instructions> crxor 1, 1, 1 Patch By: Victor Huang (NeHuang) Differential Revision: https://reviews.llvm.org/D62044 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361632 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 12:05:37 +00:00
Cullen Rhodes	35c5c80df4	[AArch64][SVE2] Asm: support SVE2 String Processing Group Summary: Patch adds support for the SVE2 character match instructions MATCH and NMATCH. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62206 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361627 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 10:32:01 +00:00
Cullen Rhodes	d5314754c6	[AArch64][SVE2] Asm: support SVE2 Narrowing Group Summary: Patch adds support for the following instructions: SVE2 bitwise shift right narrow: * SQSHRUNB, SQSHRUNT, SQRSHRUNB, SQRSHRUNT, SHRNB, SHRNT, RSHRNB, RSHRNT, SQSHRNB, SQSHRNT, SQRSHRNB, SQRSHRNT, UQSHRNB, UQSHRNT, UQRSHRNB, UQRSHRNT SVE2 integer add/subtract narrow high part: * ADDHNB, ADDHNT, RADDHNB, RADDHNT, SUBHNB, SUBHNT, RSUBHNB, RSUBHNT SVE2 saturating extract narrow: * SQXTNB, SQXTNT, UQXTNB, UQXTNT, SQXTUNB, SQXTUNT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62205 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361624 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 10:22:30 +00:00
Cullen Rhodes	7f2c0b7855	[AArch64][SVE2] Asm: support SVE2 Accumulate Group Summary: Patch adds support for the following instructions: SVE2 bitwise shift and insert: * SRI, SLI SVE2 bitwise shift right and accumulate: * SSRA, USRA, SRSRA, URSRA SVE2 complex integer add: * CADD, SQCADD SVE2 integer absolute difference and accumulate: * SABA, UABA SVE2 integer absolute difference and accumulate long: * SABALB, SABALT, UABALB, UABALT SVE2 integer add/subtract long with carry: * ADCLB, ADCLT, SBCLB, SBCLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62204 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361622 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 10:10:34 +00:00
Simon Pilgrim	128852a784	[SelectionDAG] computeKnownBits - support constant pool values from target This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets. computeKnownBits then uses this function to improve codegen, notably vector code after legalization. A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit. This required a couple of fixes: * SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR). * Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets. Differential Revision: https://reviews.llvm.org/D61887 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361620 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 10:03:11 +00:00
Cullen Rhodes	85b221a6c2	[AArch64][SVE2] Asm: add PMULLB/PMULLT instructions Summary: This patch adds support for the polynomial multiplication instructions PMULLB/PMULLT. The 64-bit source and 128-bit destination element variants are enabled with crypto extensions (+sve2-aes), similar to the NEON PMULL2 instruction. All other variants are enabled with +sve2. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62145 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361619 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 09:56:23 +00:00
Cullen Rhodes	28f2dc5070	[AArch64][SVE2] Asm: add integer add/sub long/wide instructions Summary: Patch adds support for the following instructions: SVE2 integer add/subtract long: * SADDLB, SADDLT, UADDLB, UADDLT, SSUBLB, SSUBLT, USUBLB, USUBLT, SABDLB, SABDLT, UABDLB, UABDLT SVE2 integer add/subtract wide: * SADDWB, SADDWT, UADDWB, UADDWT, SSUBWB, SSUBWT, USUBWB, USUBWT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62142 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361615 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 09:28:27 +00:00
Bjorn Pettersson	d7d580485b	Use the DataLayout::typeSizeEqualsStoreSize helper. NFC Just a minor refactoring to use the new helper method DataLayout::typeSizeEqualsStoreSize(). This is done when checking if getTypeSizeInBits is equal/non-equal to getTypeStoreSizeInBits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361613 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 09:20:20 +00:00
Cullen Rhodes	d0eae6cedf	[AArch64][SVE2] Asm: add various bitwise shift instructions Summary: This patch adds support for the SVE2 saturating/rounding bitwise shift left (predicated) group of instructions: * SRSHL, URSHL, SRSHLR, URSHLR, SQSHL, UQSHL, SQRSHL, UQRSHL, SQSHLR, UQSHLR, SQRSHLR, UQRSHLR Immediate forms of the SQSHL and UQSHL instructions are also added to the existing SVE bitwise shift by immediate (predicated) group, as well as three new instructions SRSHR/URSHR/SQSHLU. The new instructions in this group are encoded similarly and are implemented using the same TableGen class with a minimal change (1 bit in encoding). The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62140 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361612 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 09:17:23 +00:00
Cullen Rhodes	942e9a5501	[AArch64][SVE2] Asm: add saturating add/sub instructions Summary: Patch adds support for the following instructions: * SQADD, UQADD, SUQADD, USQADD * SQSUB, UQSUB, SQSUBR, UQSUBR The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361611 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 09:06:37 +00:00
Neil Henning	5dde296faa	StructurizeCFG: Relax uniformity checks. This change relaxes the checks for hasOnlyUniformBranches such that our region is uniform if: 1. All conditional branches that are direct children are uniform. 2. And either: a. All sub-regions are uniform. b. There is one or fewer conditional branches among the direct children. Differential Revision: https://reviews.llvm.org/D62198 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361610 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:59:17 +00:00
Cullen Rhodes	6a95aa40a9	[AArch64][SVE2] Asm: fix overlapping bit Summary: Bit 20 in sve2_int_arith_pred TableGen class was overlapping. The encodings are not affected as bit 20 is defined by the opc bits and this was overwriting the earlier error of setting bit 20 to 0. Raised by Momchil: https://reviews.llvm.org/D62130 Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62292 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361609 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:45:37 +00:00
Tim Northover	63ef5c068b	GlobalISel: support swifterror attribute on AArch64. swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361608 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:40:13 +00:00
Tim Northover	14ed588ce0	CodeGen: factor out swifterror value tracking. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361607 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:39:43 +00:00
Simon Atanasyan	81f980568b	[mips] Always check that `shift and add` optimization is efficient. D45316 introduced the `shouldTransformMulToShiftsAddsSubs` function to check that breaking down constant multiplications into a series of shifts, adds, and subs is efficient. Unfortunately, this function does not check the maximum number of steps on all paths of the algorithm. This patch fixes this bug. Fix for PR41929. Differential Revision: https://reviews.llvm.org/D62166 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361606 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:39:40 +00:00
Bjorn Pettersson	033d04ba3c	[DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores Summary: The DeadStoreElimination pass now skips PartialStoreMerging when stores overlap according to OW_PartialEarlierWithFullLater and at least one of the stores has a store size that differs from the size of the type being stored. This solves problems seen in https://bugs.llvm.org/show_bug.cgi?id=41949, which in the past could lead to miscompiles or assertions. The content and location of the padding bits are not formally described (or are undefined) in the LangRef at the moment. So the solution is based on the premise that we cannot assume anything about the padding bits when a store clobbers more memory than indicated by the type of the value that is stored (such as storing an i6 using an 8-bit store instruction). Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949 Reviewers: spatel, efriedma, fhahn Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62250 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361605 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:32:02 +00:00
Sjoerd Meijer	c9505f181c	[ARM] ARMExpandPseudoInsts: add debug messages This pass wasn't printing any messages at all, which I find really inconvenient while debugging/tracing things. It now dumps the before and after of expanded instructions. It doesn't do this yet for all instructions, but this is a good start I guess. Differential Revision: https://reviews.llvm.org/D62297 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361604 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 08:25:02 +00:00
QingShan Zhang	f6b47d2a4f	[Power9] Add a specific heuristic to schedule the addi before the load When we are scheduling the load and addi, if no other heuristic took effect, we will try to schedule the addi before the load, to hide the latency and avoid the true dependency added by RA. This only takes effect for Power9. Differential Revision: https://reviews.llvm.org/D61930 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361600 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 05:30:09 +00:00
Yevgeny Rouban	277c55c7bc	[NFC] SwitchInst: Introduce wrapper for prof branch_weights handling This patch introduces a wrapper class that re-implements several mutator methods of SwitchInst to handle changes of prof branch_weights metadata along with remove/add switch case methods. Subsequent patches will use this wrapper to implement prof branch_weights metadata handling for SwitchInst. Reviewers: davidx, eraman, reames, chandlerc Reviewed By: davidx Differential Revision: https://reviews.llvm.org/D62122 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361596 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 04:34:23 +00:00
David Blaikie	ee1b58a962	dwarfdump: Deterministically... determine whether parsing a DWARF32 or DWARF64 str_offsets header Rather than trying one and then the other - use the kind of the CU to select which kind of header to parse. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361589 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 01:41:58 +00:00
Reid Kleckner	85ea819c59	[AArch64] Preserve X8 for thunks ending in variadic musttail calls Summary: On Windows, X8 may be used to pass in the address of an aggregate that is returned indirectly. Therefore, it should be forwarded to variadic musttail calls and preserved in thunks. Fixes PR41997 Reviewers: mgrang, efriedma Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62344 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361585 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 01:27:20 +00:00
Serge Pavlov	c2bb63e2a3	[AArch64] Add nvcast patterns for v2f32 -> v1f64 Summary: Constant stores of f32 values can create such NvCast nodes. Reviewers: t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62285 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361584 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 01:20:34 +00:00
David Blaikie	deeb5d51e3	dwarfdump: Add a bit more DWARF64 support This test case was incorrect because it mixed DWARF32 and DWARF64 for a single unit (DWARF32 unit referencing a DWARF64 str_offsets section). So fix enough of the unit parsing for DWARF64 and make the test valid. (not sure if anyone needs DWARF64 support though - support in libDebugInfoDWARF has been added piecemeal and LLVM doesn't produce it at all) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361582 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 01:05:52 +00:00
Eli Friedman	5842fc763e	Revert r361460 It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361581 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 01:03:51 +00:00
Thomas Lively	608234d380	[WebAssembly] Expand more SIMD float ops Summary: These were previously causing ISel failures. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62354 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361577 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 00:15:04 +00:00
Sanjay Patel	4888407568	[InstSimplify] fold insertelement-of-extractelement This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361576 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 00:13:58 +00:00
Sanjay Patel	3cce33d6aa	[InstCombine] remove redundant fold for extractelement; NFC The out-of-bounds index pattern is handled by InstSimplify, so the extractelement should be eliminated next time it is visited. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361570 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 23:33:38 +00:00
Sanjay Patel	71b5f9d74a	[InstCombine] remove redundant fold for insertelement; NFC The out-of-bounds index pattern is handled by InstSimplify. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361569 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 23:33:34 +00:00
Alina Sbirlea	08cb19517b	[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC]. Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61612 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361560 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 21:52:59 +00:00
Sanjay Patel	8dc3a075f3	[InstSimplify] insertelement V, undef, ? --> V This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361559 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 21:49:47 +00:00
Kit Barton	6b7198af28	Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch. This reverts r361517 (git commit 2049e4dd8f61100f88f14db33bd95d197bcbfbbc) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361553 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 20:53:05 +00:00
Sanjay Patel	52cdffd499	[DAGCombiner] make folds of binops safe for opcodes that produce >1 value This is no-functional-change-intended currently because the definition of isBinOp() only includes opcodes that produce 1 value. But if we share that implementation with isCommutativeBinOp() as proposed in D62191, then we need to make sure that the callers bail out for opcodes that they are not prepared to handle correctly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361547 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 20:17:25 +00:00
Matt Arsenault	32a10ab5dd	AMDGPU: Correct maximum possible private allocation size We were assuming a much larger possible per-wave visible stack allocation than is possible: `faa3ae5138/src/core/runtime/amd_gpu_agent.cpp (L70)` Based on this, we can assume the high 15 bits of a frame index or sret are 0. The frame index value is the per-lane offset, so the maximum frame index value is MAX_WAVE_SCRATCH / wavesize. Remove the corresponding subtarget feature and option that made this configurable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361541 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 19:38:14 +00:00
Alina Sbirlea	89828fde7f	[NewPassManager] Add tuning option: LoopUnrolling [NFC]. Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: jlebar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61618 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361540 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 19:35:40 +00:00
Alina Sbirlea	a437b5d09b	[SLPVectorizer] Set flag to previous default. Summary: The refactoring in r360276 moved the `RunSLPVectorization` flag and added the default explicitly. The default should have been `false`, as before. The new pass manager used to have SLPVectorization on by default, now it's off in opt, and needs D61617 checked in to enable it in clang. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61955 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361537 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 19:07:41 +00:00
Sanjay Patel	a7df0fa47f	[InstCombine] be more careful when transforming a shuffle mask This is reduced from a fuzzer test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890 Usually, demanded elements should be able to simplify shuffle mask elements that are pointing to undef elements of its source operands, but that doesn't happen in the test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361533 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 18:46:03 +00:00
Robert Lougher	3891f7294d	Resubmit r360436 "[X86] Avoid SFB - Fix inconsistent codegen with/without debug info" Fixes https://bugs.llvm.org/show_bug.cgi?id=40969 The functions findPotentiallyBlockedCopies and buildCopy are currently not accounting for the presence of debug instructions. In the former this results in the optimization not being triggered, and in the latter results in inconsistent codegen. This patch enables the optimization to be performed in a debug build and ensures the codegen is consistent with non-debug builds. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361527 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 18:15:12 +00:00
Thomas Lively	a94cd9166b	[WebAssembly] Implement ReplaceNodeResults to fix a SIMD crash Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61037 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361526 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 18:09:26 +00:00
Matt Arsenault	3dd7804824	AMDGPU/GlobalISel: Legality for integer min/max git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361519 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 17:58:48 +00:00
Kit Barton	b13ba2bf8c	[LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch. Summary: This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exist passes that try to obtain the loop induction variable on their own, e.g. loop interchange. It would be useful to have a common area to get this information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getGuard --> if (guardcmp) goto loop; else goto afterloop /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Committed on behalf of @Whitney (Whitney Tsang). Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60565 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361517 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 17:56:35 +00:00
Thomas Lively	dd5b336ef6	[WebAssembly] Add multivalue and tail-call target features Summary: These features will both be implemented soon, so I thought I would save time by adding the boilerplate for both of them at the same time. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D62047 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361516 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 17:26:47 +00:00
Thomas Preud'homme	5e1efa61df	[FileCheck] Remove llvm:: prefix Summary: Remove all llvm:: prefixes in FileCheck library header and implementation except for calls to make_unique and make_shared since both files already use the llvm namespace. Reviewers: jhenderson, jdenny, probinson, arichardson Subscribers: hiraditya, arichardson, probinson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62323 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361515 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 17:19:36 +00:00
Saleem Abdulrasool	b1cc82a249	Transforms: lower fadd and fsub atomicrmw instructions `fadd` and `fsub` have recently (r351850) been added as `atomicrmw` operations. This diff adds lowering cases for them to the LowerAtomic transform. Patch by Josh Berdine! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361512 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-23 17:03:43 +00:00
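
The relaxed uniformity rule in the StructurizeCFG commit listed above (5dde296faa) can be modeled as a short predicate. This is a minimal Python sketch of the rule as the commit message words it, not LLVM's C++ implementation: the `Region` class, its fields, and `is_uniform` are hypothetical names, and the sketch assumes that "all sub-regions are uniform" means applying the same rule recursively.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    # Hypothetical model: one flag per conditional branch that is a direct
    # child of this region; True means the branch is uniform.
    branch_is_uniform: List[bool] = field(default_factory=list)
    # Hypothetical model: the regions nested directly inside this one.
    subregions: List["Region"] = field(default_factory=list)


def is_uniform(region: Region) -> bool:
    """Return True if the region passes the relaxed uniformity rule."""
    # Condition 1: every conditional branch among the direct children
    # must be uniform.
    if not all(region.branch_is_uniform):
        return False
    # Condition 2b: one or fewer conditional branches among the direct
    # children is enough, regardless of the sub-regions.
    if len(region.branch_is_uniform) <= 1:
        return True
    # Condition 2a: otherwise all sub-regions must be uniform
    # (assumed here to mean the same rule, applied recursively).
    return all(is_uniform(sub) for sub in region.subregions)
```

In this model, a region with a single uniform conditional branch is accepted even when it contains a non-uniform sub-region, while a region with two uniform branches is accepted only when every sub-region also passes.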

123142 Commits