archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Diogo N. Sampaio	87ba52d9d8	[ARM] Add bitcast/extract_subvec. of fp16 vectors Summary: This patch adds some basic operations for fp16 vectors, such as bitcast from fp16 to i16, required to perform extract_subvector (also added here) and extract_element. Reviewers: SjoerdMeijer, DavidSpickett, t.p.northover, ostannard Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60618 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359433 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-29 10:28:07 +00:00
Diogo N. Sampaio	bb84648816	[ARM] Add v4f16 and v8f16 types to the CallingConv Summary: The Procedure Call Standard for the Arm Architecture states that float16x4_t and float16x8_t behave just as uint16x4_t and uint16x8_t for argument passing. This patch adds the fp16 vectors to the ARMCallingConv.td file. Reviewers: miyuki, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60720 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359431 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-29 10:10:37 +00:00
Nick Desaulniers	b3cb8ab451	[AsmPrinter] refactor to support %c w/ GlobalAddress' Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overriden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359337 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-26 18:45:04 +00:00
Diogo N. Sampaio	4204ce0fa7	[ARM][FIX] Add missing f16.lane.vldN/vstN lowering Summary: Add missing D and Q lane VLDSTLane lowering for fp16 elements. Reviewers: efriedma, kosarev, SjoerdMeijer, ostannard Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60874 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358962 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-23 09:36:39 +00:00
Nick Desaulniers	4fe807c808	[AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC Summary: None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine. Reviewers: peter.smith, echristo Reviewed By: echristo Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60803 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358603 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-17 18:22:48 +00:00
Sanjay Patel	ee018500ab	[ARM] tighten test checks; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358594 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-17 16:51:09 +00:00
Sanjay Patel	78395fb2fe	[ARM] make test checks more thorough; NFC This will change with the proposal in D60214. Unfortunately, the triple is not supported for auto-generation via script, and the multiple RUN lines have diffs on this test, but I can't tell exactly what is required by this test. PR7162 was an assert/crash, so hopefully, this is good enough. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358587 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-17 16:02:07 +00:00
Hans Wennborg	a2389c7d86	Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The original commit caused false positives from AddressSanitizer's use-after-scope checks, which have now been fixed in r358478. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358483 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-16 12:13:25 +00:00
Amara Emerson	040f61e117	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 `8033492` -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358369 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-15 05:04:20 +00:00
Sanjay Patel	e5a55cebc1	[DAGCombiner] narrow shuffle of concatenated vectors // shuffle (concat X, undef), (concat Y, undef), Mask --> // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1) The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements. The x86 changes look neutral or better. There's one test with an extra instruction, but that could be reversed for a subtarget with the right attributes. But by default, we want to avoid the 256-bit op when possible (in my motivating benchmark, a handful of ymm ops sprinkled into a sequence of xmm ops are triggering frequency throttling on Haswell resulting in significantly worse perf). Differential Revision: https://reviews.llvm.org/D60545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358291 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-12 16:31:56 +00:00
David Green	45a375eb6b	Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that. I will try to follow this up with some better tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358113 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 18:00:41 +00:00
Diogo N. Sampaio	ba3f83af6f	[ARM] [FIX] Add missing f16 vector operations lowering Summary: Add missing <8xhalf> shufflevectors pattern, when using concat_vector dag node. As well, allows <8xhalf> and <4xhalf> vldup1 operations. These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics. Reviewers: olista01, pbarrio, LukeGeeson, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60319 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358081 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 13:28:06 +00:00
Diana Picus	8c8976533c	[ARM GlobalISel] Select G_FCONSTANT for VFP3 Make it possible to TableGen code for FCONSTS and FCONSTD. We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively: * add GISelPredicateCode to check that the immediate fits in 8 bits; * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them. There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358063 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 09:14:32 +00:00
Diana Picus	d8ccf753e3	[ARM GlobalISel] Select G_FCONSTANT into pools Put all floating point constants into constant pools and load their values from there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358062 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 09:14:24 +00:00
Diana Picus	735eff80dd	[ARM GlobalISel] Map G_FCONSTANT git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358061 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 09:14:16 +00:00
Amara Emerson	e5bb56cb3c	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358032 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-09 21:22:33 +00:00
Simon Pilgrim	4cd136da59	[SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.). This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef........ Differential Revision: https://reviews.llvm.org/D60006 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357765 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 14:56:21 +00:00
Piotr Sobczak	959b42493f	[SelectionDAG] Compute known bits of CopyFromReg Summary: Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if the virtual reg used has one def only. This can be particularly useful when calling isBaseWithConstantOffset() with the ISD::CopyFromReg argument, as more optimizations may get enabled in the result. Also add a missing truncation on X86, found by testing of this patch. Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa Reviewers: bogner, craig.topper, RKSimon Reviewed By: RKSimon Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59535 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357745 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 07:44:09 +00:00
Diana Picus	80733cc0e9	[ARM GlobalISel] Support DBG_VALUE Make sure we can map and select DBG_VALUE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357681 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-04 10:24:51 +00:00
David L. Jones	a8e2ae4a63	Revert r357452 - 'SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)' This revision causes tests to fail under ASAN. Since the cause of the failures is not clear (could be ASAN, could be a Clang bug, could be a bug in this revision), the safest course of action seems to be to revert while investigating. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357667 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-04 02:27:57 +00:00
Sanjay Patel	e0e9078d54	[DAGCombiner] loosen restrictions for moving shuffles after vector binop There are 3 changes to make this correspond to the same transform in instcombine: 1. Remove the legality check - we can't create anything less legal than we started with. 2. Ease the use restriction, so we only bail out if both operands have >1 use. 3. Ease the use restriction for binops with a repeated operand (eg, mul x, x). As discussed in D60150, there's a scalarization opportunity that will be made easier by allowing this transform more generally. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357580 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-03 13:42:06 +00:00
Hans Wennborg	8abcd76492	SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The code was previously checking that candidates for sinking had exactly one use or were a store instruction (which can't have uses). This meant we could sink call instructions only if they had a use. That limitation seemed a bit arbitrary, so this patch changes it to "instruction has zero or one use" which seems more natural and removes the need to special-case stores. Differential revision: https://reviews.llvm.org/D59936 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357452 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-02 08:01:38 +00:00
Eli Friedman	95e6c7dcce	[ARM] Optimize expressions like "return x != 0;" for Thumb1. There's an existing optimization for x != C, but somehow it was missing a special case for 0. While I'm here, also cleaned up the code/comments a bit: the second value produced by the MERGE_VALUES was actually dead, since a CMOV only produces one result. Differential Revision: https://reviews.llvm.org/D59616 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357437 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-02 00:01:23 +00:00
Eli Friedman	271788f596	[ARM] Don't try to create "push {r12, lr}" in Thumb1 at -Oz. It's a little tricky to make this issue show up because prologue/epilogue emission normally likes to push at least two registers... but it doesn't when lr is force-spilled due to function length. Not sure if that really makes sense, but I decided not to touch it for now. Differential Revision: https://reviews.llvm.org/D59385 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357436 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-01 23:55:57 +00:00
Simon Pilgrim	7de2c6d90d	[ARM] Regenerate execute-only float comparison tests Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357293 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 18:21:19 +00:00
Nirav Dave	d3c5ebd041	[DAGCombine] Prune unnused nodes. Summary: Nodes that have no uses are eventually pruned when they are selected from the worklist. Record nodes newly added to the worklist or DAG and perform pruning after every combine attempt. Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight Reviewed By: jyknight Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58070 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357283 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 17:35:56 +00:00
Simon Pilgrim	2a7633ae4d	[ARM] Regenerate vector comparison tests Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357281 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 17:35:11 +00:00
Diana Picus	5dc7f24f6a	[ARM GlobalISel] Run regbankselect test for Thumb. NFCI This should just work, since ARM mode and Thumb2 mode are at the same level of support now and should map the same to GPR and FPR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357159 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-28 10:57:29 +00:00
Diana Picus	b94dc88e01	[ARM GlobalISel] Fix G_STORE with s1 G_STORE for 1-bit values uses a STRBi12, which stores the whole byte. Zero out the undefined bits before writing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357154 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-28 09:09:36 +00:00
Diana Picus	51a43ae6d5	[ARM GlobalISel] Fix selection of G_SELECT G_SELECT uses a 1-bit scalar for the condition, and is currently implemented with a plain CMPri against 0. This means that values such as 0x1110 are interpreted as true, when instead the higher bits should be treated as undefined and therefore ignored. Replace the CMPri with a TSTri against 0x1, which performs an implicit AND, yielding the expected result. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357153 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-28 09:09:27 +00:00
Nirav Dave	b4adfc21eb	Revert r356996 "[DAG] Avoid smart constructor-based dangling nodes." This patch appears to trigger very large compile time increases in halide builds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357116 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 19:54:41 +00:00
Eli Friedman	953adb2fe4	[ARM] Don't confuse the scheduler for very large VLDMDIA etc. ARMBaseInstrInfo::getNumLDMAddresses is making bad assumptions about the memory operands of load and store-multiple operations. This doesn't really fix the problem properly, but it's enough to prevent crashing, at least. Fixes https://bugs.llvm.org/show_bug.cgi?id=41231 . Differential Revision: https://reviews.llvm.org/D59834 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357109 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:33:30 +00:00
Nirav Dave	de6ac6d211	[DAG] Avoid smart constructor-based dangling nodes. Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations or not fully pruning unused result values. This can result in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter as the current node deleter to make sure such nodes have the chance of being pruned. Many minor changes, mostly positive. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356996 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-26 15:08:14 +00:00
Diana Picus	fa60fe0a8f	[ARM GlobalISel] 64-bit memops should be aligned We currently use only VLDR/VSTR for all 64-bit loads/stores, so the memory operands must be word-aligned. Mark aligned operations as legal and narrow non-aligned ones to 32 bits. While we're here, also mark non-power-of-2 loads/stores as unsupported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356872 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-25 08:54:29 +00:00
Eli Friedman	925f59138d	[ARM] Don't form "ands" when it isn't scheduled correctly. In r322972/r323136, the iteration here was changed to catch cases at the beginning of a basic block... but we accidentally deleted an important safety check. Restore that check to the way it was. Fixes https://bugs.llvm.org/show_bug.cgi?id=41116 Differential Revision: https://reviews.llvm.org/D59680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356809 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-22 20:49:15 +00:00
Evandro Menezes	f00c21b122	[AArch64, ARM] Add support for Exynos M5 Add Exynos M5 support and test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356793 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-22 18:42:14 +00:00
Eli Friedman	283e227ba6	[ARM] Eliminate redundant "mov rN, sp" instructions in Thumb1. This takes sequences like "mov r4, sp; str r0, [r4]", and optimizes them to something like "str r0, [sp]". For regular stack variables, this optimization was already implemented: we lower loads and stores using frame indexes, which are expanded later. However, when constructing a call frame for a call with more than four arguments, the existing optimization doesn't apply. We need to use stores which are actually relative to the current value of sp, and don't have an associated frame index. This patch adds a special case to handle that construct. At the DAG level, this is an ISD::STORE where the address is a CopyFromReg from SP (plus a small constant offset). This applies only to Thumb1: in Thumb2 or ARM mode, a regular store instruction can access SP directly, so the COPY gets eliminated by existing code. The change to ARMDAGToDAGISel::SelectThumbAddrModeSP is a related cleanup: we shouldn't pretend that it can select anything other than frame indexes. Differential Revision: https://reviews.llvm.org/D59568 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356601 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-20 19:40:45 +00:00
Matt Arsenault	51c2ad77cd	RegAllocFast: Remove early selection loop, the spill calculation will report cost 0 anyway for free regs The 2nd loop calculates spill costs but reports free registers as cost 0 anyway, so there is little benefit from having a separate early loop. Surprisingly this is not NFC, as many register are marked regDisabled so the first loop often picks up later registers unnecessarily instead of the first one available in the allocation order... Patch by Matthias Braun git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356499 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-19 19:01:34 +00:00
Eli Friedman	70a59574dd	[ARM] Add MachineVerifier logic for some Thumb1 instructions. tMOVr and tPUSH/tPOP/tPOP_RET have register constraints which can't be expressed in TableGen, so check them explicitly. I've unfortunately run into issues with both of these recently; hopefully this saves some time for someone else in the future. Differential Revision: https://reviews.llvm.org/D59383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356303 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-15 21:44:49 +00:00
Sam Parker	7eaac1c18c	[ARM] Remove EarlyCSE from backend There is an issue with early CSE hitting an assert, so temporarily remove the pass from the Arm backend. Bug: https://bugs.llvm.org/show_bug.cgi?id=41081 Differential Revision: https://reviews.llvm.org/D59410 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356259 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-15 13:36:37 +00:00
Simon Pilgrim	13f8e3c482	[ARM] Remove icmp undef from reduced tests Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @efriedma (Eli Friedman) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356252 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-15 11:14:59 +00:00
Sam Parker	42dcf56122	[ARM][ParallelDSP] Disable for big-endian Bail early when we don't have a preheader and also if the target is big endian because it's written with only little endian in mind! Differential Revision: https://reviews.llvm.org/D59368 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356243 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-15 10:19:32 +00:00
Sam Parker	acbed856d2	[NFC][ARM] Update test Change some regex to handle commutable instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356159 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-14 15:36:54 +00:00
Matt Arsenault	496c1dd07c	ARM: Add ImmArg to intrinsics I found these by asserting in clang for any GCCBuiltin that doesn't require mangling and requires a constant for the builtin. This means that intrinsics are missing which don't use GCCBuiltin, don't have builtins defined in clang, or were missing the constant annotation in the builtin definition. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356144 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-14 13:46:14 +00:00
Sam Parker	571398105e	[ARM][ParallelDSP] Enable multiple uses of loads When choosing whether a pair of loads can be combined into a single wide load, we check that the load only has a sext user and that sext also only has one user. But this can prevent the transformation in the cases when parallel macs use the same loaded data multiple times. To enable this, we need to fix up any other uses after creating the wide load: generating a trunc and a shift + trunc pair to recreate the narrow values. We also need to keep a record of which loads have already been widened. Differential Revision: https://reviews.llvm.org/D59215 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356132 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-14 11:14:13 +00:00
Sam Parker	01f20a4ee2	[ARM] Run ARMParallelDSP in the IRPasses phase Run EarlyCSE before ParallelDSP and do this in the backend IR opt phase. Differential Revision: https://reviews.llvm.org/D59257 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356130 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-14 10:57:40 +00:00
Nirav Dave	154874adc5	[DAGCombiner] If a TokenFactor would be merged into its user, consider the user later. Summary: A number of optimizations are inhibited by single-use TokenFactors not being merged into the TokenFactor using it. This makes we consider if we can do the merge immediately. Most tests changes here are due to the change in visitation causing minor reorderings and associated reassociation of paired memory operations. CodeGen tests with non-reordering changes: X86/aligned-variadic.ll -- memory-based add folded into stored leaq value. X86/constant-combiners.ll -- Optimizes out overlap between stores. X86/pr40631_deadstore_elision -- folds constant byte store into preceding quad word constant store. Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet Reviewed By: courbet Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59260 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356068 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-13 17:07:09 +00:00
Nikita Popov	f7a652c414	[SDAG] Expand pow2 mulo using shifts Expand MULO with constant power of two operand into a shift. The overflow is checked with (x << shift) >> shift == x, where the right shift will be logical for umulo and arithmetic for smulo (with exception for multiplications by signed_min). Differential Revision: https://reviews.llvm.org/D59041 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355937 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-12 16:57:25 +00:00
Sam Parker	88f0c562f5	[ARM][NFC] Delete original smlad tests Because I don't understand svn. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355908 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-12 11:06:15 +00:00
Sam Parker	a7682f4fed	[ARM][NFC] Move smlad tests Created a test/CodeGen/ARM/ParallelDSP folder. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355907 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-12 11:01:11 +00:00

1 2 3 4 5 ...

3787 Commits