archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Amara Emerson	f69362835a	Re-commit r302678, fixing PR33053. The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303211 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 21:29:22 +00:00
Tim Shen	50ecf9b407	[PPC] Add -ppc-asm-full-reg-names to atomic-2.ll. NFC. Differential Revisions: https://reviews.llvm.org/D32763 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303209 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 20:58:55 +00:00
Tim Shen	1a2e7acb99	[PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync. Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303205 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 20:18:06 +00:00
Reid Kleckner	c4cdac05ab	Revert "[X86] Replace slow LEA instructions in X86" This reverts commit r303183, it broke various buildbots and introduced sanitizer errors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303199 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 19:55:03 +00:00
Nirav Dave	acc2c1d71d	Elide stores which are overwritten without being observed. Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This is causes small improvements dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303198 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 19:43:56 +00:00
Renato Golin	15886e1270	Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove" Revert "[ARM] Mark LEApcrel as not having side effects" This reverts commit r303054 and r303053, as they broke the ARM self-hosting buildbots: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845 Offline investigation on course. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303193 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 17:59:07 +00:00
Lama Saba	371c802090	[X86] Replace slow LEA instructions in X86 According to Intel's Optimization Reference Manual for SNB+: " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1: - LEA that has all three source operands: base, index, and offset - LEA that uses base and index registers where the base is EBP, RBP,or R13 - LEA that uses RIP relative addressing mode - LEA that uses 16-bit addressing mode " This patch currently handles the first 2 cases only. Differential Revision: https://reviews.llvm.org/D32277 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303183 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 16:01:36 +00:00
Igor Breger	6ab1190a77	[GlobalISel][X86] Split memop test file. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303169 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 13:37:31 +00:00
Francis Visoiu Mistrih	cc8486f611	[ShrinkWrapping] Handle restores on no-return paths Shrink-wrapping uses post-dominators to find a restore point that post-dominates all the uses of CSR / stack. The way dominator trees are modeled in LLVM today is that unreachable blocks are not present in a generic dominator tree, so, an unreachable node is dominated by anything: include/llvm/Support/GenericDomTree.h:467. Since for post-dominators, a no-return block is considered "unreachable", calling findNearestCommonDominator on an unreachable node A and a non-unreachable node B, will return B, which can be false. If we find such node, we bail out since there is no good restore point available. rdar://problem/30186931 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303130 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 23:13:35 +00:00
Tim Northover	87dff1dfcd	AArch64: use linker-private symbols for globals in MachO. We don't use section-relative relocations on AArch64, so all symbols must be at least visible to the linker (i.e. properly global or l_whatever, but not L_whatever). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303118 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 21:51:38 +00:00
Hans Wennborg	05f671ecfa	Revert r302678 "[AArch64] Enable use of reduction intrinsics." This caused PR33053. Original commit message: > The new experimental reduction intrinsics can now be used, so I'm enabling this > for AArch64. We will need this for SVE anyway, so it makes sense to do this for > NEON reductions as well. > > The existing code to match shufflevector patterns are replaced with a direct > lowering of the reductions to AArch64-specific nodes. Tests updated with the > new, simpler, representation. > > Differential Revision: https://reviews.llvm.org/D32247 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303115 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 20:59:32 +00:00
Kyle Butt	e6202480d9	CodeGen: BlockPlacement: Increase tail duplication size for O3. At O3 we are more willing to increase size if we believe it will improve performance. The current threshold for tail-duplication of 2 instructions is conservative, and can be relaxed at O3. Benchmark results: llvm test-suite: 6% improvement in aha, due to duplication of loop latch 3% improvement in hexxagon 2% slowdown in lpbench. Seems related, but couldn't completely diagnose. Internal google benchmark: Produces 4% improvement on internal google protocol buffer serialization benchmarks. Differential-Revision: https://reviews.llvm.org/D32324 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303084 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 17:30:47 +00:00
Simon Pilgrim	2223371da5	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) Follow up to D33147 NVPTXTargetLowering::LowerCall was trusting the default argument values. Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33189 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303082 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 17:17:44 +00:00
Florian Hahn	eb48e7d58f	[AArch64] Enable FeatureFuseAES on Cortex-A72. This patch enables fusing dependent AESE/AESMC and AESD/AESIMC instruction pairs on Cortex-A72, as recommended in the Software Optimization Guide, section 4.10. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303073 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 15:15:22 +00:00
Dmitry Preobrazhensky	232c3d52ea	[AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64 See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33123 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303070 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 14:28:23 +00:00
Dinar Temirbulatov	8632b74d7d	Test commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303059 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 13:14:04 +00:00
John Brawn	57bb7925b5	[ARM] Mark LEApcrel instructions as isAsCheapAsAMove Doing this means that if an LEApcrel is used in two places we will rematerialize instead of generating two MOVs. This is particularly useful for printfs using the same format string, where we want to generate an address into a register that's going to get corrupted by the call. Differential Revision: https://reviews.llvm.org/D32858 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303054 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 11:57:54 +00:00
John Brawn	91719efd8f	[ARM] Mark LEApcrel as not having side effects Doing this lets us hoist it out of loops, and I've also marked it as rematerializable the same as the thumb1 and thumb2 counterparts. It looks like it being marked as such was just a mistake, as the commit that made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the LEApcrelJT instructions were marked as having side-effects, so it looks like the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was accidentally marked as such also. Differential Revision: https://reviews.llvm.org/D32857 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303053 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 11:50:21 +00:00
Ayman Musa	eadb58fda7	[X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option. Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional). CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0). Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation. Differential Revision: https://reviews.llvm.org/D32487 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303050 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 11:30:54 +00:00
Igor Breger	4448b5e925	[GlobalISel][X86] G_BR instruction select test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303036 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 07:03:38 +00:00
Craig Topper	ac7c8ceef7	[X86] Add avx512vl command lines to the 128/256-bit vector-lzcnt tests so we can see what compare instructions are being used in the lookup table code. I noticed the 512-bit lzcnts don't use the X86 specific lookup table code and instead use the EXPAND case in LegalizeDAG. I was toying around with fixing this and noticed it would require compare instructions that generate i1 masks and then converting from mask to vector. Then I noticed that we don't test which compares are used with avx512vl and no avx512cd. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303020 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-14 19:38:11 +00:00
Craig Topper	eba544c71c	[X86] Cleanup some of the check-prefixes in the vector-lzcnt tests. Remove an unneeded prefix from the 32-bit command line. Make all the 64-bit triples match. Replace ALL with X64 and remove it from the 32-bit test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303019 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-14 19:38:09 +00:00
Simon Pilgrim	8390fe6ccf	[X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant splats for vXi64 shifts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303009 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-14 11:46:26 +00:00
Simon Pilgrim	dc5d066e45	[X86][AVX] Add additional 32-bit target vector shift tests Shows issue with 32-bits not being able to peek through subvectors to extract constant splats git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303008 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-14 11:13:03 +00:00
Simon Pilgrim	72a3a14d8b	[SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts in ComputeNumSignBits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302997 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 22:10:58 +00:00
Simon Pilgrim	e7198dbba5	[X86][SSE] Test showing missing EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts support in ComputeNumSignBits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302994 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 21:50:18 +00:00
Simon Pilgrim	bacfc66c2e	[SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302993 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 19:57:10 +00:00
Simon Pilgrim	ba64adafb1	[X86][SSE] Test showing inability of ComputeNumSignBits to resolve shuffles git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302992 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 17:41:07 +00:00
Simon Pilgrim	3c8af5480a	[x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization) Further perf tests on Jaguar indicate that: vxorps %ymm0, %ymm0, %ymm0 vcmpps $15, %ymm0, %ymm0, %ymm0 is consistently faster (by about 9%) than: vpcmpeqd %xmm0, %xmm0, %xmm0 vinsertf128 $1, %xmm0, %ymm0, %ymm0 Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well. Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D32416 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302989 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 13:42:35 +00:00
Dylan McKay	af4bf77761	[AVR] When lowering Select8/Select16, put newly generated MBBs in the same spot Contributed by Dr. Gergő Érdi. Fixes a bug. Raised from (https://github.com/avr-rust/rust/issues/49). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302973 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 00:22:34 +00:00
Sanjay Patel	62352f820d	[x86] add vector tests for demanded bits; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302949 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 20:53:48 +00:00
Changpeng Fang	ed4c8077b0	AMDGPU/SI: Don't promote to vector if the load/store is volatile. Summary: We should not change volatile loads/stores in promoting alloca to vector. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D33107 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302943 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 20:31:12 +00:00
Simon Pilgrim	4633bbb4c7	[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) This fixes 47 of the 75 NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33147 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302942 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 19:56:43 +00:00
Dehao Chen	0faf9ed31e	Add LiveRangeShrink pass to shrink live range within BB. Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB. Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb Reviewed By: MatzeB, andreadb Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D32563 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302938 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 19:29:27 +00:00
Tom Stellard	593d52aaad	AMDGPU: Add lit.local.cfg to disable global-isel tests when global-isel is disabled This should fix bots broken by r302919. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302928 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 17:59:30 +00:00
Tom Stellard	f366f4cc57	AMDGPU/GlobalISel: Mark 32-bit integer constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33115 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302919 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 16:46:46 +00:00
James Y Knight	c5d0c88a98	[SPARC] Support 'f' and 'e' inline asm constraints. Based on patch by Patrick Boettcher and Chris Dewhurst. Differential Revision: https://reviews.llvm.org/D29116 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302911 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 15:59:10 +00:00
Sanjay Patel	0ee4a484f7	[x86] add tests for potential vector narrowing optimization (PR32790) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302910 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 15:56:39 +00:00
Jonas Paulsson	668e541eed	Handle a COPY with undef source operand in LowerCopy() Llvm-stress discovered that a COPY may end up in ExpandPostRA::LowerCopy() with an undef source operand. It is not possible for the target to handle this, as this flag is not passed to TII->copyPhysReg(). This patch solves this by treating such a COPY as an identity COPY. Review: Matthias Braun https://reviews.llvm.org/D32892 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302877 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 06:32:03 +00:00
Mikael Holmen	e43c10d201	[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle Summary: Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot always be trusted) at the end to fixup the CFG we keep the CFG updated as we go along and remove or add branches and merge blocks. This way we won't have any problems if the involved MBBs contain unanalyzable instructions. This fixes PR32721. In that case we had a triangle EBB \| \ \| \| \| TBB \| / FBB where FBB didn't have any successors at all since it ended with an unconditional return. Then TBB and FBB were be merged into EBB, but EBB would still keep its successors, and the use of analyzeBranch and CorrectExtraCFGEdges wouldn't help to remove them since the return instruction is not analyzable (at least not on ARM). Reviewers: kparzysz, iteratee, MatzeB Reviewed By: iteratee Subscribers: aemerson, rengolin, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33037 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302876 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 06:28:58 +00:00
Guozhi Wei	d3fe6038ab	[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302834 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 22:17:35 +00:00
Chad Rosier	2d534b8ab2	[AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD. Differential Revision: http://reviews.llvm.org/D33101. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302822 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 20:07:24 +00:00
Vadzim Dambrouski	29165da1cd	[MSP430] Generate EABI-compliant libcalls Updates the MSP430 target to generate EABI-compatible libcall names. As a byproduct, adjusts the hardware multiplier options available in the MSP430 target, adds support for promotion of the ISD::MUL operation for 8-bit integers, and correctly marks R11 as used by call instructions. Patch by Andrew Wygle. Differential Revision: https://reviews.llvm.org/D32676 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302820 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 19:56:14 +00:00
Matt Arsenault	9f4e5a06c6	AMDGPU: Remove tfe bit from flat instruction definitions We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302814 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 17:38:33 +00:00
Matt Arsenault	2bbb56fd75	AMDGPU: Pull fneg out of extract_vector_elt This allows folding source modifiers in more f16 cases. Makes it easier to select per-component packed neg modifiers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302813 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 17:26:25 +00:00
Nemanja Ivanovic	0470a16690	[PowerPC] Eliminate integer compare instructions - vol. 1 This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302810 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 16:54:23 +00:00
Simon Pilgrim	9a01b517fb	[X86][AVX] Added zeroall/zeroupper scheduler tests Missing on SandyBridge and Btver2 models git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302804 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 15:02:49 +00:00
Chandler Carruth	c4b356568b	[x86] Fix a failure to select with AVX-512 when the type legalizer manages to form a VSELECT with a non-i1 element type condition. Those are technically allowed in SDAG (at least, the generic type legalization logic will form them and I wouldn't want to try to audit everything te preclude forming them) so we need to be able to lower them. This isn't too hard to implement. We mark VSELECT as custom so we get a chance in C++, add a fast path for i1 conditions to get directly handled by the patterns, and a fallback when we need to manually force the condition to be an i1 that uses the vptestm instruction to turn a non-mask into a mask. This, unsurprisingly, generates awful code. But it at least doesn't crash. This was actually impacting open source packages built with LLVM for AVX-512 in the wild, so quickly landing a patch that at least stops the immediate bleeding. I think I've found where to fix the codegen quality issue, but less confident of that change so separating it out from the thing that doesn't change the result of any existing test case but causes mine to not crash. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302785 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 10:52:16 +00:00
Diana Picus	c978c0ff91	[ARM][GlobalISel] Legalize narrow scalar ops by widening This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB and MUL to 32 bits since we only have TableGen patterns for 32 bits. See the commit message for r292827 for more details. At this point we could just remove some of the tests for regbankselect and instruction-select, since we're not going to see any narrow operations at those levels anymore. Instead I decided to update them with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences generated by the legalizer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302782 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 09:45:57 +00:00
Diana Picus	62bc4f0ac3	[ARM][GlobalISel] Support for G_ANYEXT G_ANYEXT can be introduced by the legalizer when widening scalars. Add support for it in the register bank info (same mapping as everything else) and in the instruction selector. When selecting it, we treat it as a COPY, just like G_TRUNC. On this occasion we get rid of some assertions in selectCopy so we can reuse it. This shouldn't be a problem at the moment since we're not supporting any complicated cases (e.g. FPR, different register banks). We might want to separate the paths when we do. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302778 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 08:28:31 +00:00

1 2 3 4 5 ...

21008 Commits