archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Craig Topper	2129bbc974	[AVX-512] Fix a couple test cases to not pass an undef mask to gather intrinsic. This could break if any future optimizations taken advantage of the undef. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292585 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 07:12:30 +00:00
Greg Parker	41b2c9b1f5	[test] Remove a unwanted match for `XFAIL:`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292567 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 02:01:04 +00:00
Ahmed Bougacha	51348febc6	[AArch64][GlobalISel] Widen scalar int->fp conversions. It's incorrect to ignore the higher bits of the integer source. Teach the legalizer how to widen it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292563 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 01:37:24 +00:00
Stanislav Mekhanoshin	f304f044ed	[AMDGPU] Prevent spills before exec mask is restored Inline spiller can decide to move a spill as early as possible in the basic block. It will skip phis and label, but we also need to make sure it skips instructions in the basic block prologue which restore exec mask. Added isPositionLike callback in TargetInstrInfo to detect instructions which shall be skipped in addition to common phis, labels etc. Differential Revision: https://reviews.llvm.org/D27997 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 00:44:31 +00:00
Ahmed Bougacha	c4512366a3	[AArch64][GlobalISel] Split FP conversion legalizer tests. NFC. Big functions with large vreg # are quite unwieldy to update. Change it to have one function per test (it does increase boilerplate, but makes the core hopefully more readable and maintanable). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292552 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 00:30:09 +00:00
Ahmed Bougacha	6d3baec9d6	[AArch64][GlobalISel] Split legalizer combine tests. NFC. Big functions with large vreg # are quite unwieldy to update. This test also relied on legal s8 operations which we're considering removing. Change it to have one function per test (it does increase boilerplate, but makes the core hopefully more readable and maintanable), and use 100% legal operations throughout. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292551 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 00:30:06 +00:00
Ahmed Bougacha	cbd2ff78c0	[MIRParser] Allow generic register specification on operand. This completes r292321 by adding support for generic registers, e.g.: %2:_(s32) = G_ADD %0, %1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292550 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 00:29:59 +00:00
Tim Northover	dfbb55fc0c	AArch64: fall back to DAG ISel for inline assembly. We can't currently handle "calls" to inlineasm strings so it's better to let the DAG handle it than generate rubbish. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292540 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 23:59:35 +00:00
Simon Pilgrim	64f91a5757	[SelectionDAG] Improve knownbits handling of UMIN/UMAX (PR31293) This patch improves the knownbits logic for unsigned integer min/max opcodes. For UMIN we know that the result will have the maximum of the inputs' known leading zero bits in the result, similarly for UMAX the maximum of the inputs' leading one bits. This is particularly useful for simplifying clamping patterns,. e.g. as SSE doesn't have a uitofp instruction we want to use sitofp instead where possible and for that we need to confirm that the top bit is not set. Differential Revision: https://reviews.llvm.org/D28853 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292528 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 22:41:22 +00:00
Serge Rogatch	9a63b871bd	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier Summary: Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292516 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 20:24:23 +00:00
Simon Pilgrim	2d628eed7f	[X86][SSE] Attempt to pre-truncate arithmetic operations that have already been extended As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292493 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 16:25:02 +00:00
Simon Pilgrim	b3f7c2ec02	[X86][SSE] Added tests for pre-truncating arithmetic operations that have already been extended As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292487 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 15:03:00 +00:00
Mikael Holmen	7f6f824722	[DAG] Don't increase SDNodeOrder for dbg.value/declare. Summary: The SDNodeOrder is saved in the IROrder field in the SDNode, and this field may affects scheduling. Thus, letting dbg.value/declare increase the order numbers may in turn affect scheduling. Because of this change we also need to update the code deciding when dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues. Dbg values now have the same order as the SDNode they are connected to, not the following orders. Test cases provided by Florian Hahn. Reviewers: bogner, aprantl, sunfish, atrick Reviewed By: atrick Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D25318 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292485 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 13:55:55 +00:00
Kristof Beyls	56c4b1ef06	[GlobalISel] Pointers are legal operands for G_SELECT on AArch64 Differential Revision: https://reviews.llvm.org/D28805 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292481 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 13:32:14 +00:00
Elena Demikhovsky	f7484a051a	Recommiting unsigned saturation with a bugfix. A test case that crached is added to avx512-trunc.ll. (PR31589) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292479 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 12:08:21 +00:00
Justin Bogner	e3ad0db135	GlobalISel: Implement widening for shifts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292476 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 07:51:17 +00:00
Craig Topper	9720a787a1	[AVX-512] Add test cases that show where we are using two subvector inserts to broadcast a 128-bit subvector into a 512-bit vector. We'd be better off using something like SHUFF32X4. If the subvector comes from a load, we convert to SUBV_BROADCAST and use a broadcast instruction. But if there is no load we keep the inserts. I think we should create the SUBV_BROADCAST even without the load and let isel use the fallback patterns that are used if the load can't be folded. This will use the SHUFF32X4 or similar instruction for the 128-bit into 512-bit case and a single insert for 128 into 256 or 256 into 512. This should be fixed so subvector broadcast intrinsics can be replaced with native IR since some of those currently lower directly to SHUFF32X4. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292475 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 07:37:45 +00:00
Craig Topper	f1fe387ada	[AVX-512] Support ADD/SUB/MUL of mask vectors Summary: Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND. We already do this for scalar i1 operations so I just extended it to vectors of i1. Reviewers: zvi, delena Reviewed By: delena Subscribers: guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D28888 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292474 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 07:12:35 +00:00
Matt Arsenault	261f60f486	AMDGPU: Disable some fneg combines unless nsz For -(x + y) -> (-x) + (-y), if x == -y, this would change the result from -0.0 to 0.0. Since the fma/fmad combine is an extension of this problem it also applies there. fmul should be fine, and I don't think any of the unary operators or conversions should be a problem either. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292473 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 06:35:27 +00:00
Matt Arsenault	cfe56d7c95	AMDGPU: Remove modifiers from v_div_scale_* They seem to produce nonsense results when used. This should be applied to the release branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292472 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 06:04:12 +00:00
Craig Topper	2a42c3b9a1	[AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector broadcasts that can't fold the load. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292466 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 02:34:29 +00:00
Craig Topper	3b137a67b9	[AVX-512] Add additional test cases for broadcast intrinsics that demonstates that we don't fold the loads to use a broadcast instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292465 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 02:34:25 +00:00
Justin Bogner	3552215d7c	GlobalISel: Implement narrowing for G_LOAD git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292461 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 01:05:48 +00:00
Matthias Braun	cc10c1ec95	Use an actual valid register in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292459 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 01:04:08 +00:00
Artem Belevich	df5ffd6153	[NVPTX] Fix lowering of fp16 ISD::FNEG. There's no neg.f16 instruction, so negation has to be done via subtraction from zero. Differential Revision: https://reviews.llvm.org/D28876 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292452 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 00:14:45 +00:00
Krzysztof Parzyszek	a8b917bef3	Treat segment [B, E) as not overlapping block with boundaries [A, B) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292446 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 23:12:19 +00:00
Krzysztof Parzyszek	621d1f4149	[Hexagon] Remove dead defs from the live set when expanding wstores git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292445 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 23:11:40 +00:00
Michael Kuperstein	57f0668781	Revert r291670 because it introduces a crash. r291670 doesn't crash on the original testcase from PR31589, but it crashes on a slightly more complex one. PR31589 has the new reproducer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292444 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 23:05:58 +00:00
Evandro Menezes	f5697dce74	[AArch64] Generate literals by the little end ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292422 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 18:57:08 +00:00
Stanislav Mekhanoshin	d78f00a4d1	[AMDGPU] Do not allow register coalescer to create big superregs Limit register coalescer by not allowing it to artificially increase size of registers beyond dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constraint regalloc more than needed. Moreover, our super registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers allocation even more, resulting in excessive spilling. Differential Revision: https://reviews.llvm.org/D28782 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 17:30:05 +00:00
Justin Bogner	4a6dec6408	GlobalISel: Implement narrowing for G_STORE Legalize stores of types that are too wide by breaking them up into sequences of smaller stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292412 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 17:29:54 +00:00
Teresa Johnson	6f03dce4dd	Don't create a comdat group for a dropped def with initializer Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to available_externally when possible. If they had an initializer in the global_ctors list, a comdat group was being created. This code already had logic to skip available_externally defs, but now the EliminateAvailableExternally pass will drop these symbols to declarations earlier. Change the check to skip all declarations for linker (which includes available_externally along with declarations). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28737 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292408 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 16:58:43 +00:00
Simon Pilgrim	e3d484c1a0	Fixed parser error on windows shell evaluation of RUN script line git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292363 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 11:40:28 +00:00
Simon Pilgrim	00b2f8ccbf	[X86][SSE] Simplify umax knownbits test combineSRA doesn't detect sign bits splats that it does itself so just use -1 as the demanded input so that its already splatted git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292361 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 11:20:31 +00:00
Michael Zuckerman	476a6c5119	[X86] Improve mul combine for negative multiplayer (2^c - 1) This patch improves the mul instruction combine function (combineMul) by adding new layer of logic. In this patch, we are adding the ability to fold (mul x, -((1 << c) -1)) or (mul x, -((1 << c) +1)) into (neg(X << c) -x) or (neg((x << c) + x) respective. Differential Revision: https://reviews.llvm.org/D28232 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292358 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 09:31:13 +00:00
Renato Golin	6994e2312e	Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier" This reverts commit r292210, as it broke the Thumb buldbot with: clang-5.0: error: the clang compiler does not support '-fxray-instrument on thumbv7-unknown-linux-gnueabihf'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292357 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 09:08:43 +00:00
Jonas Paulsson	ce4eec587e	[SystemZ] Proper handling of undef flag while expanding pseudo. During post-RA pseudo expansion, an 'undef' flag of the source operand should be propagated by emitGRX32Move(). Review: Ulrich Weigand git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292353 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 08:32:54 +00:00
Matt Arsenault	4cddac93ec	DAG: Consider nnan in isKnownNeverNaN git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292328 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 02:10:08 +00:00
Wei Mi	f2f70f10c2	Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer build bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292327 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 01:53:53 +00:00
Dan Gohman	97c35d16c6	[WebAssembly] Update grow_memory's return type. The grow_memory instruction now returns the previous memory size. Add the return type to the LLVM intrinsic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292322 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 01:02:45 +00:00
Matthias Braun	c1fa0731c3	MIRParser: Allow regclass specification on operand You can now define the register class of a virtual register on the operand itself avoiding the need to use a "registers:" block. Example: "%0:gr64 = COPY %rax" Differential Revision: https://reviews.llvm.org/D22398 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292321 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:59:19 +00:00
Justin Lebar	dc30ded6fb	[NVPTX] Support global variables of integer type larger than i64. Reviewers: tra, majnemer Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28825 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292316 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:29:53 +00:00
Justin Lebar	db8ecafe57	[NVPTX] Implement min/max in tablegen, rather than with custom DAGComine logic. Summary: This change also lets us use max.{s,u}16. There's a vague warning in a test about this maybe being less efficient, but I could not come up with a case where the resulting SASS (sm_35 or sm_60) was different with or without max.{s,u}16. It's true that nvcc seems to emit only max.{s,u}32, but even ptxas 7.0 seems to have no problem generating efficient SASS from max.{s,u}16 (the casts up to i32 and back down to i16 seem to be implicit and nops, happening via register aliasing). In the absence of evidence, better to have fewer special cases, emit more straightforward code, etc. In particular, if a new GPU has 16-bit min/max instructions, we want to be able to use them. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292304 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:09:01 +00:00
Justin Lebar	f78819e87f	[NVPTX] Lower integer absolute value idiom to abs instruction. Summary: Previously we lowered it literally, to shifts and xors. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28722 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292303 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:08:44 +00:00
Justin Lebar	38c1089801	[NVPTX] Improve lowering of llvm.ctpop. Summary: Avoid an unnecessary conversion operation when using the result of ctpop.i32 or ctpop.i16 as an i32, as in both cases the ptx instruction we run returns an i32. (Previously if we used the value as an i32, we'd do an unnecessary zext+trunc.) Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28721 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292302 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:08:27 +00:00
Justin Lebar	7470b2cf62	[NVPTX] Add lowering for llvm.bitreverse. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28720 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292301 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:08:10 +00:00
Justin Lebar	9ff5ce49a6	[NVPTX] Fix function names in ctlz.ll test. Test-only change. Looks like a copy/paste mistake, all the functions in ctlz.ll were named "ctpop". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292300 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:07:52 +00:00
Justin Lebar	fd46a6b819	[NVPTX] Improve lowering of llvm.ctlz. Summary: * Disable "ctlz speculation", which inserts a branch on every ctlz(x) which has defined behavior on x == 0 to check whether x is, in fact zero. * Add DAG patterns that avoid re-truncating or re-expanding the result of the 16- and 64-bit ctz instructions. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28719 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292299 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 00:07:35 +00:00
Wei Mi	caab16e99a	[RegisterCoalescing] Remove partial redundent copy. The patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292292 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-17 23:39:07 +00:00
Tim Northover	f4d04d46f8	GlobalISel: fix comparison order for G_FCMP As with G_ICMP we'd written the CSET instructions backwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292285 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-17 23:04:01 +00:00

1 2 3 4 5 ...

19595 Commits