archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Greg Parker	41b2c9b1f5	[test] Remove an unwanted match for `XFAIL:`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292567 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 02:01:04 +00:00
Stanislav Mekhanoshin	f304f044ed	[AMDGPU] Prevent spills before exec mask is restored Inline spiller can decide to move a spill as early as possible in the basic block. It will skip phis and label, but we also need to make sure it skips instructions in the basic block prologue which restore exec mask. Added isPositionLike callback in TargetInstrInfo to detect instructions which shall be skipped in addition to common phis, labels etc. Differential Revision: https://reviews.llvm.org/D27997 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-20 00:44:31 +00:00
Matt Arsenault	261f60f486	AMDGPU: Disable some fneg combines unless nsz For -(x + y) -> (-x) + (-y), if x == -y, this would change the result from -0.0 to 0.0. Since the fma/fmad combine is an extension of this problem it also applies there. fmul should be fine, and I don't think any of the unary operators or conversions should be a problem either. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292473 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 06:35:27 +00:00
Matt Arsenault	cfe56d7c95	AMDGPU: Remove modifiers from v_div_scale_* They seem to produce nonsense results when used. This should be applied to the release branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292472 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 06:04:12 +00:00
Stanislav Mekhanoshin	d78f00a4d1	[AMDGPU] Do not allow register coalescer to create big superregs Limit register coalescer by not allowing it to artificially increase size of registers beyond dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constraint regalloc more than needed. Moreover, our super registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers allocation even more, resulting in excessive spilling. Differential Revision: https://reviews.llvm.org/D28782 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 17:30:05 +00:00
Matt Arsenault	4cddac93ec	DAG: Consider nnan in isKnownNeverNaN git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292328 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-18 02:10:08 +00:00
Matt Arsenault	62b3258a7c	AMDGPU: Add replacement export intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292205 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-17 07:26:53 +00:00
Jan Vesely	6d821c2f7c	AMDGPU/EG,CM: Implement _noret global atomics _RTN versions will be a lot more complicated Differential Revision: https://reviews.llvm.org/D28067 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292162 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-16 21:20:13 +00:00
Konstantin Zhuravlyov	999a6572f3	[AMDGPU] Implement f16 fcopysign and fcopysign(f32, f64) Differential Revision: https://reviews.llvm.org/D28496 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291954 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-13 19:49:25 +00:00
Matt Arsenault	cd002582ba	AMDGPU: Skip fneg/select combine if it can fold into other git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291792 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:58:15 +00:00
Matt Arsenault	9db1ec3d4d	AMDGPU: Fold free fneg into sin git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291790 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:48:09 +00:00
Matt Arsenault	49dd8fcb21	AMDGPU: Fold fneg into fmul_legacy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291784 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:26:30 +00:00
Matt Arsenault	bd870734a5	AMDGPU: Fold fneg into rcp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291779 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 17:46:35 +00:00
Matt Arsenault	cca494fd03	AMDGPU: Fold fneg into fp_round git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291778 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 17:46:33 +00:00
Matt Arsenault	e652041f69	AMDGPU: Fold fneg into fp_extend git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291777 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 17:46:28 +00:00
Matt Arsenault	94bf68d551	AMDGPU: Fold fneg into fma or fmad Patch mostly by Fiona Glaser git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291733 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 00:32:16 +00:00
Matt Arsenault	ef33822be5	AMDGPU: Fold fneg into fmul Patch mostly by Fiona Glaser git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291732 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 00:23:20 +00:00
Matt Arsenault	bcf34bbbdd	AMDGPU: Fold fneg into fadd Patch mostly by Fiona Glaser git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291731 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 00:09:34 +00:00
Matt Arsenault	8694e2f853	AMDGPU: Pull fneg/fabs out of a select Allows better source modifier usage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291729 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 23:57:38 +00:00
Matt Arsenault	f1e95d3604	AMDGPU: Fix shrinking of addc/subb. To shrink to VOP2 the input carry must also be VCC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291720 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:58:12 +00:00
Matt Arsenault	fac51240d9	AMDGPU: Fix sext_inreg for i1 in i16 This produces worse code when i16 is legal, mostly due to combines getting confused by conversions inserted for uniform 16-bit operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:35:22 +00:00
Matt Arsenault	8c7e9845cf	AMDGPU: Fix breaking VOP3 v_add_i32s This was shrinking the instruction even though the carry output register was a virtual register, not known VCC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:35:17 +00:00
Matt Arsenault	c6b1aed80d	AMDGPU: Fix folding immediates into mac src2 Whether it is legal or not needs to check for the instruction it will be replaced with. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291711 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:00:02 +00:00
Kyle Butt	0aa7497cd7	Revert "CodeGen: Allow small copyable blocks to "break" the CFG." This reverts commit `ada6595a52`. This needs a simple probability check because there are some cases where it is not profitable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291695 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 19:55:19 +00:00
Matt Arsenault	a40945ed88	DAGCombiner: Add hasOneUse checks to fadd/fma combine Even with aggressive fusion enabled, this requires duplicating the fmul, or increases an fadd to another fma which is not an improvement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291642 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 02:02:12 +00:00
Jan Vesely	53dcfdf89b	AMDGPU/EG,CM: Add fp16 conversion instructions Differential Revision: https://reviews.llvm.org/D28164 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291622 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 00:12:39 +00:00
Matt Arsenault	1639229587	AMDGPU: Constant fold when immediate is materialized In future commits these patterns will appear after moveToVALU changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291615 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-10 23:32:04 +00:00
Kyle Butt	ada6595a52	CodeGen: Allow small copyable blocks to "break" the CFG. When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291609 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-10 23:04:30 +00:00
Matt Arsenault	fa5aafaac2	DAG: Avoid OOB when legalizing vector indexing If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291604 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-10 22:02:30 +00:00
Matt Arsenault	da59cd0847	AMDGPU: Add tests for HasMultipleConditionRegisters This was enabled without many specific tests or the comment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291586 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-10 19:08:15 +00:00
Matt Arsenault	f7c0d4013c	AMDGPU: Add Assert[SZ]Ext during argument load creation For i16 zeroext arguments when i16 was a legal type, the known bits information from the truncate was lost. Insert a zeroext so the known bits optimizations work with the 32-bit loads. Fixes code quality regressions vs. SI in min.ll test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291461 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-09 18:52:39 +00:00
Bjorn Pettersson	26c3968163	[SelectionDAG] Fix in legalization of UMAX/SMAX/UMIN/SMIN. Solves PR31486. Summary: Originally i64 = umax t8, Constant:i64<4> was expanded into i32,i32 = umax Constant:i32<0>, Constant:i32<0> i32,i32 = umax t7, Constant:i32<4> Now instead the two produced umax:es return i32 instead of i32, i32. Thanks to Jan Vesely for help with the test case. Patch by mikael.holmen at ericsson.com Reviewers: bogner, jvesely, tstellarAMD, arsenm Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D28135 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291441 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-09 12:03:50 +00:00
Jan Vesely	0835374acb	AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes This will make transition to SCRATCH_MEMORY easier Differential Revision: https://reviews.llvm.org/D24746 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291279 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-06 21:00:46 +00:00
Konstantin Zhuravlyov	9060577664	[AMDGPU] Do not emit .AMDGPU.config section for amdhsa Differential Revision: https://reviews.llvm.org/D27732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291245 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-06 17:02:10 +00:00
Jan Vesely	bf64cb107c	AMDGPU/SI: Implement sendmsghalt intrinsic v2: expose using amdgcn prefix Differential Revision: https://reviews.llvm.org/D23511 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290977 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-04 18:06:55 +00:00
Matt Arsenault	e2b3286a26	AMDGPU: Invert cmp + select with constant Canonicalize a select with a constant to the false side. This enables more instruction shrinking opportunities since an inline immediate can be used for the false side of v_cndmask_b32_e32. This seems to usually be better but causes some code size regressions in some tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290372 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 21:40:08 +00:00
Matt Arsenault	ad47821c65	AMDGPU: Use i16 for i16 shift amount git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290351 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 16:36:25 +00:00
Matt Arsenault	8d973070a0	AMDGPU: Use i16 comparison instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290348 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 16:27:11 +00:00
Matt Arsenault	46e5f1c88d	AMDGPU: Swap order of operands in fadd/fsub combine FMA is canonicalized to constant in the middle operand. Do the same so fmad matches and avoid an extra combine step. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290313 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 04:03:40 +00:00
Matt Arsenault	121f8654d3	AMDGPU: Check fast math flags in fadd/fsub combines git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290312 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 04:03:35 +00:00
Matt Arsenault	ff4096b8f8	AMDGPU: Form more FMAs if fusion is allowed Extend the existing fadd/fsub->fmad combines to produce FMA if allowed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290311 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:55:35 +00:00
Matt Arsenault	75c32f5150	AMDGPU: Enable some f32 fadd/fsub combines for f16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290308 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:40:39 +00:00
Matt Arsenault	cee1c4614a	AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290307 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:21:48 +00:00
Matt Arsenault	998b18c570	AMDGPU: setcc test cleanup git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290306 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:21:45 +00:00
Matt Arsenault	a8dff18ebc	AMDGPU: Allow rcp and rsq usage with f16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290302 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:05:44 +00:00
Matt Arsenault	4bb99910b0	AMDGPU: Custom lower f16 fdiv git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290301 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:05:41 +00:00
Matt Arsenault	0bb2ef4a14	AMDGPU: Implement f16 fcanonicalize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290300 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-22 03:05:37 +00:00
Matt Arsenault	256f8018fa	AMDGPU: Allow 16-bit types in inline asm constraints git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290193 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-20 19:06:12 +00:00
Matt Arsenault	4bcae756d4	AMDGPU: Run fp combine tests on VI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290192 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-20 18:55:11 +00:00
Matt Arsenault	44e57608d4	AMDGPU: Don't add same instruction multiple times to worklist When the instruction is processed the first time, it may be deleted resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290191 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-20 18:55:06 +00:00
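
Note on commit 261f60f486 ("AMDGPU: Disable some fneg combines unless nsz"): the message says that rewriting -(x + y) as (-x) + (-y) changes the result from -0.0 to 0.0 when x == -y. The following is a minimal sketch of that arithmetic in plain Python floats (IEEE 754 doubles, default rounding), not code from the LLVM tree. The helper names neg_of_sum and sum_of_negs are hypothetical and exist only for this illustration.

```python
import math

def neg_of_sum(x, y):
    # Original form: negate after adding.
    return -(x + y)

def sum_of_negs(x, y):
    # Combined form: negate each operand, then add.
    return (-x) + (-y)

# Pick x == -y, the case called out in the commit message.
x, y = 1.0, -1.0

a = neg_of_sum(x, y)   # x + y is +0.0, so the negation gives -0.0
b = sum_of_negs(x, y)  # -1.0 + 1.0 gives +0.0

# The two zeros compare equal, so inspect the sign bit via copysign.
print(a == b)                                        # True
print(math.copysign(1.0, a), math.copysign(1.0, b))  # -1.0 1.0
```

The two forms differ only in the sign of a zero result, which is why the combine is only safe when signed zeros can be ignored (the nsz flag named in the commit subject).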


782 Commits