archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Scott Linder	7b19ab70e3	[CodeGen] Fix assert in SelectionDAG::computeKnownBits Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR when zero extending the demanded elements mask if it is already as long as the source vector. Differential Revision: https://reviews.llvm.org/D49574 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339600 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-13 18:44:21 +00:00
Matt Arsenault	1f25a887f6	AMDGPU: Cleanup min/max legacy tests Also add some more tests in preparation for a future patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339526 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-12 19:29:53 +00:00
Matt Arsenault	f0912abc34	DAG: Check no-signed-zeros instead of unsafe-fp-math Addresses fixme, although this should still be checking individual operand flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339525 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-12 19:09:12 +00:00
Matt Arsenault	10398322af	AMDGPU: Check NSZ MI flag when folding omod I'm not sure the exact nsz flag combination that is OK. I think as long as it's on either, this is OK. For now just check it on the omod multiply. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339513 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-12 08:44:25 +00:00
Matt Arsenault	3b2fa4ee59	AMDGPU: Use splat vectors for undefs when folding canonicalize If one of the elements is undef, use the canonicalized constant from the other element instead of 0. Splat vectors are more useful for other optimizations, such as matching vector clamps. This was breaking on clamps of half3 from the undef 4th component. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339512 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-12 08:42:54 +00:00
Matt Arsenault	8750be505d	AMDGPU: Fix packing undef parts of build_vector git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339511 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-12 08:42:46 +00:00
Tom Stellard	202efa7409	AMDGPU/GlobalISel: Define instruction mapping for G_INSERT Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49625 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339491 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-11 00:51:54 +00:00
Matt Arsenault	7f32e5e190	AMDGPU: More canonicalized operations git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339464 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-10 19:20:17 +00:00
Matt Arsenault	a13d395b9e	AMDGPU: Combine and of seto/setuo and fp_class Clear the nan (or non-nan) test bits from the mask. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339462 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-10 18:58:56 +00:00
Matt Arsenault	4d8cda85ad	AMDGPU: Match isfinite pattern to class instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339460 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-10 18:58:41 +00:00
Matt Arsenault	f1757fd807	AMDGPU: Error more gracefully on libcalls I think this is the only situation where the callsite will have a null instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339271 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-08 16:58:39 +00:00
Matt Arsenault	35d3bbfa09	AMDGPU: Fix shifts for i128 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339270 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-08 16:58:33 +00:00
Jan Vesely	dcf686e1e7	AMDGPU: Remove broken i16 ternary patterns Fixup test to check for GCN prefix These patterns always zero extend the result even though it might need sign extension. This has been broken since the addition of i16 support. It has popped up in mad_sat(char) test since min(max()) combination is turned into v_med3, resulting in the following (incorrect) sequence: v_mad_i16 v2, v10, v9, v11 v_med3_i32 v2, v2, v8, v7 Fixes mad_sat(char) piglit on VI. Differential Revision: https://reviews.llvm.org/D49836 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339190 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-07 21:54:37 +00:00
Matt Arsenault	f966a40853	AMDGPU: cvt_pk_rtz_f16 canonicalizes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339078 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 23:01:31 +00:00
Matt Arsenault	6ae1bfa7a5	AMDGPU: Handle some vector operations in isCanonicalized git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339077 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 22:45:51 +00:00
Matt Arsenault	a8868c6067	AMDGPU: Push fcanonicalize through partially constant build_vector This usually avoids some re-packing code, and may help find canonical sources. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339072 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 22:30:44 +00:00
Matt Arsenault	f583815bba	AMDGPU: Treat more custom operations as canonicalizing Everything should quiet, and I think everything should flush. I assume the min3/med3/max3 follow the same rules as regular min/max for flushing, which should at least be conservatively correct. There are still more operations that need to be handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339065 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 21:58:11 +00:00
Matt Arsenault	a0ad797381	AMDGPU: Conversions always produce canonical results Not sure why this was checking for denormals for f16. My interpretation of the IEEE standard is conversions should produce a canonical result, and the ISA manual says denormals are created when appropriate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339064 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 21:51:52 +00:00
Matt Arsenault	f0fa788ac7	AMDGPU: Fix implementation of isCanonicalized If denormals are enabled, denormals are canonical. Also fix a few other issues. minnum/maxnum are supposed to canonicalize. Temporarily improve workaround for the instruction behavior change in gfx9. Handle selects and fcopysign. The tests were also largely broken, since they were checking for a flush used on some targets after the store of the result. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339061 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 21:38:27 +00:00
Matt Arsenault	273374717e	AMDGPU: Fold v_lshl_or_b32 with 0 src0 Appears from expansion of some packed cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339025 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 15:40:20 +00:00
Matt Arsenault	c3263d7dee	AMDGPU: Rename check prefixes in test Will avoid noisy diff in future change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339022 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-06 15:16:12 +00:00
Matt Arsenault	7166ee595d	DAG: Enhance isKnownNeverNaN Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338910 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-03 18:27:52 +00:00
Tim Renouf	79905333cd	[AMDGPU] Reworked SIFixWWMLiveness Summary: I encountered some problems with SIFixWWMLiveness when WWM is in a loop: 1. It sometimes gave invalid MIR where there is some control flow path to the new implicit use of a register on EXIT_WWM that does not pass through any def. 2. There were lots of false positives of registers that needed to have an implicit use added to EXIT_WWM. 3. Adding an implicit use to EXIT_WWM (and adding an implicit def just before the WWM code, which I tried in order to fix (1)) caused lots of the values to be spilled and reloaded unnecessarily. This commit is a rework of SIFixWWMLiveness, with the following changes: 1. Instead of considering any register with a def that can reach the WWM code and a def that can be reached from the WWM code, it now considers three specific cases that need to be handled. 2. A register that needs liveness over WWM to be synthesized now has it done by adding itself as an implicit use to defs other than the dominant one. Also added the following fixmes: FIXME: We should detect whether a register in one of the above categories is already live at the WWM code before deciding to add the implicit uses to synthesize its liveness. FIXME: I believe this whole scheme may be flawed due to the possibility of the register allocator doing live interval splitting. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46756 Change-Id: Ie7fba0ede0378849181df3f1a9a7a39ed1a94a94 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338783 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-02 23:31:32 +00:00
Tim Renouf	5e96e38d96	[AMDGPU] Avoid using divergent value in mubuf addr64 descriptor Summary: This fixes a problem where a load from global+idx generated incorrect code on <=gfx7 when the index is divergent. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47383 Change-Id: Ib4d177d6254b1dd3f8ec0203fdddec94bd8bc5ed git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338779 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-02 22:53:57 +00:00
Matt Arsenault	2920ef7815	DAG: Fix vector widening fcanonicalize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338715 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-02 13:43:53 +00:00
Matt Arsenault	c9baad19d3	AMDGPU: Fix scalarizing v4f16 fcanonicalize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338714 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-02 13:43:42 +00:00
Matt Arsenault	7943061ff9	AMDGPU: Improve hack for packing conversion ops Mutate the node type during selection when it doesn't matter. This avoids an intermediate bitcast node on targets with legal i16/f16. Also fixes missing output modifiers on v_cvt_pkrtz_f32_f16, which I assume are OK. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338619 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-01 20:13:58 +00:00
Matt Arsenault	6446fcd61a	AMDGPU: Partially fix handling of packed amdgpu_ps arguments Fixes annoying limitations when writing tests. Also remove more leftover code for manually scalarizing arguments and return values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338618 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-01 19:57:34 +00:00
Jan Vesely	bf6429d608	AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS Non ext aligned i32 loads are still optimized to use CONSTANT_BUFFER (AS 8) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338610 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-01 18:36:07 +00:00
Ryan Taylor	a9d3893acc	[AMDGPU] Optimize _L image intrinsic to _LZ when lod is zero Summary: Add _L to _LZ image intrinsic table mapping to table gen. In ISelLowering check if image intrinsic has lod and if it's equal to zero, if so remove lod and change opcode to equivalent mapped _LZ. Change-Id: Ie24cd7e788e2195d846c7bd256151178cbb9ec71 Subscribers: arsenm, mehdi_amini, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49483 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338523 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-01 12:12:01 +00:00
Konstantin Zhuravlyov	b47f061f5b	AMDGPU: Add clamp bit to dot intrinsics Differential Revision: https://reviews.llvm.org/D49874 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338470 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-01 01:31:30 +00:00
Matt Arsenault	fa5ec153c2	AMDGPU: Split amdgcn/r600 fminnum/fmaxnum tests R600 breaks on too many things to usefully test changes with ieee_mode on vs. off. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338435 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 20:38:42 +00:00
Matt Arsenault	4b6157df8b	AMDGPU: Break 64-bit arguments into 32-bit pieces git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338421 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 19:29:04 +00:00
Matt Arsenault	b9d99ce19e	AMDGPU: Split wide vectors of i16/f16 into 32-bit regs on calls This improves code for the same reasons as scalarizing 32-bit element vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338418 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 19:17:47 +00:00
Matt Arsenault	0a67b1c905	AMDGPU: Scalarize vector argument types to calls When lowering calling conventions, prefer to decompose vectors into the constituent register types. This avoids artificial constraints to satisfy a wide super-register. This improves code quality because now optimizations don't need to deal with the super-register constraint. For example the immediate folding code doesn't deal with 4 component reg_sequences, so by breaking the register down earlier the existing immediate folding code is able to work. This also avoids the need for the shader input processing code to manually split vector types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338416 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 19:05:14 +00:00
Matt Arsenault	48e2f47300	DAG: Fix PromoteFloatResult for fcanonicalize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338382 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 14:15:22 +00:00
Matt Arsenault	8f44f41e0f	AMDGPU: Fold undef fcanonicalize to qNaN We could choose a free 0 for this, but this matches the behavior for fmul undef, 1.0. Also, the NaN use is more useful for folding use operations although if it's not eliminated it is more expensive in terms of code size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338376 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 13:34:31 +00:00
Matt Arsenault	8d00765ed1	AMDGPU: Fix test check line bugs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338374 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 13:25:23 +00:00
Matt Arsenault	78e0f47487	AMDGPU: Reduce code size with fcanonicalize (fneg x) When fcanonicalize is lowered to a mul, we can use -1.0 for free and avoid the cost of the bigger encoding for source modifiers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338244 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-30 12:16:58 +00:00
Matt Arsenault	86dcb58e5d	AMDGPU: Make fneg combine handle fcanonicalize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338243 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-30 12:16:47 +00:00
Nicolai Haehnle	bfada8913e	AMDGPU: Force skip over s_sendmsg and exp instructions Summary: These instructions interact with hardware blocks outside the shader core, and they can have "scalar" side effects even when EXEC = 0. We don't want these scalar side effects to occur when all lanes want to skip these instructions, so always add the execz skip branch instruction for basic blocks that contain them. Also ensure that we skip scalar stores / atomics, though we don't code-gen those yet. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48431 Change-Id: Ieaeb58352e2789ffd64745603c14970c60819d44 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338235 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-30 09:23:59 +00:00
Matt Arsenault	c358be353a	AMDGPU: Stop wasting argument registers with v3i32/v3f32 SelectionDAGBuilder widens v3i32/v3f32 arguments to v4i32/v4f32 which consume an additional register. In addition to wasting argument space, this produces extra instructions since now it appears the 4th vector component has a meaningful value to most combines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338197 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-28 14:11:34 +00:00
Matt Arsenault	3d794576ed	AMDGPU: Stop trying to extend arguments for clover This was trying to replace i8/i16 arguments with i32, which was broken and no longer necessary. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338193 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-28 12:34:25 +00:00
Jan Vesely	0a1753ac2d	AMDGPU/R600: Add MOV instructions to BFE patterns R600 can't handle immediates for BFE, these will be eliminated later. Fixes powr/pow regressions on r600 since r334817 Differential Revision: https://reviews.llvm.org/D49641 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338127 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-27 15:00:13 +00:00
Matt Arsenault	e9c22aa83f	AMDGPU: Fix code size for return_to_epilog pseudo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338113 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-27 09:15:03 +00:00
Tom Stellard	6dce5ed08b	AMDGPU/GlobalISel: Fix crash in regbankselect on non-power-of-2 types Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49624 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338102 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-27 06:04:40 +00:00
Scott Linder	7531be0d75	[AMDGPU] Fix VGPR spills where offset doesn't fit in 12 bits Scale the offset of VGPR spills by the wave size when it cannot fit in the 12-bit offset immediate field and so is added to the soffset SGPR. This accounts for hardware swizzling of scratch memory. Differential Revision: https://reviews.llvm.org/D49448 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338060 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-26 19:47:51 +00:00
Stanislav Mekhanoshin	a31192b537	[AMDGPU] Use AssumptionCacheTracker in the divrem32 expansion Differential Revision: https://reviews.llvm.org/D49761 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337938 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-25 17:02:11 +00:00
Tom Stellard	6779fccada	AMDGPU/GlobalISel: Legalize G_INSERT Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49601 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337798 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-24 02:19:20 +00:00
Matt Arsenault	06b493f7f0	Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering" Reverts r337079 with fix for msan error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337535 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-20 09:05:08 +00:00


1740 Commits