RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-03-01 09:26:22 +00:00

Author	SHA1	Message	Date
Matt Arsenault	ffac88a158	AMDGPU: Fix converting unanalyzable global loads to SMRD Not all memory dependence queries succeed, so this needs to be conservative if it fails. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307861 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 23:06:18 +00:00
Stanislav Mekhanoshin	16be511cb4	[AMDGPU] fcanonicalize elimination optimization We are using multiplication by 1.0 to flush denormals and quiet sNaNs. It is possible to omit this multiplication if the source of the fcanonicalize instruction is known to be flushed/quieted, i.e. if it comes from another instruction known to do the normalization and we are using IEEE mode to quiet sNaNs. Differential Revision: https://reviews.llvm.org/D35218 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307848 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 21:20:28 +00:00
Rafael Espindola	4aebf83110	Fully fix the movw/movt addend. The issue is not if the value is pcrel. It is whether we have a relocation or not. If we have a relocation, the static linker will select the upper bits. If we don't have a relocation, we have to do it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307730 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 23:18:25 +00:00
Evandro Menezes	fdda7ea9d5	[CodeGen] Rename DEBUG_TYPE to match passnames Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were absent from https://reviews.llvm.org/rL303921. Differential revision: https://reviews.llvm.org/D35231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307719 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 22:08:28 +00:00
Konstantin Zhuravlyov	2e2081eea2	Revert "AMDGPU: Do not test for SI in getIsaVersion" This reverts commit r307573. This breaks downstream test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307678 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 17:57:41 +00:00
Nirav Dave	c7acbe2ea6	Add DAG argument to canMergeStoresTo NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307583 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 20:25:54 +00:00
Matt Arsenault	d380c14b7a	AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307576 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 20:04:35 +00:00
Matt Arsenault	a038a8340c	AMDGPU: Allow SIShrinkInstructions to work in non-SSA Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307575 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:53:57 +00:00
Matt Arsenault	7231966089	AMDGPU: Remove unnecessary check for constant operands An instruction that has an immediate operand can't reach this point. This is only called for a freshly shrunk instruction, which previously couldn't have had a literal constant operand. This was also not conservative enough since it would also have had to filter other constant-like inputs like frame indexes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307574 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:33:38 +00:00
Konstantin Zhuravlyov	f392c1f922	AMDGPU: Do not test for SI in getIsaVersion SI is being tested by isa version in the first two if statements of the function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307573 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:24:05 +00:00
Simon Pilgrim	db24b6e4f7	[AMDGPU] Fix -Wimplicit-fallthrough warning. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307485 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-08 19:50:03 +00:00
Simon Pilgrim	2541a59ac3	Fix some more -Wimplicit-fallthrough warnings. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307411 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 16:40:06 +00:00
Sam Kolton	f9327929eb	[AMDGPU] Assembler: refactor convert methods (VOP3 and MIMG) Summary: Simplified converter methods for VOP3 and MIMG. Reviewers: dp, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, vpykhtin, t-tye Differential Revision: https://reviews.llvm.org/D35047 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307407 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 15:21:52 +00:00
Dmitry Preobrazhensky	c956bf87e0	[AMDGPU][mc][gfx9] Added support of op_sel/op_sel_hi for V_MAD_MIX* See https://bugs.llvm.org//show_bug.cgi?id=33595 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D35021 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307402 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 14:29:06 +00:00
Simon Pilgrim	26aa51226a	[AMDGPU] Fix -Wimplicit-fallthrough warnings. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307381 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 10:18:57 +00:00
Sean Fertile	471398ffea	Extend memcpy expansion in Transform/Utils to handle wider operand types. Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307346 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 02:00:06 +00:00
Matt Arsenault	8763b3ac42	AMDGPU: Add macro fusion schedule DAG mutation Try to increase opportunities to shrink vcc uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307313 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:57:05 +00:00
Matt Arsenault	92223c6fe5	AMDGPU: Minor cleanup of shrinking logic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307312 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:56:59 +00:00
Stanislav Mekhanoshin	71b4fe4228	[AMDGPU] Always use rcp + mul with fast math Regardless of relaxation options such as -cl-fast-relaxed-math we are producing rather long code for fdiv via amdgcn_fdiv_fast intrinsic. This intrinsic is used to replace fdiv with 2.5ulp metadata and does not handle denormals, thus believed to be fast. An fdiv instruction can also have fast math flag either by itself or together with fpmath metadata. Clang used with a relaxation flag always produces both metadata and fast flag: %div = fdiv fast float %v, %0, !fpmath !12 !12 = !{float 2.500000e+00} Current implementation ignores fast flag and favors metadata. An instruction with just the fast flag would be lowered to the fastest rcp + mul, but that never happens in practice because of the described mutual clang and BE behavior. This change allows an "fdiv fast" to be always lowered as rcp + mul. Differential Revision: https://reviews.llvm.org/D34844 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307308 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:34:21 +00:00
Craig Topper	6dbd34d261	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307292 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 18:39:47 +00:00
Quentin Colombet	d268a8d71a	[AMDGPU] Move GISel accessor initialization from TargetMachine to Subtarget. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307186 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-05 18:40:56 +00:00
Alexander Timofeev	f9e9586c80	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307097 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 17:32:00 +00:00
Marek Olsak	97186bff40	[AMDGPU] Fix latency of MIMG instructions Patch by cwabbott (Connor Abbott). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307081 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 14:43:38 +00:00
NAKAMURA Takumi	0a256123a4	Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default" It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307054 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 02:14:18 +00:00
Alexander Timofeev	0f9ec97238	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307026 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-03 14:54:11 +00:00
Matt Arsenault	ff0022d12c	AMDGPU: Add operand target flags serialization git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306995 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-02 23:21:48 +00:00
Hiroshi Inoue	71a28cb414	fix trivial typos; NFC suport -> support git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306968 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-02 03:24:54 +00:00
Matt Arsenault	c278dccfd0	AMDGPU: Remove SITypeRewriter This was an old workaround for using v16i8 in some old intrinsics for resource descriptors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306603 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 21:38:50 +00:00
Geoff Berry	28b3f06e1a	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 15:53:17 +00:00
Stanislav Mekhanoshin	8b38a13919	[AMDGPU] Add pattern for v_alignbit_b32 with immediate If immediate in shift is less than 32 we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306500 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 02:52:39 +00:00
Stanislav Mekhanoshin	040f338ab8	[AMDGPU] Add 2 new alignbit patterns Differential Revision: https://reviews.llvm.org/D34655 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306449 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 19:10:47 +00:00
Stanislav Mekhanoshin	e764e24028	[AMDGPU] Simplify setcc (sext from i1 b), -1|0, cc Depending on the compare code that can be either an argument of sext or negate of it. This helps to avoid v_cndmask_b64 instruction for sext. A reversed value can be further simplified and folded into its parent comparison if possible. Differential Revision: https://reviews.llvm.org/D34545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306446 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:53:03 +00:00
Stanislav Mekhanoshin	e2d935510c	[AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0 Also factored out function to check if a boolean is an already deserialized value which does not require v_cndmask_b32 to be loaded. Added binary logical operators to its check. Differential Revision: https://reviews.llvm.org/D34500 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306439 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:25:26 +00:00
Sam Kolton	06ed4a14fd	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 15:02:23 +00:00
Hiroshi Inoue	0df653a65e	fix trivial typos, NFC succesor -> successor git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306393 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 10:35:37 +00:00
Nicolai Haehnle	7ca35760c5	AMDGPU: M0 operands to spill/restore opcodes are dead Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306375 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 08:04:13 +00:00
Matt Arsenault	8e828b87b2	AMDGPU: Setup SP/FP in callee function prolog/epilog git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306312 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 17:53:59 +00:00
Tom Stellard	8d3ca7cfeb	AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306298 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 15:56:52 +00:00
Matt Arsenault	92c7507eee	AMDGPU: Whitespace fixes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306265 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 03:01:36 +00:00
Matt Arsenault	ec6175c524	AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and require the function argument info to have more context about the function's type and environment. Also add missing test coverage for the intrinsic, and emit an error for HSA. This also uncovers that the intrinsic is broken unless there happen to be stack objects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306264 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 03:01:31 +00:00
Rafael Espindola	3a48f331ba	Remove a processFixupValue hack. The intention of processFixupValue is not to redefine the semantics of MCExpr. It is odd enough that an expression lowers to a PCRel MCExpr or not depending on what it looks like. At least it is a local hack now. I left a fix for anyone trying to figure out what producers should be producing a different expression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306200 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-24 05:12:29 +00:00
Rafael Espindola	bfb1e6dd81	Remove redundant argument. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306189 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-24 00:26:57 +00:00
Rafael Espindola	374592322d	Move Value adjustment to applyFixup. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306178 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-23 23:05:15 +00:00
Rafael Espindola	3d8b65f712	ARM: move some logic from processFixupValue to applyFixup. processFixupValue is called on every relaxation iteration. applyFixup is only called once at the very end. applyFixup is then the correct place to do last minute changes and value checks. While here, do proper range checks again for fixup_arm_thumb_bl. We used to do it, but dropped because of thumb2. We now do it again, but use the thumb2 range. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306177 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-23 22:52:36 +00:00
Tom Stellard	111d1b387d	AMDGPU/GlobalISel: Mark 32-bit G_AND as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34349 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306112 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-23 15:17:17 +00:00
David Stuttard	9066575ebe	[AMDGPU] Add intrinsics for tbuffer load and store - build error fix Variable was unused in non-debug build (used in assert) causing compile time warning and eventual build failure git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306034 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 17:15:49 +00:00
David Stuttard	dad6e61ce7	[AMDGPU] Add intrinsics for tbuffer load and store Intrinsic already existed for llvm.SI.tbuffer.store Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.* Added CodeGen tests for the 2 new variants added. Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr Differential Revision: https://reviews.llvm.org/D30687 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306031 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 16:29:22 +00:00
Sam Kolton	1f2bcd710f	[AMDGPU] SDWA: remove support for VOP2 instructions that have only 64-bit encoding Summary: Although these instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA. There are no real instructions for them, but there are pseudo instructions. Reviewers: arsenm, vpykhtin, cfang Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34403 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305999 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 12:42:14 +00:00
Sam Kolton	e88fc4046f	[AMDGPU] SDWA: add support for GFX9 in peephole pass Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026 Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34241 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305986 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 06:26:41 +00:00
Stanislav Mekhanoshin	91cd127b89	[AMDGPU] Add FP_CLASS to the add/setcc combine This is one of the nodes which also compile as v_cmp_*. Differential Revision: https://reviews.llvm.org/D34485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305970 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-21 23:46:22 +00:00

1927 Commits
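
Two of the commits listed above (e2d935510c and e764e24028) describe algebraic rewrites on a boolean that has been sign-extended from i1. The sketch below is not taken from the LLVM sources; it is a small Python model, under the assumption of 32-bit values, that checks the identities those commit messages state. All function names here are hypothetical and exist only for illustration.

```python
MASK32 = 0xFFFFFFFF  # model values as unsigned 32-bit integers (assumption)


def sext_i1_to_i32(cc: bool) -> int:
    """Sign-extend a 1-bit value: True -> all ones (-1), False -> 0."""
    return MASK32 if cc else 0


def and_with_sext(x: int, cc: bool) -> int:
    """Original form: and x, (sext cc from i1)."""
    return (x & MASK32) & sext_i1_to_i32(cc)


def select_form(x: int, cc: bool) -> int:
    """Rewritten form: select cc, x, 0."""
    return (x & MASK32) if cc else 0


def setcc_on_sext(cc: bool, const: int, pred: str) -> bool:
    """Original form: setcc (sext cc), const, pred with const in {-1, 0}."""
    lhs = sext_i1_to_i32(cc)
    rhs = const & MASK32
    return lhs == rhs if pred == "eq" else lhs != rhs


def setcc_simplified(cc: bool, const: int, pred: str) -> bool:
    """Simplified form: either cc itself or its negation, depending on
    the compare code and on which constant is compared against."""
    keeps_cc = (pred == "eq") == (const == -1)
    return cc if keeps_cc else not cc


def check_all() -> bool:
    """Exhaustively compare both forms over a few sample inputs."""
    for cc in (False, True):
        for x in (0, 1, 0x1234, MASK32):
            if and_with_sext(x, cc) != select_form(x, cc):
                return False
        for const in (-1, 0):
            for pred in ("eq", "ne"):
                if setcc_on_sext(cc, const, pred) != setcc_simplified(cc, const, pred):
                    return False
    return True
```

Calling check_all() compares the original and rewritten forms over every combination of condition, constant, and eq/ne compare code. Only eq and ne are modelled; the commit message does not list which compare codes the real combine handles.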