RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2024-11-27 13:40:30 +00:00

Author	SHA1	Message	Date
Matt Arsenault	078c435803	AMDGPU: Return correct type during argument lowering The type needs to be casted back to the original argument type. Fixes an assert that for some reason is only run when using -debug. Includes an additional combine to avoid test regressions from having conversions mixed with multiple Assert[SZ]ext nodes. On subtargets where i16 is legal, this was producing an i32 register with an i16 AssertZExt, truncated to i16 with another i8 AssertZExt. t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0 t3: i16 = truncate t2 t5: i16 = AssertZext t3, ValueType:ch:i8 t6: i8 = truncate t5 t7: i32 = zero_extend t6 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308082 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-15 05:52:59 +00:00
Davide Italiano	33778b7f13	[AMDGPU] Throw away more dead code. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308055 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 21:20:29 +00:00
Davide Italiano	9994117767	[AMDGPU] Garbage collect dead code. NFCI. Unbreaks the build with GCC7. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308047 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 18:47:29 +00:00
Alfred Huang	1356a150af	[AMDGPU] Do not insert an instruction into worklist twice in movetovalu In moveToVALU(), move to vector ALU is performed, all instrs in the use chain will be visited. We do not want the same node to be pushed to the visit worklist more than once. Differential Revision: https://reviews.llvm.org/D34726 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308039 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 17:56:55 +00:00
Matt Arsenault	f9915c27c2	AMDGPU: Detect kernarg segment pointer This is necessary to pass the kernarg segment pointer to callee functions. Also don't unconditionally enable for kernels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307978 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 00:11:13 +00:00
Stanislav Mekhanoshin	9fc15af9b2	[AMDGPU] fcaninicalize optimization for GFX9+ Since GFX9 supports denorm modes for v_min_f32/v_max_f32 that is possible to further optimize fcanonicalize and remove it if applied to min/max given their operands are known not to be an sNaN or that sNaNs are not supported. Additionally we can remove fcanonicalize if denorms are supported for the VT and we know that its argument is never a NaN. Differential Revision: https://reviews.llvm.org/D35335 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307976 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-13 23:59:15 +00:00
Matt Arsenault	a20c1d0cec	AMDGPU: Annotate call graph with used features Previously this wouldn't detect used features indirectly used in callee functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307967 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-13 21:43:42 +00:00
Hiroshi Inoue	ff281e5fb6	fix typos in comments and error messges; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307885 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-13 06:48:39 +00:00
Matt Arsenault	ffac88a158	AMDGPU: Fix converting unanalyzable global loads to SMRD Not all memory dependence queries succeed, so this needs to be conservative if it fails. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307861 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 23:06:18 +00:00
Stanislav Mekhanoshin	16be511cb4	[AMDGPU] fcanonicalize elimination optimization We are using multiplication by 1.0 to flush denormals and quiet sNaNs. That is possible to omit this multiplication if source of the fcanonicalize instruction is known to be flushed/quieted, i.e. if it comes from another instruction known to do the normalization and we are using IEEE mode to quiet sNaNs. Differential Revision: https://reviews.llvm.org/D35218 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307848 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 21:20:28 +00:00
Rafael Espindola	4aebf83110	Fully fix the movw/movt addend. The issue is not if the value is pcrel. It is whether we have a relocation or not. If we have a relocation, the static linker will select the upper bits. If we don't have a relocation, we have to do it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307730 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 23:18:25 +00:00
Evandro Menezes	fdda7ea9d5	[CodeGen] Rename DEBUG_TYPE to match passnames Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were absent from https://reviews.llvm.org/rL303921. Differential revision: https://reviews.llvm.org/D35231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307719 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 22:08:28 +00:00
Konstantin Zhuravlyov	2e2081eea2	Revert "AMDGPU: Do not test for SI in getIsaVersion" This reverts commit r307573. This breaks downstream test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307678 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 17:57:41 +00:00
Nirav Dave	c7acbe2ea6	Add DAG argument to canMergeStoresTo NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307583 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 20:25:54 +00:00
Matt Arsenault	d380c14b7a	AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307576 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 20:04:35 +00:00
Matt Arsenault	a038a8340c	AMDGPU: Allow SIShrinkInstructions to work in non-SSA Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307575 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:53:57 +00:00
Matt Arsenault	7231966089	AMDGPU: Remove unnecessary check for constant operands An instruction that has an immediate operand can't reach this point. This is only called for a freshly shrunk instruction, which previously couldn't have had a literal constant operand. This was also not conservative enough since it would also have had to filter other constant-like inputs like frame indexes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307574 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:33:38 +00:00
Konstantin Zhuravlyov	f392c1f922	AMDGPU: Do not test for SI in getIsaVersion SI is being tested by isa version in the first two if statements of the function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307573 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:24:05 +00:00
Simon Pilgrim	db24b6e4f7	[AMDGPU] Fix -Wimplicit-fallthrough warning. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307485 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-08 19:50:03 +00:00
Simon Pilgrim	2541a59ac3	Fix some more -Wimplicit-fallthrough warnings. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307411 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 16:40:06 +00:00
Sam Kolton	f9327929eb	[AMDGPU] Assembler: refactor convert methods (VOP3 and MIMG) Summary: Simplified converter methods for VOP3 and MIMG. Reviewers: dp, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, vpykhtin, t-tye Differential Revision: https://reviews.llvm.org/D35047 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307407 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 15:21:52 +00:00
Dmitry Preobrazhensky	c956bf87e0	[AMDGPU][mc][gfx9] Added support of op_sel/op_sel_hi for V_MAD_MIX* See https://bugs.llvm.org//show_bug.cgi?id=33595 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D35021 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307402 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 14:29:06 +00:00
Simon Pilgrim	26aa51226a	[AMDGPU] Fix -Wimplicit-fallthrough warnings. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307381 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 10:18:57 +00:00
Sean Fertile	471398ffea	Extend memcpy expansion in Transform/Utils to handle wider operand types. Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307346 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 02:00:06 +00:00
Matt Arsenault	8763b3ac42	AMDGPU: Add macro fusion schedule DAG mutation Try to increase opportunities to shrink vcc uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307313 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:57:05 +00:00
Matt Arsenault	92223c6fe5	AMDGPU: Minor cleanup of shrinking logic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307312 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:56:59 +00:00
Stanislav Mekhanoshin	71b4fe4228	[AMDGPU] Always use rcp + mul with fast math Regardless of relaxation options such as -cl-fast-relaxed-math we are producing rather long code for fdiv via amdgcn_fdiv_fast intrinsic. This intrinsic is used to replace fdiv with 2.5ulp metadata and does not handle denormals, thus believed to be fast. An fdiv instruction can also have fast math flag either by itself or together with fpmath metadata. Clang used with a relaxation flag always produces both metadata and fast flag: %div = fdiv fast float %v, %0, !fpmath !12 !12 = !{float 2.500000e+00} Current implementation ignores fast flag and favors metadata. An instruction with just fast flag would be lowered to a fastest rcp + mul, but that never happen on practice because of described mutual clang and BE behavior. This change allows an "fdiv fast" to be always lowered as rcp + mul. Differential Revision: https://reviews.llvm.org/D34844 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307308 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:34:21 +00:00
Craig Topper	6dbd34d261	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307292 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 18:39:47 +00:00
Quentin Colombet	d268a8d71a	[AMDGPU] Move GISel accessor initialization from TargetMachine to Subtarget. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307186 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-05 18:40:56 +00:00
Alexander Timofeev	f9e9586c80	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307097 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 17:32:00 +00:00
Marek Olsak	97186bff40	[AMDGPU] Fix latency of MIMG instructions Patch by cwabbott (Connor Abbott). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307081 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 14:43:38 +00:00
NAKAMURA Takumi	0a256123a4	Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default" It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307054 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 02:14:18 +00:00
Alexander Timofeev	0f9ec97238	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307026 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-03 14:54:11 +00:00
Matt Arsenault	ff0022d12c	AMDGPU: Add operand target flags serialization git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306995 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-02 23:21:48 +00:00
Hiroshi Inoue	71a28cb414	fix trivial typos; NFC suport -> support git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306968 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-02 03:24:54 +00:00
Matt Arsenault	c278dccfd0	AMDGPU: Remove SITypeRewriter This was an old workaround for using v16i8 in some old intrinsics for resource descriptors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306603 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 21:38:50 +00:00
Geoff Berry	28b3f06e1a	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 15:53:17 +00:00
Stanislav Mekhanoshin	8b38a13919	[AMDGPU] Add pattern for v_alignbit_b32 with immediate If immediate in shift is less than 32 we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306500 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 02:52:39 +00:00
Stanislav Mekhanoshin	040f338ab8	[AMDGPU] Add 2 new alignbit patterns Differential Revision: https://reviews.llvm.org/D34655 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306449 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 19:10:47 +00:00
Stanislav Mekhanoshin	e764e24028	[AMDGPU] Simplify setcc (sext from i1 b), -1\|0, cc Depending on the compare code that can be either an argument of sext or negate of it. This helps to avoid v_cndmask_b64 instruction for sext. A reversed value can be further simplified and folded into its parent comparison if possible. Differential Revision: https://reviews.llvm.org/D34545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306446 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:53:03 +00:00
Stanislav Mekhanoshin	e2d935510c	[AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0 Also factored out function to check if a boolean is an already deserialized value which does not require v_cndmask_b32 to be loaded. Added binary logical operators to its check. Differential Revision: https://reviews.llvm.org/D34500 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306439 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:25:26 +00:00
Sam Kolton	06ed4a14fd	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 15:02:23 +00:00
Hiroshi Inoue	0df653a65e	fix trivial typos, NFC succesor -> successor git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306393 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 10:35:37 +00:00
Nicolai Haehnle	7ca35760c5	AMDGPU: M0 operands to spill/restore opcodes are dead Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306375 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 08:04:13 +00:00
Matt Arsenault	8e828b87b2	AMDGPU: Setup SP/FP in callee function prolog/epilog git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306312 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 17:53:59 +00:00
Tom Stellard	8d3ca7cfeb	AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306298 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 15:56:52 +00:00
Matt Arsenault	92c7507eee	AMDGPU: Whitespace fixes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306265 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 03:01:36 +00:00
Matt Arsenault	ec6175c524	AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and requires the function argument info to have more context about the function's type and environment. Also add missing test coverage for the intrinsic, and emit an error for HSA. This also uncovers that the intrinsic is broken unless there happen to be stack objects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306264 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 03:01:31 +00:00
Rafael Espindola	3a48f331ba	Remove a processFixupValue hack. The intention of processFixupValue is not to redefine the semantics of MCExpr. It is odd enough that a expression lowers to a PCRel MCExpr or not depending on what it looks like. At least it is a local hack now. I left a fix for anyone trying to figure out what producers should be producing a different expression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306200 cdac9f57-aa62-4fd3-8940-286f4534e8a0	2017-06-24 05:12:29 +00:00
Rafael Espindola	bfb1e6dd81	Remove redundant argument. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306189 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-24 00:26:57 +00:00

1935 Commits