archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Hiroshi Inoue	e3b8cd6b61	fix typos in comments; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308127 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-16 08:11:56 +00:00
Matt Arsenault	078c435803	AMDGPU: Return correct type during argument lowering The type needs to be casted back to the original argument type. Fixes an assert that for some reason is only run when using -debug. Includes an additional combine to avoid test regressions from having conversions mixed with multiple Assert[SZ]ext nodes. On subtargets where i16 is legal, this was producing an i32 register with an i16 AssertZExt, truncated to i16 with another i8 AssertZExt. t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0 t3: i16 = truncate t2 t5: i16 = AssertZext t3, ValueType:ch:i8 t6: i8 = truncate t5 t7: i32 = zero_extend t6 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308082 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-15 05:52:59 +00:00
Alfred Huang	1356a150af	[AMDGPU] Do not insert an instruction into worklist twice in movetovalu In moveToVALU(), move to vector ALU is performed, all instrs in the use chain will be visited. We do not want the same node to be pushed to the visit worklist more than once. Differential Revision: https://reviews.llvm.org/D34726 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308039 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 17:56:55 +00:00
Matt Arsenault	f9915c27c2	AMDGPU: Detect kernarg segment pointer This is necessary to pass the kernarg segment pointer to callee functions. Also don't unconditionally enable for kernels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307978 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 00:11:13 +00:00
Stanislav Mekhanoshin	9fc15af9b2	[AMDGPU] fcanonicalize optimization for GFX9+ Since GFX9 supports denorm modes for v_min_f32/v_max_f32, it is possible to further optimize fcanonicalize and remove it if applied to min/max given their operands are known not to be an sNaN or that sNaNs are not supported. Additionally we can remove fcanonicalize if denorms are supported for the VT and we know that its argument is never a NaN. Differential Revision: https://reviews.llvm.org/D35335 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307976 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-13 23:59:15 +00:00
Matt Arsenault	a20c1d0cec	AMDGPU: Annotate call graph with used features Previously this wouldn't detect used features indirectly used in callee functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307967 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-13 21:43:42 +00:00
Matt Arsenault	ffac88a158	AMDGPU: Fix converting unanalyzable global loads to SMRD Not all memory dependence queries succeed, so this needs to be conservative if it fails. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307861 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 23:06:18 +00:00
Stanislav Mekhanoshin	16be511cb4	[AMDGPU] fcanonicalize elimination optimization We are using multiplication by 1.0 to flush denormals and quiet sNaNs. That is possible to omit this multiplication if source of the fcanonicalize instruction is known to be flushed/quieted, i.e. if it comes from another instruction known to do the normalization and we are using IEEE mode to quiet sNaNs. Differential Revision: https://reviews.llvm.org/D35218 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307848 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 21:20:28 +00:00
Konstantin Zhuravlyov	8f85685860	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307722 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 22:23:00 +00:00
Matt Arsenault	d380c14b7a	AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307576 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 20:04:35 +00:00
Matt Arsenault	a038a8340c	AMDGPU: Allow SIShrinkInstructions to work in non-SSA Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307575 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-10 19:53:57 +00:00
Sean Fertile	471398ffea	Extend memcpy expansion in Transform/Utils to handle wider operand types. Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307346 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 02:00:06 +00:00
Matt Arsenault	8763b3ac42	AMDGPU: Add macro fusion schedule DAG mutation Try to increase opportunities to shrink vcc uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307313 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:57:05 +00:00
Matt Arsenault	0f915c6a85	AMDGPU: Remove unnecessary IR from MIR tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307311 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:56:57 +00:00
Stanislav Mekhanoshin	71b4fe4228	[AMDGPU] Always use rcp + mul with fast math Regardless of relaxation options such as -cl-fast-relaxed-math we are producing rather long code for fdiv via amdgcn_fdiv_fast intrinsic. This intrinsic is used to replace fdiv with 2.5ulp metadata and does not handle denormals, thus believed to be fast. An fdiv instruction can also have fast math flag either by itself or together with fpmath metadata. Clang used with a relaxation flag always produces both metadata and fast flag: %div = fdiv fast float %v, %0, !fpmath !12 !12 = !{float 2.500000e+00} Current implementation ignores fast flag and favors metadata. An instruction with just fast flag would be lowered to a fastest rcp + mul, but that never happens in practice because of described mutual clang and BE behavior. This change allows an "fdiv fast" to be always lowered as rcp + mul. Differential Revision: https://reviews.llvm.org/D34844 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307308 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 20:34:21 +00:00
David Stuttard	3b312dd635	[RegisterCoalescer] Fix for SubRange join unreachable Summary: During remat, some subranges might end up having invalid segments which caused problems for later coalescing. Added in a check to remove segments that are invalidated as part of the remat. See http://llvm.org/PR33524 Subscribers: MatzeB, qcolombet Differential Revision: https://reviews.llvm.org/D34391 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307247 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-06 10:07:57 +00:00
Alexander Timofeev	f9e9586c80	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307097 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 17:32:00 +00:00
NAKAMURA Takumi	0a256123a4	Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default" It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307054 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-04 02:14:18 +00:00
Alexander Timofeev	0f9ec97238	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307026 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-03 14:54:11 +00:00
Richard Smith	638ba5afb5	Fix ODR violations due to abuse of LLVM_YAML_IS_(FLOW_)?SEQUENCE_VECTOR This is a short-term fix for PR33650 aimed to get the modules build bots green again. Remove all the places where we use the LLVM_YAML_IS_(FLOW_)?SEQUENCE_VECTOR macros to try to locally specialize a global template for a global type. That's not how C++ works. Instead, we now centrally define how to format vectors of fundamental types and of string (std::string and StringRef). We use flow formatting for the former cases, since that's the obvious right thing to do; in the latter case, it's less clear what the right choice is, but flow formatting is really bad for some cases (due to very long strings), so we pick block formatting. (Many of the cases that were using flow formatting for strings are improved by this change.) Other than the flow -> block formatting change for some vectors of strings, this should result in no functionality change. Differential Revision: https://reviews.llvm.org/D34907 Corresponding updates to clang, clang-tools-extra, and lld to follow. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306878 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-30 20:56:57 +00:00
Matt Arsenault	c278dccfd0	AMDGPU: Remove SITypeRewriter This was an old workaround for using v16i8 in some old intrinsics for resource descriptors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306603 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 21:38:50 +00:00
Stanislav Mekhanoshin	a143b4a4f3	Fold fneg and fabs like multiplications Given no NaNs and no signed zeroes it folds: (fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X)) (fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X) Differential Revision: https://reviews.llvm.org/D34579 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306592 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 20:25:50 +00:00
Stanislav Mekhanoshin	8b38a13919	[AMDGPU] Add pattern for v_alignbit_b32 with immediate If immediate in shift is less than 32 we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306500 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 02:52:39 +00:00
Stanislav Mekhanoshin	a5e3faf5db	Allow to truncate left shift with non-constant shift amount That is pretty common for clang to produce code like (shl %x, (and %amt, 31)). In this situation we can still perform trunc (shl) into shl (trunc) conversion given the known value range of shift amount. Differential Revision: https://reviews.llvm.org/D34723 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306499 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 02:37:11 +00:00
Stanislav Mekhanoshin	040f338ab8	[AMDGPU] Add 2 new alignbit patterns Differential Revision: https://reviews.llvm.org/D34655 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306449 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 19:10:47 +00:00
Stanislav Mekhanoshin	e764e24028	[AMDGPU] Simplify setcc (sext from i1 b), -1|0, cc Depending on the compare code that can be either an argument of sext or negate of it. This helps to avoid v_cndmask_b64 instruction for sext. A reversed value can be further simplified and folded into its parent comparison if possible. Differential Revision: https://reviews.llvm.org/D34545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306446 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:53:03 +00:00
Matt Arsenault	d841eae40b	RenameIndependentSubregs: Fix infinite loop Apparently this replacement can really be substituting the same as the original register. Avoid restarting the loop when there's been no change in the register uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306441 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:28:10 +00:00
Stanislav Mekhanoshin	e2d935510c	[AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0 Also factored out function to check if a boolean is an already deserialized value which does not require v_cndmask_b32 to be loaded. Added binary logical operators to its check. Differential Revision: https://reviews.llvm.org/D34500 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306439 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 18:25:26 +00:00
Sam Kolton	06ed4a14fd	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 15:02:23 +00:00
Nicolai Haehnle	7ca35760c5	AMDGPU: M0 operands to spill/restore opcodes are dead Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306375 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 08:04:13 +00:00
Matthias Braun	ea254cbf8f	ScheduleDAGInstrs: Fix fixupKills() adding too many kill flags. Remove invalid shortcut in fixupKills(): A register needs to be marked live even when we are not adding a kill flag. This is because a partially live register must not get a kill flags, but it still needs to be fully marked live when walking backwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306352 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 00:58:48 +00:00
Matt Arsenault	ab5d97fb87	RenameIndependentSubregs: Fix iterator problem Fixes bug 33597. Use of substituteRegister in the tied operand case messes up the register use iterator, causing some uses to be left unprocessed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306333 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 21:33:36 +00:00
Matt Arsenault	8e828b87b2	AMDGPU: Setup SP/FP in callee function prolog/epilog git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306312 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 17:53:59 +00:00
Tom Stellard	8d3ca7cfeb	AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306298 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 15:56:52 +00:00
Matt Arsenault	ec6175c524	AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and requires the function argument info to have more context about the function's type and environment. Also add missing test coverage for the intrinsic, and emit an error for HSA. This also uncovers that the intrinsic is broken unless there happen to be stack objects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306264 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 03:01:31 +00:00
Rafael Espindola	c88c3632e7	Add missing %s to RUN line. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306199 cdac9f57-aa62-4fd3-8940-286f4534e8a0	2017-06-24 04:41:39 +00:00
Rafael Espindola	82693db150	Test the object file creation too. This should really be a llvm-mc test, but the parser is broken. See PR33579 for the parser bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306198 cdac9f57-aa62-4fd3-8940-286f4534e8a0	2017-06-24 04:31:45 +00:00
Tom Stellard	111d1b387d	AMDGPU/GlobalISel: Mark 32-bit G_AND as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34349 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306112 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-23 15:17:17 +00:00
David Stuttard	dad6e61ce7	[AMDGPU] Add intrinsics for tbuffer load and store Intrinsic already existed for llvm.SI.tbuffer.store Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.* Added CodeGen tests for the 2 new variants added. Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr Differential Revision: https://reviews.llvm.org/D30687 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306031 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 16:29:22 +00:00
Sam Kolton	1f2bcd710f	[AMDGPU] SDWA: remove support for VOP2 instructions that have only 64-bit encoding Summary: Despite that this instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA. There are no real instructions for them, but there are pseudo instructions. Reviewers: arsenm, vpykhtin, cfang Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34403 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305999 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 12:42:14 +00:00
Sam Kolton	e88fc4046f	[AMDGPU] SDWA: add support for GFX9 in peephole pass Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026 Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34241 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305986 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 06:26:41 +00:00
Stanislav Mekhanoshin	91cd127b89	[AMDGPU] Add FP_CLASS to the add/setcc combine This is one of the nodes which also compile as v_cmp_*. Differential Revision: https://reviews.llvm.org/D34485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305970 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-21 23:46:22 +00:00
Stanislav Mekhanoshin	6189e64739	[AMDGPU] Combine add and adde, sub and sube If one of the arguments of adde/sube is zero we can fold another add/sub into it. Differential Revision: https://reviews.llvm.org/D34374 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305964 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-21 22:30:01 +00:00
Stanislav Mekhanoshin	1a1f544263	[AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc This simplification allows to avoid generating v_cndmask_b32 to serialize condition code between compare and use. Differential Revision: https://reviews.llvm.org/D34300 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305962 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-21 22:05:06 +00:00
Stanislav Mekhanoshin	64373efcab	[AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32 If there is an immediate operand we shall not shrink V_SUBB_U32 and V_ADDC_U32, it does not fit e32 encoding. Differential Revision: https://reviews.llvm.org/D34291 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305840 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 20:33:44 +00:00
Matt Arsenault	5516bd1387	AMDGPU: Do operand folding in program order Before it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so on the second attempt it will fail expecting to see a register to fold. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305821 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:56:32 +00:00
Matthias Braun	0cc137e051	RegisterScavenging: Followup to r305625 This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitset and just need 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305817 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:43:14 +00:00
Matt Arsenault	73854fd751	AMDGPU: Preserve undef when folding register operands If the source was a copy of an undef register, this would produce a read of an undefined register which is a verifier error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305816 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:41:31 +00:00
Stanislav Mekhanoshin	423a449bd5	[AMDGPU] Eliminate SGPR to VGPR copy when possible SGPRs are generally cheaper, so try to use them over VGPRs. Differential Revision: https://reviews.llvm.org/D34130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305815 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:32:42 +00:00
Matt Arsenault	4cabef582f	AMDGPU: Fix crash with undef vreg input operand git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305814 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:28:02 +00:00


1112 Commits