RPCS3/llvm - llvm - Free-Git: DMCA Non-Compliant

mirror of https://github.com/RPCS3/llvm.git synced 2026-07-21 03:05:26 -04:00

Author	SHA1	Message	Date
Nekotekina	71ca0f4f29	MCJIT: don't finalize modules on symbol lookup (workaround) This is extremely slow yet unnecessary with manual finalization. In LLVM 6 this wasn't a problem.	2019-03-28 01:33:39 +03:00
Nekotekina	beb8130d42	X86: detect pattern for variable SHL/SRL shifts (AVX2+) Remove VSELECT instruction which zeroes their result on exceeding shift amount	2019-03-28 01:33:39 +03:00
Nekotekina	8b097cc173	X86: add pattern for X86ISD::VSRAV Detect clamping ashr shift amount to max legal value	2019-03-28 01:33:39 +03:00
Nekotekina	2cbedaa71e	X86: expand detectAVGPattern() Allow all integer widths in the pattern, allow ashr Handle signed and mixed cases, allowing to replace truncation	2019-03-28 01:33:39 +03:00
Nekotekina	848fbeec7a	X86: optimize VSELECT for v16i8 with shl + sign bit test	2019-03-28 01:33:39 +03:00
Nekotekina	e020a7c8db	X86: combine inversion of VPTERNLOG	2019-03-28 01:33:39 +03:00
Nekotekina	caeae69943	X86: LowerShift: new algorithm for vector-vector shifts Emit pair of shifts of double size if possible	2019-03-28 01:33:39 +03:00
Nekotekina	f079bbcc91	X86: Fix/workaround Small Code Model for JIT Force RIP-relative jump tables and global values Force RIP-relative all zeros / all ones constants These things were causing crashes due to use of absolute addressing	2019-03-28 01:33:39 +03:00
Nekotekina	47c6cbff83	Appveyor + Travis	2019-03-28 01:33:39 +03:00
Nikita Popov	58c0bdde21	[ConstantRange] Add isWrappedSet() and isUpperSignWrapped() Split off from D59749. This adds isWrappedSet() and isUpperSignWrapped() set with the same behavior as isSignWrappedSet() and isUpperWrapped() for the respectively other domain. The methods isWrappedSet() and isSignWrappedSet() will not consider ranges of the form [X, Max] == [X, 0) and [X, SignedMax] == [X, SignedMin) to be wrapping, while isUpperWrapped() and isUpperSignWrapped() will. Also replace the checks in getUnsignedMin() and friends with method calls that implement the same logic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357112 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 19:12:09 +00:00
Teresa Johnson	9af297cc57	[CGP] Reset DT when optimizing select instructions Summary: A recent fix (r355751) caused a compile time regression because setting the ModifiedDT flag in optimizeSelectInst means that each time a select instruction is optimized the function walk in runOnFunction stops and restarts again (which was needed to build a new DT before we started building it lazily in r356937). Now that the DT is built lazily, a simple fix is to just reset the DT at this point, rather than restarting the whole function walk. In the future other places that set ModifiedDT may want to switch to just resetting the DT directly. But that will require an evaluation to ensure that they don't otherwise need to restart the function walk. Reviewers: spatel Subscribers: jdoerfert, llvm-commits, xur Tags: #llvm Differential Revision: https://reviews.llvm.org/D59889 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357111 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:44:25 +00:00
Jessica Paquette	8647448aa7	[opt-viewer] Teach optrecord.py about !Failure tags WarnMissedTransforms.cpp produces remarks that use !Failure tags. These weren't supported in optrecord.py, so if you encountered one in any of the tools, the tool would crash. Add them as a type of missed optimization. Differential Revision: https://reviews.llvm.org/D59895 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357110 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:35:04 +00:00
Eli Friedman	953adb2fe4	[ARM] Don't confuse the scheduler for very large VLDMDIA etc. ARMBaseInstrInfo::getNumLDMAddresses is making bad assumptions about the memory operands of load and store-multiple operations. This doesn't really fix the problem properly, but it's enough to prevent crashing, at least. Fixes https://bugs.llvm.org/show_bug.cgi?id=41231 . Differential Revision: https://reviews.llvm.org/D59834 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357109 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:33:30 +00:00
Amara Emerson	758238e94e	[AArch64][GlobalISel] Make G_PHI of v2s64, v4s32, v2s32 legal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357108 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:31:46 +00:00
Nikita Popov	faf8d9472d	[ConstantRange] Rename isWrappedSet() to isUpperWrapped() Split out from D59749. The current implementation of isWrappedSet() doesn't do what it says on the tin, and treats ranges like [X, Max] as wrapping, because they are represented as [X, 0) when using half-inclusive ranges. This also makes it inconsistent with the semantics of isSignWrappedSet(). This patch renames isWrappedSet() to isUpperWrapped(), in preparation for the introduction of a new isWrappedSet() method with corrected behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357107 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:19:33 +00:00
Jessica Paquette	906b31a972	[opt-viewer] Make filter_=None by default in get_remarks and gather_results Right now, if you try to use optdiff.py on any opt records, it will fail because its calls to gather_results weren't updated to support filtering. Since filters are supposed to be optional, this makes them None by default in get_remarks and in gather_results. This allows other tools that don't support filtering to still use the functions as is. Differential Revision: https://reviews.llvm.org/D59894 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357106 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:14:32 +00:00
Matt Arsenault	53202c3a19	RegPressure: Fix crash on blocks with only dbg_value If there were only dbg_values in the block, recede would hit the beginning of the block and try to use that dbg_value as a real instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357105 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:14:02 +00:00
Nikita Popov	a600eb60fd	[InstCombine] Use uadd.sat and usub.sat for canonicalization Start using the uadd.sat and usub.sat intrinsics for the existing canonicalizations. These intrinsics should optimize better than expanded IR, have better handling in the X86 backend and should be no worse than expanded IR in other backends, as far as we know. rL357012 already introduced use of uadd.sat for the add+umin pattern. Differential Revision: https://reviews.llvm.org/D58872 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357103 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:56:15 +00:00
Amara Emerson	a3e702346e	[GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357101 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:47:42 +00:00
Clement Courbet	3cf6f94e45	[X86MacroFusion][NFC] Add a bulldozer test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357099 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:44:16 +00:00
Matt Arsenault	50359ffe2c	Reapply "AMDGPU: Scavenge register instead of findUnusedReg" This reapplies r356149, using the correct overload of findUnusedReg which passes the current iterator. This worked most of the time, because the scavenger iterator was moved at the end of the frame index loop in PEI. This would fail if the spill was the first instruction. This was further hidden by the fact that the scavenger wasn't passed in for normal frame index elimination. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357098 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:31:29 +00:00
Matt Arsenault	5503f81d14	AMDGPU: Add testcase I meant to merge into r357093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357097 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:31:26 +00:00
Craig Topper	35912b770f	[X86] Add post-isel pseudos for rotate by immediate using SHLD/SHRD Haswell CPUs have special support for SHLD/SHRD with the same register for both sources. Such an instruction will go to the rotate/shift unit on port 0 or 6. This gives it 1 cycle latency and 0.5 cycle reciprocal throughput. When the register is not the same, it becomes a 3 cycle operation on port 1. Sandybridge and Ivybridge always have 1 cycle latency and 0.5 cycle reciprocal throughput for any SHLD. When FastSHLDRotate feature flag is set, we try to use SHLD for rotate by immediate unless BMI2 is enabled. But MachineCopyPropagation can look through a copy and change one of the sources to be different. This will break the hardware optimization. This patch adds a pseudo instruction to hide the second source input until after register allocation and MachineCopyPropagation. I'm not sure if this is the best way to do this or if there's some other way we can make this work. Fixes PR41055 Differential Revision: https://reviews.llvm.org/D59391 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357096 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:29:34 +00:00
Quentin Colombet	5a4d441633	[PeepholeOpt] Don't stop simplifying copies on sequence of subregs This patch removes an overly conservative check that would prevent simplifying copies when the value we were tracking would go through several subregister indices. Indeed, the intent of this check was to not track values whenever we have to compose subregisters, but actually what the check was doing was bailing anytime we see a second subreg, even if that second subreg would actually be the new source of truth (as opposed to a part of that subreg). Differential Revision: https://reviews.llvm.org/D59891 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357095 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:27:56 +00:00
Sander de Smalen	d0d95f2d77	[AArch64][SVE] Asm: error on unexpected SVE vector register type suffix This patch fixes an assembler bug that allowed SVE vector registers to contain a type suffix when not expected. The SVE unpredicated movprfx instruction is the only instruction affected. The following are examples of what was previously valid: movprfx z0.b, z0.b movprfx z0.b, z0.s movprfx z0, z0.s These instructions are now erroneous. Patch by Cullen Rhodes (c-rhodes) Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D59636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357094 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:23:38 +00:00
Matt Arsenault	0755a8d19c	AMDGPU: Enable the scavenger for large frames Another test is needed for the case where the scavenge fails, but there's another issue with that which needs an additional fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357093 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:14:32 +00:00
Matt Arsenault	44ee20512e	AMDGPU: Add additional MIR tests for exec mask optimizations Also includes one example of how this transform is unsound. This isn't verifying the copies are used in the control flow intrinsic patterns. Also add option to disable exec mask opt pass. Since this pass is unsound, it may be useful to turn it off until it is fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357091 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:58:30 +00:00
Matt Arsenault	99267d8b98	AMDGPU: Skip debug_instr when collapsing end_cf Based on how these are inserted, I doubt this was causing a problem in practice. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357090 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:58:27 +00:00
Matt Arsenault	502b049dcb	AMDGPU: Fix missing scc implicit def on s_andn2_b64_term Introduce new helper class to copy properties directly from the base instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357089 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:58:22 +00:00
Mikhail R. Gadelha	86a3d313e8	New methods to check for under-/overflow in the SMT API Summary: Added methods to check for under-/overflow in additions, subtractions, signed divisions/modulus, negations, and multiplications. Reviewers: ddcc, gou4shi1 Reviewed By: ddcc, gou4shi1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59796 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357088 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:54:12 +00:00
Matt Arsenault	9859759f10	PEI: Delay checking requiresFrameIndexReplacementScavenging Currently this is called before the frame size is set on the function. For AMDGPU, the scavenger is used for large frames where part of the offset needs to be materialized in a register, so estimating the frame size is useful for knowing whether the scavenger is useful. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357087 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:37:31 +00:00
Andrea Di Biagio	77362bcbc4	[MCA] Fix -Wparentheses warning breaking the -Werror build. Warning was introduced at r357074. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357085 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:22:36 +00:00
Matt Arsenault	68048d45d3	AMDGPU: Don't hardcode num defs for MUBUF instructions This shouldn't change anything since the no-ret atomics are selected later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357084 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:12:29 +00:00
Matt Arsenault	4710131927	MIR: Freeze reserved regs after parsing everything The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357083 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:12:26 +00:00
Matt Arsenault	17f72131c2	AMDGPU: wave_barrier is not isBarrier This is not a control flow instruction, so should not be marked as isBarrier. This fixes a verifier error if followed by unreachable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357081 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 15:54:45 +00:00
Yonghong Song	85ad8db97e	[BPF] use std::map to ensure consistent output The .BTF.ext FuncInfoTable and LineInfoTable contain information organized per ELF section. Current definition of FuncInfoTable/LineInfoTable is: std::unordered_map<uint32_t, std::vector<BTFFuncInfo>> FuncInfoTable std::unordered_map<uint32_t, std::vector<BTFLineInfo>> LineInfoTable where the key is the section name offset in the string table. The unordered_map may cause the order of section output to differ across platforms. The same applies to the unordered map definition of std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>> DataSecEntries where BTF_KIND_DATASEC entries may have different ordering on different platforms. This patch fixes the issue by using std::map. Test static-var-derived-type.ll is modified to generate two DataSec's which will ensure the ordering is the same for all supported platforms. Signed-off-by: Yonghong Song <yhs@fb.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357077 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 15:45:27 +00:00
Clement Courbet	88220ed3cb	[X86MacroFusion][NFC] Improve macrofusion testing. Add negative tests. Add arithmetic/inc/cmp/and macrofusion tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357076 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 15:43:03 +00:00
Andrea Di Biagio	15b78418a4	[MCA][Pipeline] Don't visit stages in reverse order when calling method cycleEnd(). NFCI There is no reason why stages should be visited in reverse order. This patch allows the definition of stages that push instructions forward from their cycleEnd() routine. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357074 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 15:41:53 +00:00
Matt Arsenault	f46cbbae71	AMDGPU: Fix areLoadsFromSameBasePtr for DS atomics The offset operand index is different for atomics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357073 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 15:41:00 +00:00
Nico Weber	b23265474e	gn build: Merge r357047 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357071 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 15:10:47 +00:00
Nirav Dave	30d9733443	[DAGCombiner] Unify Lifetime and memory Op aliasing. Rework BaseIndexOffset and isAlias to fully work with lifetime nodes and fold in lifetime alias analysis. This is mostly NFC. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59794 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357070 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 14:14:46 +00:00
Nirav Dave	269bbd151f	[DAGCombine] Refactor GatherAllAliases. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357069 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 14:14:35 +00:00
Hans Wennborg	939c183841	Re-commit r355490 "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default" Original commit by Ayonam Ray. This commit adds a regression test for the issue discovered in the previous commit: that the range check for the jump table can only be omitted if the fall-through destination of the jump table is unreachable, which isn't necessarily true just because the default of the switch is unreachable. This addresses the missing optimization in PR41242. > During the lowering of a switch that would result in the generation of a > jump table, a range check is performed before indexing into the jump > table, for the switch value being outside the jump table range and a > conditional branch is inserted to jump to the default block. In case the > default block is unreachable, this conditional jump can be omitted. This > patch implements omitting this conditional branch for unreachable > defaults. > > Differential Revision: https://reviews.llvm.org/D52002 > Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357067 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 14:10:11 +00:00
Dmitry Preobrazhensky	7d583cafa6	Revert of 357063 [AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodes Reason: the change was mistakenly committed before review git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357066 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 13:49:52 +00:00
Kevin P. Neal	aaa3361678	The IR verifier currently supports the constrained floating point intrinsics, but the implementation is hard to extend. It doesn't currently have an easy way to support intrinsics that, for example, lack a rounding mode. This will be needed for impending new constrained intrinsics. This code is split out of D55897 <https://reviews.llvm.org/D55897>, which itself was split out of D43515 <https://reviews.llvm.org/D43515>. Reviewed by: arsenm Differential Revision: http://reviews.llvm.org/D59830 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357065 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 13:30:57 +00:00
Sander de Smalen	26882c9d25	[AArch64] NFC: Cleanup isAArch64FrameOffsetLegal Cleanup isAArch64FrameOffsetLegal by: - Merging the large switch statement to reuse AArch64InstrInfo::getMemOpInfo(). - Using AArch64InstrInfo::getUnscaledLdSt() to determine whether an instruction has an unscaled variant. - Simplifying the logic that calculates the offset to fit the immediate. Reviewers: paquette, evandro, eli.friedman, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357064 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 13:16:19 +00:00
Dmitry Preobrazhensky	f49a51bff0	[AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodes See bug 40917: https://bugs.llvm.org/show_bug.cgi?id=40917 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D59305 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357063 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 13:07:41 +00:00
Simon Pilgrim	1ac40fdf4d	[X86][SSE] Add shuffle test case for PR41249 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357062 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 11:21:09 +00:00
Sander de Smalen	9696991bd8	[AArch64] Adds cases for LDRSHWui and LDRSHXui to getMemOpInfo This patch also adds cases PRFUMi and PRFMui. This change was discussed in https://reviews.llvm.org/D59635. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357059 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 10:39:03 +00:00
Andrew Ng	a7d2ce0679	[Support] MemoryBlock size should reflect the requested size This patch mirrors the change made to the Unix equivalent in r351916. This in turn fixes bugs related to the use of FileOutputBuffer to output to "-", i.e. stdout, on Windows. Differential Revision: https://reviews.llvm.org/D59663 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357058 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 10:26:21 +00:00
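
The two ConstantRange commits listed above (faf8d9472d and 58c0bdde21) separate "upper wrapped" ranges from "wrapped" ranges. The distinction can be sketched on 8-bit half-open ranges roughly as follows. `Range8`, `asSigned`, and the free functions are hypothetical stand-ins written for illustration, not LLVM's ConstantRange code, and the full and empty sets are not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit half-open range [lower, upper); illustration only,
// not LLVM's ConstantRange.
struct Range8 {
    std::uint8_t lower;
    std::uint8_t upper;
};

// Reinterpret the 8 bits as a two's-complement signed value.
constexpr int asSigned(std::uint8_t v) {
    return v < 128 ? static_cast<int>(v) : static_cast<int>(v) - 256;
}

// Treats [X, Max], stored as [X, 0), as wrapping.
constexpr bool isUpperWrapped(Range8 r) { return r.lower > r.upper; }

// Does not treat [X, 0) as wrapping.
constexpr bool isWrappedSet(Range8 r) {
    return r.lower > r.upper && r.upper != 0;
}

// Signed counterparts: [X, SignedMax] is stored as [X, SignedMin).
constexpr bool isUpperSignWrapped(Range8 r) {
    return asSigned(r.lower) > asSigned(r.upper);
}

constexpr bool isSignWrappedSet(Range8 r) {
    return asSigned(r.lower) > asSigned(r.upper) && asSigned(r.upper) != -128;
}

void checkExamples() {
    // [200, 255] is stored as [200, 0): upper wrapped, but not a wrapped set.
    assert(isUpperWrapped({200, 0}));
    assert(!isWrappedSet({200, 0}));
    // [200, 10) really crosses the unsigned boundary.
    assert(isWrappedSet({200, 10}));
    // [100, 127] is stored as [100, -128), i.e. upper bits 0x80.
    assert(isUpperSignWrapped({100, 0x80}));
    assert(!isSignWrappedSet({100, 0x80}));
    // [100, -100) crosses the signed boundary (upper bits 156).
    assert(isSignWrappedSet({100, 156}));
}
```

Calling `checkExamples()` from a test driver exercises each of the four predicates on the boundary cases named in the commit messages.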

176994 Commits
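
The [BPF] commit 85ad8db97e in the log relies on std::map iterating in ascending key order, whereas std::unordered_map gives no ordering guarantee. A minimal sketch of that property follows, assuming a table keyed by a section-name offset as the commit message describes. `FuncInfo`, `emissionOrder`, and `checkOrder` are hypothetical names for illustration, not the BTF types from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-section record; stands in for the BTF info structs.
struct FuncInfo {
    std::string label;
};

// Keyed by the section-name offset, as described in the commit message.
using FuncInfoTable = std::map<std::uint32_t, std::vector<FuncInfo>>;

// Returns the section keys in the order a writer would visit them.
// For std::map this is ascending key order regardless of insertion order.
std::vector<std::uint32_t> emissionOrder(const FuncInfoTable &table) {
    std::vector<std::uint32_t> order;
    for (const auto &entry : table)
        order.push_back(entry.first);
    return order;
}

void checkOrder() {
    FuncInfoTable table;
    // Insert out of order on purpose.
    table[30].push_back({"third"});
    table[10].push_back({"first"});
    table[20].push_back({"second"});

    const std::vector<std::uint32_t> order = emissionOrder(table);
    assert(order.size() == 3);
    assert(order[0] == 10 && order[1] == 20 && order[2] == 30);
}
```

The trade-off is O(log n) lookups instead of average O(1), which is usually acceptable when the output has to be reproducible across platforms.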