llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-03-06 03:19:50 +00:00

Author	SHA1	Message	Date
Matthias Braun	00b30110fb	Add MIR-level outlining pass This is a patch for the outliner described in the RFC at: http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html The outliner is a code-size reduction pass which works by finding repeated sequences of instructions in a program, and replacing them with calls to functions. This is useful to people working in low-memory environments, where sacrificing performance for space is acceptable. This adds an interprocedural outliner directly before printing assembly. For reference on how this would work, this patch also includes X86 target hooks and an X86 test. The outliner is run like so: clang -mno-red-zone -mllvm -enable-machine-outliner file.c Patch by Jessica Paquette<jpaquette@apple.com>! rdar://29166825 Differential Revision: https://reviews.llvm.org/D26872 llvm-svn: 296418	2017-02-28 00:33:32 +00:00
Amaury Sechet	330a189992	Add test case for computing known bits of substraction operations. NFC llvm-svn: 296417	2017-02-28 00:15:13 +00:00
Michael Kuperstein	bbb5beaf34	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296416	2017-02-28 00:11:34 +00:00
Zachary Turner	cd226b0757	[PDB] Make streams carry their own endianness. Before the endianness was specified on each call to read or write of the StreamReader / StreamWriter, but in practice it's extremely rare for streams to have data encoded in multiple different endiannesses, so we should optimize for the 99% use case. This makes the code cleaner and more general, but otherwise has NFC. llvm-svn: 296415	2017-02-28 00:04:07 +00:00
Eugene Zelenko	7c39ec2ebd	[DebugInfo] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 296413	2017-02-27 23:43:14 +00:00
Michael Kuperstein	8a9eb97aa0	[SLP] Load sorting should not try to sort things that aren't loads. We may get a VL where the first element is a load, but the others aren't. Trying to sort such VLs can only lead to sorrow. llvm-svn: 296411	2017-02-27 23:18:11 +00:00
Dan Gohman	a0a2f9eacb	[MC] Implement the COFF directives in MCNullStreamer. This fixes -filetype=null errors introduced in r296403. llvm-svn: 296410	2017-02-27 23:10:18 +00:00
Matt Arsenault	3168453e6e	AMDGPU: Basic folds for fmed3 intrinsic Constant fold, canonicalize constants to RHS, reduce to minnum/maxnum when inputs are nan/undef. llvm-svn: 296409	2017-02-27 23:08:49 +00:00
Zachary Turner	c6ab3e91eb	Remove some code accidentally left in. llvm-svn: 296407	2017-02-27 22:57:32 +00:00
Petr Hosek	cab367b07a	[AddressSanitizer] Put shadow at 0 for Fuchsia The Fuchsia ASan runtime reserves the low part of the address space. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D30426 llvm-svn: 296405	2017-02-27 22:49:37 +00:00
Eugene Zelenko	d8db98615d	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 296404	2017-02-27 22:45:06 +00:00
Dan Gohman	48b9ccfac2	[MC] Factor out non-COFF handling of COFF-specific directives. Instead of requiring every non-COFF MCObjectStreamer to implement the COFF hooks just to do an llvm_unreachable to say that they're not supported, do the llvm_unreachable in the default implementation, as suggested by rnk in https://reviews.llvm.org/D26722. llvm-svn: 296403	2017-02-27 22:44:37 +00:00
Dan Gohman	df23c64a6a	[WebAssembly] Add some comments and tidy up whitespace. llvm-svn: 296402	2017-02-27 22:41:39 +00:00
Matt Arsenault	824e186e4d	AMDGPU: Use v_med3_{f16\|i16\|u16} llvm-svn: 296401	2017-02-27 22:40:39 +00:00
Dan Gohman	f2e8848268	[WebAssembly] Split CFG-sorting into its own pass. NFC. CFG sorting was already an independent algorithm from block/loop insertion; this change makes it more convenient to debug. llvm-svn: 296399	2017-02-27 22:38:58 +00:00
Hans Wennborg	ff792be595	Revert r296366 "[InlineFunction] add nonnull assumptions based on argument attributes" It causes miscompiles e.g. during self-host of Clang (PR32082). llvm-svn: 296398	2017-02-27 22:33:02 +00:00
Zachary Turner	be1966c92e	Add missing namespace qualifier. llvm-svn: 296397	2017-02-27 22:17:50 +00:00
Matt Arsenault	8df97c1243	AMDGPU: Support v2i16/v2f16 packed operations llvm-svn: 296396	2017-02-27 22:15:25 +00:00
Arnold Schwaighofer	bb15bf82a9	ISel: We need to notify FastIS of the IMPLICIT_DEF we created in createSwiftErrorEntriesInEntryBlock Otherwise, it will insert instructions before it. rdar://30536186 llvm-svn: 296395	2017-02-27 22:12:06 +00:00
Zachary Turner	f7fd863005	[PDB] Partial resubmit of r296215, which improved PDB Stream Library. This was reverted because it was breaking some builds, and because of incorrect error code usage. Since the CL was large and contained many different things, I'm resubmitting it in pieces. This portion is NFC, and consists of: 1) Renaming classes to follow a consistent naming convention. 2) Fixing the const-ness of the interface methods. 3) Adding detailed doxygen comments. 4) Fixing a few instances of passing `const BinaryStream& X`. These are now passed as `BinaryStreamRef X`. llvm-svn: 296394	2017-02-27 22:11:43 +00:00
Matt Arsenault	11bacdf751	Revert "DAG: Check if extract_vector_elt is legal or custom" This reverts r295782. This could potentially result in some legalization loops and I avoided the need for this. llvm-svn: 296393	2017-02-27 21:59:07 +00:00
Xin Tong	b421558f5b	Empty line. NFCI llvm-svn: 296392	2017-02-27 21:51:48 +00:00
Rong Xu	65af7567a9	[PGO] Fix a bug in reading text format value profile. Summary: Should use the Valuekind read from the profile. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits, xur Differential Revision: https://reviews.llvm.org/D30420 llvm-svn: 296391	2017-02-27 21:42:39 +00:00
Sanjay Patel	5b3ea3a127	[ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition The transform in question claims to be doing: // fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c)) ...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node for the sext/zext patterns. This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(), so I was seeing infinite loops with my draft of a patch applied. The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's not affected by this code change. Differential Revision: https://reviews.llvm.org/D30355 llvm-svn: 296389	2017-02-27 21:30:54 +00:00
Lang Hames	99fe4e4838	[Support][Error] Add a 'cantFail' utility function for known-safe calls to fallible functions. Some fallible functions (those returning Error or Expected<T>) may only fail for a subset of their inputs. For example, a "safe" square root function will succeed for all finite positive inputs: Expected<double> safeSqrt(double d) { if (d < 0 && !isnan(d) && !isinf(d)) return make_error<...>("Cannot sqrt -ve values, nans or infs"); return sqrt(d); } At a safe callsite for such a function, checking the error return value is redundant: if (auto ValOrErr = safeSqrt(42.0)) { // use *ValOrErr. } else llvm_unreachable("safeSqrt should always succeed for +ve values"); The cantFail function wraps this check and extracts the contained value, simplifying control flow: double Result = cantFail(safeSqrt(42.0)); This function should be used with care: it is a programmatic error to wrap a call with cantFail if it can in fact fail. For debug builds this will result in llvm_unreachable being called. For release builds the behavior is undefined. Use of this function is likely to be rare in library code, but more common for tool and unit-test code where inputs and mock functions may be known to be safe. llvm-svn: 296384	2017-02-27 21:09:47 +00:00
Matt Arsenault	607de29beb	AMDGPU: Add some of the new gfx9 VOP3 instructions llvm-svn: 296382	2017-02-27 21:04:41 +00:00
Simon Pilgrim	938d3567c1	[X86][SSE] Attempt to extract vector elements through target shuffles DAGCombiner already supports peeking thorough shuffles to improve vector element extraction, but legalization often leaves us in situations where we need to extract vector elements after shuffles have already been lowered. This patch adds support for VECTOR_EXTRACT_ELEMENT/PEXTRW/PEXTRB instructions to attempt to handle target shuffles as well. I've covered some basic scenarios including handling shuffle mask scaling and the implicit zero-extension of PEXTRW/PEXTRB, there is more that could be done here (that I've mentioned in TODOs) but I haven't found many cases where its worth it. Differential Revision: https://reviews.llvm.org/D30176 llvm-svn: 296381	2017-02-27 21:01:57 +00:00
Matt Arsenault	070672ec9c	AMDGPU: Support inlineasm for packed instructions Add packed types as legal so they may be used with inlineasm. Keep all operations expanded for now. llvm-svn: 296379	2017-02-27 20:52:10 +00:00
Alexey Bataev	001ecbcf5b	[SLP] Use different flags in tests for reduction ops and extra args. llvm-svn: 296376	2017-02-27 20:22:44 +00:00
Matt Arsenault	eed7a831ed	AMDGPU: Don't fold immediate if clamp/omod are set Doesn't fix any practical problems because clamp/omod are currently folded after peephole optimizer. llvm-svn: 296375	2017-02-27 20:21:31 +00:00
Matt Arsenault	e195aa940b	AMDGPU: Fold omod into instructions llvm-svn: 296372	2017-02-27 19:35:42 +00:00
Taewook Oh	b736eafd3f	[TailDuplicator] Maintain DebugLoc for branch instructions Summary: Existing implementation of duplicateSimpleBB function drops DebugLoc metadata of branch instructions during the transformation. This patch addresses this issue by making newly created branch instructions to keep the metadata of replaced branch instructions. Reviewers: qcolombet, craig.topper, aprantl, MatzeB, sanjoy, dblaikie Reviewed By: dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D30026 llvm-svn: 296371	2017-02-27 19:30:01 +00:00
Matt Arsenault	26b489ab08	AMDGPU: Add f16 to shader calling conventions Mostly useful for writing tests for f16 features. llvm-svn: 296370	2017-02-27 19:24:47 +00:00
Alexey Bataev	43f49cbc4f	[SLP] Modify test to check IR flags propagation for extra args. llvm-svn: 296369	2017-02-27 19:16:09 +00:00
Matt Arsenault	96b9e12990	AMDGPU: Add VOP3P instruction format Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368	2017-02-27 18:49:11 +00:00
Amaury Sechet	cd9c5c2f28	Refactor xaluo.ll and xmulo.ll tests. NFC llvm-svn: 296367	2017-02-27 18:32:54 +00:00
Sanjay Patel	b5ddf256ad	[InlineFunction] add nonnull assumptions based on argument attributes This was suggested in D27855: have the inliner add assumptions, so we don't lose nonnull info provided by argument attributes. This still doesn't solve PR28430 (dyn_cast), but this gets us closer. https://reviews.llvm.org/D29999 llvm-svn: 296366	2017-02-27 18:13:48 +00:00
Krzysztof Parzyszek	3924cb51e8	[Hexagon] Defs and clobbers can overlap llvm-svn: 296365	2017-02-27 18:03:35 +00:00
Xin Tong	dec22204f0	Fix a bug when unswitching on partial LIV for SwitchInst Summary: Fix a bug when unswitching on partial LIV for SwitchInst. Reviewers: hfinkel, efriedma, sanjoy Reviewed By: sanjoy Subscribers: david2050, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D29107 llvm-svn: 296363	2017-02-27 18:00:13 +00:00
Rong Xu	f5043842df	Fix comments. NFC. Change "Thin-LTO" to "ThinLTO" in the comments for consistency. llvm-svn: 296362	2017-02-27 17:59:01 +00:00
Steven Wu	4d961ac155	Fix LLVM module build Add WasmRelocs/WebAssembly.def to textual include header. llvm-svn: 296356	2017-02-27 16:56:37 +00:00
Craig Topper	093751b7fa	[X86] Use APInt instead of SmallBitVector tracking undef elements from getTargetConstantBitsFromNode and getConstVector. Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30392 llvm-svn: 296355	2017-02-27 16:15:32 +00:00
Craig Topper	27f2fe5729	[X86] Use APInt instead of SmallBitVector for tracking Zeroable elements in shuffle lowering Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30390 llvm-svn: 296354	2017-02-27 16:15:30 +00:00
Craig Topper	5eb13883a0	[X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap allocation Some of the vectors are under sized to avoid heap allocation. In one case the vector was oversized. Differential Revision: https://reviews.llvm.org/D30387 llvm-svn: 296353	2017-02-27 16:15:27 +00:00
Craig Topper	f2749ccb21	[X86] Use APInt instead of SmallBitVector for tracking undef elements in constant pool shuffle decoding Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. This will incur a minor increase in stack usage due to APInt storing the bit count separately from the data bits unlike SmallBitVector, but that should be ok. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30386 llvm-svn: 296352	2017-02-27 16:15:25 +00:00
Amaury Sechet	dd0f513601	Remove an empty line in icmp-illegal.ll . NFC llvm-svn: 296350	2017-02-27 16:09:44 +00:00
Alexey Bataev	5fe52fcb8f	[SLP] A test for a fix of PR32038. llvm-svn: 296349	2017-02-27 16:07:10 +00:00
Artur Pilipenko	fd7893d800	Loop predication expand both sides of the widened condition This is a fix for a loop predication bug which resulted in malformed IR generation. Loop invariant side of the widened condition is not guaranteed to be available in the preheader as is, so we need to expand it as well. See added unsigned_loop_0_to_n_hoist_length test for example. Reviewed By: sanjoy, mkazantsev Differential Revision: https://reviews.llvm.org/D30099 llvm-svn: 296345	2017-02-27 15:44:49 +00:00
Sjoerd Meijer	1544720770	AArch64InstPrinter: rewrite of printSysAlias This is a cleanup/rewrite of the printSysAlias function. This was not using the tablegen instruction descriptions, but was "manually" decoding the instructions. This has been replaced with calls to lookup_XYZ_ByEncoding tablegen calls. This revealed several problems. First, instruction IVAU had the wrong encoding. This was cancelled out by the parser that incorrectly matched the wrong encoding. Second, instruction CVAP was missing from the SystemOperands tablegen descriptions, so this has been added. And third, the required target features were not captured in the tablegen descriptions, so support for this has also been added. Differential Revision: https://reviews.llvm.org/D30329 llvm-svn: 296343	2017-02-27 14:45:34 +00:00
John Brawn	3b54618ff1	[ARM] LSL #0 is an alias of MOV Currently we handle this correctly in arm, but in thumb we don't which leads to an unpredictable instruction being emitted for LSL #0 in an IT block and SP not being permitted in some cases when it should be. For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the .td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to get the IT handling right. We also need to adjust the handling of MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We should also adjust it to allow SP in the same way that it is allowed in MOV rd, rn, but I haven't done that here because it looks like it would take quite a lot of work to get right. Additionally correct the selection of the 16-bit shift instructions in processInstruction, where it was checking if the two registers were equal when it should have been checking if they were low. It appears that previously this code was never executed and the 16-bit encoding was selected by default, but the other changes I've done here have somehow made it start being used. Differential Revision: https://reviews.llvm.org/D30294 llvm-svn: 296342	2017-02-27 14:40:51 +00:00

1 2 3 4 5 ...

145513 Commits