llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-12-15 07:39:31 +00:00

Author	SHA1	Message	Date
Mehdi Amini	f448ca43f7	Revert "Revert "Invariant start/end intrinsics overloaded for address space"" This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530. llvm-svn: 278609	2016-08-13 23:31:24 +00:00
Mehdi Amini	2c3de3e169	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276447. llvm-svn: 278608	2016-08-13 23:27:32 +00:00
Eugene Zelenko	10633be3a7	Fix some Clang-tidy modernize-use-using and Include What You Use warnings. Differential revision: https://reviews.llvm.org/D23478 llvm-svn: 278583	2016-08-13 00:50:41 +00:00
Gor Nishanov	37bbef4980	[Coroutines]: Part6b: Add coro.id intrinsic. Summary: 1. Make coroutine representation more robust against optimizations that may duplicate instructions by introducing a coro.id intrinsic that returns a token that will get fed into coro.alloc and coro.begin. Due to coro.id returning a token, it won't get duplicated and can be used as a reliable indicator of coroutine identity when a particular coroutine call gets inlined. 2. Move last three arguments of coro.begin into coro.id as they will be shared if coro.begin gets duplicated. 3. doc + test + code updated to support the new intrinsic. Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23412 llvm-svn: 278481	2016-08-12 05:45:49 +00:00
David Majnemer	95fedaaedc	Use the range variant of transform instead of unpacking begin/end No functionality change is intended. llvm-svn: 278476	2016-08-12 04:32:42 +00:00
David Majnemer	9880e078f0	Use the range variant of remove_if instead of unpacking begin/end No functionality change is intended. llvm-svn: 278475	2016-08-12 04:32:37 +00:00
David Majnemer	319d420e44	Use the range variant of find/find_if instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278469	2016-08-12 03:55:06 +00:00
David Majnemer	85242fb9f9	Use the range variant of find instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433	2016-08-11 22:21:41 +00:00
David Majnemer	5423e4bff5	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Duncan P. N. Exon Smith	c0211aa6e9	IR: Don't cast the end iterator to Instruction* End iterators are usually sentinels, not actually Instruction* at all. Stop casting to it just to get an iterator back. There is likely no observable functionality change here right now (although this is relying on UB, I doubt it was triggering anything), but I'll be removing the cast soon. llvm-svn: 278346	2016-08-11 15:45:04 +00:00
Sanjoy Das	1dda66d272	[Statepoints] Minor cosmetic change; NFC The verification failure message was missing a space. llvm-svn: 278309	2016-08-11 00:56:46 +00:00
Chandler Carruth	28e1aa2338	[x86] Fix a bug in the auto-upgrade from r276416 where we failed to give a sufficiently low alignment for the IR load created. There is no test case because we don't have any test cases for the IR produced by the autoupgrade, only the x86 assembly, and it happens that the x86 assembly for this intrinsic as it is tested in the autoupgrade path just happens to not produce a separate load instruction where we might have observed the alignment. I'm going to follow up on the original commit to suggest getting IR-level testing in addition to the asm level testing here so that we can see and test these kinds of issues. We might never get an x86 instruction out with an alignment constraint, but we could still miscompile code by folding against the alignment marked on (or inferred for in this case) the load. llvm-svn: 278203	2016-08-10 07:41:26 +00:00
Vedant Kumar	b4e8531eb6	[IR] Remove some unused #includes (NFC) I needed a reader-writer lock for a downstream project and noticed that llvm has one. Function.cpp is the only file in-tree that refers to it. To anyone reading this: are you using RWMutex in out-of-tree code? Maybe it's not worth keeping around any more... Since we're not actually using RWMutex here, remove the #include (and a few other stale headers while we're at it). llvm-svn: 278178	2016-08-09 23:14:37 +00:00
Sean Silva	beb273cb73	Consistently use ModuleAnalysisManager Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278078	2016-08-09 00:28:38 +00:00
Sean Silva	11e71061b1	Consistently use FunctionAnalysisManager Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278077	2016-08-09 00:28:15 +00:00
Benjamin Kramer	a733725b3a	Move helpers into anonymous namespaces. NFC. llvm-svn: 277916	2016-08-06 11:13:10 +00:00
Gor Nishanov	d04999a10d	Part 4c: Coroutine Devirtualization: Devirtualize coro.resume and coro.destroy. Summary: This is the 4c patch of the coroutine series. CoroElide pass now checks if PostSplit coro.begin is referenced by coro.subfn.addr intrinsics. If so replace coro.subfn.addrs with an appropriate coroutine subfunction associated with that coro.begin. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization <= we are here 5.Add CGSCC restart trigger + tests. 6.Add coroutine heap elision + tests. 7.Add the rest of the logic (split into more patches) Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23229 llvm-svn: 277908	2016-08-06 02:16:35 +00:00
David Callahan	2001723a88	[AutoFDO] Fix handling of empty profiles Summary: If a profile has no samples for a function, then the function "entry count" is set to the value 0. Several places in the code test that if the Function::getEntryCount is defined at all. Here we change to treat a 0 entry count the same as undefined. In particular, this fixes a problem in getLayoutSuccessorProbThreshold in MachineBlockPlacement.cpp where we use a different and inferior heuristic for laying out basic blocks. Reviewers: danielcdh, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23082 llvm-svn: 277849	2016-08-05 18:38:19 +00:00
David Majnemer	cac831b368	[coroutines] Part 4[ab]: Coroutine Devirtualization: Lower coro.resume and coro.destroy. This is the fourth patch in the coroutine series. CoroEarly pass now lowers coro.resume and coro.destroy intrinsics by replacing them with an indirect call to an address returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate function address. Patch by Gor Nishanov! Differential Revision: https://reviews.llvm.org/D22998 llvm-svn: 277765	2016-08-04 20:30:07 +00:00
Duncan P. N. Exon Smith	6a8d2b6215	IR: Drop uniquing when an MDNode Value operand is deleted This is a fix for PR28697. An MDNode can indirectly refer to a GlobalValue, through a ConstantAsMetadata. When the GlobalValue is deleted, the MDNode operand is reset to `nullptr`. If the node is uniqued, this can lead to a hard-to-detect cache invalidation in a Metadata map that's shared across an LLVMContext. Consider: 1. A map from Metadata* to `T` called RemappedMDs. 2. A node that references a global variable, `!{i1* @GV}`. 3. Insert `!{i1* @GV} -> SomeT` in the map. 4. Delete `@GV`, leaving behind `!{null} -> SomeT`. Looking up the generic and uninteresting `!{null}` gives you `SomeT`, which is likely related to `@GV`. Worse, `SomeT`'s lifetime may be tied to the deleted `@GV`. This occurs in practice in the shared ValueMap used since r266579 in the IRMover. Other code that handles more than one Module (with different lifetimes) in the same LLVMContext could hit it too. The fix here is a partial revert of r225223: in the rare case that an MDNode operand is a ConstantAsMetadata (i.e., wrapping a node from the Value hierarchy), drop uniquing if it gets replaced with `nullptr`. This changes step #4 above to leave behind `distinct !{null} -> SomeT`, which can't be confused with the generic `!{null}`. In theory, this can cause some churn in the LLVMContext's MDNode uniquing map when Values are being deleted. However: - The number of GlobalValues referenced from uniqued MDNodes is expected to be quite small. E.g., the debug info metadata schema only references GlobalValues from distinct nodes. - Other Constants have the lifetime of the LLVMContext, whose teardown is careful to drop references before deleting the constants. As a result, I don't expect a compile time regression from this change. llvm-svn: 277625	2016-08-03 18:19:43 +00:00
Sanjoy Das	7931b59359	[Verifier] Disallow illegal ptr<->int casts in ConstantExprs This should have been a part of rL277085, but I hadn't considered this case. llvm-svn: 277413	2016-08-02 02:55:57 +00:00
Sanjoy Das	2f2a2a4518	Tie the Verifier class to a Module; NFCI Summary: This commit changes the Verifier class to accept a Module via the constructor to make it obvious that a specific instance of the class is only intended to work with a specific module. The `updateModule` setter (despite being private) was making this fact less transparent. There are fields in the `Verifier` class like `DeoptimizeDeclarations` and `GlobalValueVisited` which are module specific, so a given Verifier instance will not in fact work across multiple modules today. This change just makes that more obvious. The motivation is to make it easy to get to the datalayout of the module unambiguously. That is required to verify that `inttoptr` and `ptrtoint` constant expressions are well typed in the face of non-integral pointer types. Reviewers: dexonsmith, bkramer, majnemer, chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23040 llvm-svn: 277409	2016-08-02 01:34:50 +00:00
David Majnemer	0e0bcfac37	[Verifier] Resume instructions can only be in functions w/ a personality This fixes PR28799. llvm-svn: 277360	2016-08-01 18:06:34 +00:00
Amjad Aboud	7c112cd426	Fixed "copy-paste" mistake from revision 255245. llvm-svn: 277290	2016-07-31 14:41:50 +00:00
Tim Northover	dda86274a2	CodeGen: add new "intrinsic" MachineOperand kind. This will be used during GlobalISel, where we need a more robust and readable way to write tests than a simple immediate ID. llvm-svn: 277209	2016-07-29 20:32:59 +00:00
Justin Lebar	b1ec783712	Revert "Don't invoke getName() from Function::isIntrinsic().", rL276942. This broke some out-of-tree AMDGPU tests that relied on the old behavior wherein isIntrinsic() would return true for any function that starts with "llvm.". And in general that change will not play nicely with out-of-tree backends. llvm-svn: 277087	2016-07-28 23:58:15 +00:00
Sanjoy Das	76a6f34a7e	[IR] Introduce a non-integral pointer type Summary: This change adds a `ni` specifier in the `datalayout` string to denote pointers in some given address spaces as "non-integral", and adds some typing rules around these special pointers. Reviewers: majnemer, chandlerc, atrick, dberlin, eli.friedman, tstellarAMD, arsenm Subscribers: arsenm, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22488 llvm-svn: 277085	2016-07-28 23:43:38 +00:00
Justin Lebar	c1a3abfb94	Don't invoke getName() from Function::isIntrinsic(). Summary: getName() involves a hashtable lookup, so is expensive given how frequently isIntrinsic() is called. (In particular, many users cast to IntrinsicInstr or one of its subclasses before calling getIntrinsicID().) This has an incidental functional change: Before, isIntrinsic() would return true for any function whose name started with "llvm.", even if it wasn't properly an intrinsic. The new behavior seems more correct to me, because it's strange to say that isIntrinsic() is true, but getIntrinsicId() returns "not an intrinsic". Some callers want the old behavior -- they want to know whether the caller is a recognized intrinsic, or might be one in some other version of LLVM. For them, we added Function::hasLLVMReservedName(), which checks whether the name starts with "llvm.". This change is good for a 1.5% e2e speedup compiling a large Eigen benchmark. Reviewers: bogner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22065 llvm-svn: 276942	2016-07-27 23:46:57 +00:00
Anna Thomas	70e110c227	Add invariant start call creation in IRBuilder. NFC Differential Revision: https://reviews.llvm.org/D22700 llvm-svn: 276471	2016-07-22 20:57:23 +00:00
Anna Thomas	6f5ce86e80	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any address space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: apilipenko, reames Subscribers: llvm-commits llvm-svn: 276447	2016-07-22 17:49:40 +00:00
Simon Pilgrim	95ed20cecf	[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied) As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be submitted shortly). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276416	2016-07-22 13:58:44 +00:00
Benjamin Kramer	b22be9a076	Revert "[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128" It caused PR28657. This reverts commit r276281. llvm-svn: 276405	2016-07-22 11:03:10 +00:00
Anna Thomas	a6e42b23de	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276316. llvm-svn: 276320	2016-07-21 19:06:28 +00:00
Anna Thomas	219ef36aa0	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any address space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: tstellarAMD, reames, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22519 llvm-svn: 276316	2016-07-21 18:41:44 +00:00
Simon Pilgrim	9b2c75bbd5	[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276281	2016-07-21 14:10:54 +00:00
Benjamin Kramer	750272d02a	Rename StringMap::emplace_second to try_emplace. Coincidentally this function maps to the C++17 try_emplace. Rename it for consistency with C++17 std::map. NFC. llvm-svn: 276276	2016-07-21 13:37:48 +00:00
Amaury Sechet	e2c46ac53f	Add missing import to fix the build llvm-svn: 276237	2016-07-21 04:31:38 +00:00
Amaury Sechet	e639831d0c	Expose AttributeSetNode, use it to provide aggregate getter for attribute in the C API. Summary: See D19181 for context. Reviewers: whitequark, Wallbraker, jyknight, echristo, bkramer, void Subscribers: mehdi_amini Differential Revision: http://reviews.llvm.org/D21265 llvm-svn: 276236	2016-07-21 04:25:06 +00:00
Simon Pilgrim	e2f3b489b8	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. A companion clang patch is at D22105 Differential Revision: https://reviews.llvm.org/D22106 llvm-svn: 275981	2016-07-19 15:07:43 +00:00
Matt Arsenault	4df0821f1b	Fix -Wreturn-type with gcc 4.8 and libc++ llvm-svn: 275922	2016-07-18 22:12:46 +00:00
Adam Nemet	cb89dd6834	[OptRemark,LDist] RFC: Add hotness attribute Summary: This is the first set of changes implementing the RFC from http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334 This is a cross-sectional patch; rather than implementing the hotness attribute for all optimization remarks and all passes in a patch set, it implements it for the 'missed-optimization' remark for Loop Distribution. My goal is to shake out the design issues before scaling it up to other types and passes. Hotness is computed as an integer as the multiplication of the block frequency with the function entry count. It's only printed in opt currently since clang prints the diagnostic fields directly. E.g.: remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300) A new API added is similar to emitOptimizationRemarkMissed. The difference is that it additionally takes a code region that the diagnostic corresponds to. From this, hotness is computed using BFI. The new API is exposed via an analysis pass so that it can be made dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.) This feature can all be enabled by setDiagnosticHotnessRequested in the LLVM context. If this is off, LazyBFI is not calculated (D22141) so there should be no overhead. A new command-line option is added to turn this on in opt. My plan is to switch all users of emitOptimizationRemark* to use this module instead. Reviewers: hfinkel Subscribers: rcox2, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D21771 llvm-svn: 275583	2016-07-15 17:23:20 +00:00
Justin Bogner	5a8c8a3672	IR: Sort generic intrinsics before target specific ones This splits out the intrinsic table such that generic intrinsics come first and target specific intrinsics are grouped by target. From here we can find out which target an intrinsic is for or differentiate between generic and target intrinsics. The motivation here is to make it easier to move target specific intrinsic handling out of generic code. llvm-svn: 275575	2016-07-15 16:31:37 +00:00
David Majnemer	f7d30cf67a	[IR] andIRFlags and copyIRFlags needs to handle GEP We didn't consider the inbounds flag on GEPs leading to downstream users introducing UB. This fixes PR28562. llvm-svn: 275532	2016-07-15 05:02:31 +00:00
David Majnemer	bf9b17b342	[IR] Make getIndexedOffsetInType return a signed result A GEPed offset can go negative, the result of getIndexedOffsetInType should accordingly be a signed type. llvm-svn: 275246	2016-07-13 03:42:38 +00:00
David Majnemer	1f224c132e	[ConstantFold] Don't incorrectly infer inbounds on array GEP The many levels of nesting inside the responsible code made it easy for bugs to sneak in. Flattening the logic makes it easier to see what's going on. llvm-svn: 275244	2016-07-13 03:24:41 +00:00
Craig Topper	d120449666	[AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR. llvm-svn: 275155	2016-07-12 05:27:53 +00:00
Craig Topper	22685deaf4	[X86,IR] Remove unnecessary or unused LLVMContext parameter from some of the X86 intrinsic upgrade functions. llvm-svn: 275138	2016-07-12 01:42:33 +00:00
Dehao Chen	8b4fe9a409	Fix the assertion failure caused by http://reviews.llvm.org/D22118 Summary: http://reviews.llvm.org/D22118 uses metadata to store the call count, which makes it possible for branch weights to have only one element. Also fix the assertion failure in inliner when checking the instruction type to include "invoke" instruction. Reviewers: mkuper, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22228 llvm-svn: 275079	2016-07-11 17:36:02 +00:00
David Majnemer	18cc889bef	[IR] Stop a -Wsign-compare warning from firing llvm-svn: 275077	2016-07-11 17:09:06 +00:00
Dehao Chen	0f5497429b	Implement callsite-hotness based inline cost for Sample-based PGO Summary: For sample-based PGO, using BFI to calculate callsite count is sometimes not accurate. This is because with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get a more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073	2016-07-11 16:48:54 +00:00
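
The 1/(2^10) figure in the r275073 message above can be checked with a small sketch. This is only an arithmetic model of the argument, not LLVM code: the helper name `estimatedCount` and its parameters are hypothetical, and it assumes each guarding branch is given the same static taken probability.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): models how a call-site count
// shrinks when the path to it passes through `guards` branches that a
// static heuristic each assumes are taken with probability `takenProb`.
double estimatedCount(double trueCount, int guards, double takenProb) {
  assert(guards >= 0 && "number of guarding branches cannot be negative");
  double reach = 1.0; // assumed probability of reaching the loop header
  for (int i = 0; i < guards; ++i)
    reach *= takenProb;
  return trueCount * reach;
}
```

With 1000 samples, 10 guarding branches, and a 50% static probability, `estimatedCount(1000.0, 10, 0.5)` yields 1000/1024, just under one, which is why the commit annotates the count on the call instruction instead of deriving it from BFI.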


2479 Commits
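
The range-algorithm cleanups listed above (r278469, r278433) replace "find, then compare against end()" with a membership helper. A minimal sketch of that pattern follows, using only the standard library. `isContained`, `hasIdUnpacked`, and `hasIdRange` are illustrative names chosen here; this is a stand-in for the idea, not LLVM's `is_contained` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Illustrative stand-in for a range-based membership helper.
// Returns true if Value occurs anywhere in the range R.
template <typename Range, typename T>
bool isContained(const Range &R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value) != std::end(R);
}

// Before: the call site unpacks begin/end and compares against end().
bool hasIdUnpacked(const std::vector<int> &Ids, int Id) {
  return std::find(Ids.begin(), Ids.end(), Id) != Ids.end();
}

// After: the same check expressed through the range helper.
bool hasIdRange(const std::vector<int> &Ids, int Id) {
  return isContained(Ids, Id);
}
```

Both functions return the same result for any input; the second form just names the intent and removes the repeated container expression, which matches the "no functionality change" note in those commits.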