llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 11:59:56 +00:00

Author	SHA1	Message	Date
Juneyoung Lee	55591e689c	[ValueTracking] isKnownNonZero, computeKnownBits for freeze This implements support for isKnownNonZero, computeKnownBits when freeze is involved. ``` br (x != 0), BB1, BB2 BB1: y = freeze x ``` In the above program, we can say that y is non-zero. The reason is as follows: (1) If x was poison, `br (x != 0)` raised UB (2) If x was fully undef, the branch again raised UB (3) If x was non-zero partially undef, say `undef | 1`, `freeze x` will return a nondeterministic value which is also non-zero. (4) If x was just a concrete value, it is trivial Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D75808	2020-09-10 08:07:38 +09:00
dfukalov	272bea1511	[AMDGPU] Fix for folding v2.16 literals. It was found that some packed immediate operands (e.g. `<half 1.0, half 2.0>`) were incorrectly processed, so one of the two packed values was lost. Introduced a new function to check whether an immediate 32-bit operand can be folded. Converted condition about current op_sel flags value to fall-through. Fixes: SWDEV-247595 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D87158	2020-09-10 01:39:25 +03:00
Krzysztof Parzyszek	6bc3ff08f3	Mark masked.{store,scatter,compressstore} intrinsics as write-only	2020-09-09 17:28:21 -05:00
Jessica Paquette	4b96a7282b	[AArch64][GlobalISel] Share address mode selection code for memops We were missing support for the G_ADD_LOW + ADRP folding optimization in the manual selection code for G_LOAD, G_STORE, and G_ZEXTLOAD. As a result, we were missing cases like this: ``` @foo = external hidden global i32* define void @baz(i32* %0) { store i32* %0, i32** @foo ret void } ``` https://godbolt.org/z/16r7ad This functionality already existed in the addressing mode functions for the importer. So, this patch makes the manual selection code use `selectAddrModeIndexed` rather than duplicating work. This is a 0.2% geomean code size improvement for CTMark at -O3. There is one code size increase (0.1% on lencod) which is likely because `selectAddrModeIndexed` doesn't look through constants. Differential Revision: https://reviews.llvm.org/D87397	2020-09-09 15:14:46 -07:00
Florian Hahn	4ab31e1cd4	[DSE,MemorySSA] Handle atomic stores explicitly in isReadClobber. Atomic stores are modeled as MemoryDef to model the fact that they may not be reordered, depending on the ordering constraints. Atomic stores that are monotonic or weaker do not limit re-ordering, so we do not have to treat them as potential read clobbers. Note that llvm/test/Transforms/DeadStoreElimination/MSSA/atomic.ll already contains a set of negative test cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87386	2020-09-09 23:01:58 +01:00
Nikita Popov	5865bd4cc8	[DAGCombiner] Fold fmin/fmax of NaN fminnum(X, NaN) is X, fminimum(X, NaN) is NaN. This mirrors the behavior of existing InstSimplify folds. This is expected to improve the reduction lowerings in D87391, which use NaN as a neutral element. Differential Revision: https://reviews.llvm.org/D87415	2020-09-09 23:53:32 +02:00
Nikita Popov	65570213eb	[ARM] Add additional fmin/fmax with nan tests (NFC) Adding these to ARM which has both FMINNUM and FMINIMUM.	2020-09-09 23:53:32 +02:00
Amara Emerson	d476bfb259	Add REQUIRES: asserts to a test that uses an asserts only flag.	2020-09-09 14:31:12 -07:00
Amara Emerson	fcbefce153	[GlobalISel] Enable usage of BranchProbabilityInfo in IRTranslator. We weren't using this before, so none of the MachineFunction CFG edges had the branch probability information added. As a result, block placement later in the pipeline was flying blind. This is enabled only with optimizations enabled like SelectionDAG. Differential Revision: https://reviews.llvm.org/D86824	2020-09-09 14:31:12 -07:00
Nikita Popov	31aa934ebf	[X86] Add tests for minnum/maxnum with constant NaN (NFC)	2020-09-09 22:36:51 +02:00
Amara Emerson	6f86f1afef	[GlobalISel][IRTranslator] Generate better conditional branch lowering. This is a port of the functionality from SelectionDAG, which tries to find a tree of conditions from compares that are then combined using OR or AND, before using that result as the input to a branch. Instead of naively lowering the code as is, this change converts that into a sequence of conditional branches on the sub-expressions of the tree. Like SelectionDAG, we re-use the case block codegen functionality from the switch lowering utils, which causes us to generate some different code. The result of which I've tried to mitigate in earlier combine patches. Differential Revision: https://reviews.llvm.org/D86665	2020-09-09 13:16:11 -07:00
Amara Emerson	a7636dc8f8	[GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less. This combine previously tried to take sequences like: %cond = G_ICMP pred, a, b G_BRCOND %cond, %truebb G_BR %falsebb %truebb: ... %falsebb: ... and by inverting the compare predicate and swapping branch targets, delete the G_BR and instead have a single conditional branch to the falsebb. Since in an earlier patch we have a combine to fold not(icmp) into just an inverted icmp, we don't need this combine to do as much. This patch instead generalizes the combine by just looking for: G_BRCOND %cond, %truebb G_BR %falsebb %truebb: ... %falsebb: ... and then inverting the condition using a not (xor). The xor can be folded away in a separate combine. This change also lets us avoid some optimization code in the IRTranslator. I also think that deleting G_BRs in the combiner is unnecessary. That's something that targets can decide to do at selection time and could simplify generic code in future. Differential Revision: https://reviews.llvm.org/D86664	2020-09-09 13:08:16 -07:00
Hiroshi Yamauchi	dc79f6327a	[X86] Add support for using fast short rep mov for memcpy lowering. Disabled by default behind an option. Differential Revision: https://reviews.llvm.org/D86883	2020-09-09 12:46:40 -07:00
Tony	06c11c6b12	[AMDGPU] Correct gfx1031 XNACK setting documentation - gfx1031 does not support XNACK. Differential Revision: https://reviews.llvm.org/D87198	2020-09-09 19:43:02 +00:00
Jian Cai	cec86ef133	[MC] Resolve the difference of symbols in consecutive MCDataFragments Try to resolve the difference of two symbols in consecutive MCDataFragments. This is important for an idiom like "foo:instr; .if . - foo; instr; .endif" (https://bugs.llvm.org/show_bug.cgi?id=43795). Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D69411	2020-09-09 12:35:43 -07:00
Fangrui Song	cc25215e10	[gcov] Give the __llvm_gcov_ctr load instruction a name for more readable output	2020-09-09 12:34:43 -07:00
Krzysztof Parzyszek	90ea855f41	[Hexagon] Account for truncating pairs to non-pairs when widening truncates Added missing selection patterns for vpackl.	2020-09-09 14:31:52 -05:00
Sanjay Patel	09e9f6b632	[InstCombine] add tests for add/sub-of-shl; NFC	2020-09-09 15:29:08 -04:00
Fangrui Song	ce51bde5c3	[gcov] Don't split entry block; add a synthetic entry block instead The entry block is split at the first instruction where `shouldKeepInEntry` returns false. The created basic block has a br jumping to the original entry block. The new basic block causes the function label line and the other entry block lines to be covered by different basic blocks, which can affect line counts with special control flows (fork/exec in the entry block requires heuristics in llvm-cov gcov to get consistent line counts). int main() { // BB0 return 0; // BB2 (due to entry block splitting) } // BB1 is the exit block (since gcov 4.8) This patch adds a synthetic entry block (like PGOInstrumentation and GCC) and inserts an edge from the synthetic entry block to the original entry block. We can thus remove the tricky `shouldKeepInEntry` and entry block splitting. The number of basic blocks does not change, but the emitted .gcno files will be smaller because we can save one GCOV_TAG_LINES tag. // BB0 is the synthetic entry block with a single edge to BB2 int main() { // BB2 return 0; // BB2 } // BB1 is the exit block (since gcov 4.8)	2020-09-09 12:25:24 -07:00
Guillaume Chatelet	21ae6eda63	[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD) This is preparatory work to enable storing alignment for AtomicCmpXchgInst. See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168 This is the fixed version of D83375, which was submitted and reverted. Differential Revision: https://reviews.llvm.org/D87373	2020-09-09 19:10:30 +00:00
Mark de Wever	63ced1a0ad	Implements [[likely]] and [[unlikely]] in IfStmt. This is the initial part of the implementation of the C++20 likelihood attributes. It handles the attributes in an if statement. Differential Revision: https://reviews.llvm.org/D85091	2020-09-09 20:48:37 +02:00
Krzysztof Parzyszek	c8c11e1378	[DSE] Explicitly not use MSSA in testcase for now It fails for some reason, but it shouldn't stop switching to MSSA in DSE.	2020-09-09 13:45:55 -05:00
Krzysztof Parzyszek	3d66ccadfc	[DSE] Handle masked stores	2020-09-09 13:31:31 -05:00
Johannes Doerfert	a45efb6cec	Revert "[Attributor] Re-enable a run line in noalias.ll" The underlying issue is still there, just hides on most systems, even some Windows builds :( See: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/25479/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Anoalias.ll This reverts commit 2600c9e2efce1dc4c64870b00a45ae0082c685fc.	2020-09-09 13:28:22 -05:00
Ulrich Weigand	487a2ff693	[DAGCombine] Skip re-visiting EntryToken to avoid compile time explosion During the main DAGCombine loop, whenever a node gets replaced, the new node and all its users are pushed onto the worklist. Omit this if the new node is the EntryToken (e.g. if a store managed to get optimized out), because re-visiting the EntryToken and its users will not uncover any additional opportunities, but there may be a large number of such users, potentially causing compile time explosion. This compile time explosion showed up in particular when building the SingleSource/UnitTests/matrix-types-spec.cpp test-suite case on any platform without SIMD vector support. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86963	2020-09-09 19:13:46 +02:00
Mircea Trofin	7134ae5a9c	[NFC][MLInliner] Don't initialize in an assert. Since the build bots have assertions enabled, this flew under the radar.	2020-09-09 09:56:07 -07:00
Simon Pilgrim	82fd96da09	X86CallFrameOptimization.cpp - use const references where possible. NFCI.	2020-09-09 16:35:08 +01:00
Krzysztof Parzyszek	d45cf136d8	[DSE] Add testcase that uses masked loads and stores	2020-09-09 10:30:32 -05:00
Simon Pilgrim	a943063683	X86FrameLowering::adjustStackWithPops - cleanup auto usage. NFCI. Don't use auto for non-obvious types, and use const references.	2020-09-09 16:15:02 +01:00
Qiu Chaofan	6053db5a2e	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In the standard C library, both rint and nearbyint return the rounded result in the current rounding mode. But nearbyint never raises the inexact exception. On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX, raising the inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise the inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Jay Foad	d81eb83109	[AMDGPU] Simplify S_SETREG_B32 case in EmitInstrWithCustomInserter NFC.	2020-09-09 15:18:31 +01:00
Dmitry Preobrazhensky	ba5e71a51e	[AMDGPU][MC] Improved diagnostic messages for invalid registers Corrected parser to issue meaningful error messages for invalid and malformed registers. See bug 41303: https://bugs.llvm.org/show_bug.cgi?id=41303 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D87234	2020-09-09 16:44:03 +03:00
Alon Kom	183a8b689c	[MachinePipeliner] Fix II_setByPragma initialization II_setByPragma was not reset between 2 calls of the MachinePipeliner pass Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D87088	2020-09-09 13:38:35 +00:00
Denis Antrushin	a5e6ed283a	[Statepoints] Update DAG root after emitting statepoint. Since we always generate CopyToRegs for statepoint results, we must update DAG root after emitting statepoint, so that these copies are scheduled before any possible local uses. Note: getControlRoot() flushes all PendingExports, not only those we generate for relocates. If that becomes a problem, we can change it to flushing relocate exports only. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87251	2020-09-09 20:22:10 +07:00
Simon Pilgrim	a1a1eb4962	CommandLine.h - use auto const reference in ValuesClass::apply for range loop. NFCI.	2020-09-09 14:21:14 +01:00
Ronak Chauhan	7073a2ba14	Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors" This reverts commit 487a80531006add8102d50dbcce4b6fd729ab1f6. Tests fail on big endian machines.	2020-09-09 18:01:28 +05:30
Simon Pilgrim	2c3a34bc05	[KnownBits] Move SelectionDAG::computeKnownBits ISD::ABS handling to KnownBits::abs Move the ISD::ABS handling to a KnownBits::abs handler, to simplify future implementations in ValueTracking/GlobalISel.	2020-09-09 13:22:58 +01:00
Simon Pilgrim	558faf7f34	APInt.h - return directly from clearUnusedBits in single word cases. NFCI. Consistently use the same pattern of returning *this from the clearUnusedBits() call to allow us to early out from the isSingleWord() path and avoid an else statement.	2020-09-09 13:22:57 +01:00
Xing GUO	a4b3197f72	[elf2yaml] Fix dumping a debug section whose name is not recognized. If the debug section's name isn't recognized, it should be dumped as a raw content section. Reviewed By: jhenderson, grimar Differential Revision: https://reviews.llvm.org/D87346	2020-09-09 20:07:05 +08:00
David Stenberg	7d4bb5d4ca	[UnifyFunctionExitNodes] Fix Modified status for unreachable blocks If a function had at most one return block, the pass would return false regardless of whether a unified unreachable block was created. This patch fixes that by refactoring runOnFunction into two separate helper functions for handling the unreachable blocks and the return blocks respectively, as suggested by @bjope in a review comment. This was caught using the check introduced by D80916. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D85818	2020-09-09 13:36:03 +02:00
Juneyoung Lee	624f3d3053	[BuildLibCalls] Add more noundef to library functions This patch follows D85345 and adds more noundef attributes to return values/arguments of library functions that are mostly about accessing the file system or processes. A few functions like `chmod` or `times` use typedef `mode_t` and `clock_t`. They are neither struct nor union, so they cannot contain undef even if they're lowered to iN in IR. So, it is fine to add noundef to them. - clock_t's actual type is size_t (C17, 7.27.1.3), so it isn't struct or union. - For mode_t, either int or long is used in practice because programmers use bit manipulation. So, I think it is okay that it's never aggregate in practice. After this patch, the remaining library functions are those that eagerly participate in optimizations: they can be removed, reordered, or introduced by a transformation from primitive IR operations. For them, some testing is needed, since it may no longer be valid to add noundef even if the C standard says it's okay. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85894	2020-09-09 20:33:35 +09:00
Juneyoung Lee	bd9d25252a	[ValueTracking] Add UndefOrPoison/Poison-only version of relevant functions This patch adds isGuaranteedNotToBePoison and programUndefinedIfUndefOrPoison. isGuaranteedNotToBePoison will be used at D75808. The latter function is used at isGuaranteedNotToBePoison. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84242	2020-09-09 20:00:26 +09:00
Simon Pilgrim	3a3a752c6d	TrigramIndex.cpp - remove unnecessary includes. NFCI. TrigramIndex.h already includes most of these.	2020-09-09 11:38:31 +01:00
Simon Pilgrim	a2525d173c	ARMTargetParser.cpp - use auto const references in for range loops. NFCI. Fix static analysis warnings about unnecessary copies.	2020-09-09 11:38:31 +01:00
Simon Pilgrim	095c0be790	[APFloat] Fix uninitialized variable in IEEEFloat constructors Some constructors of IEEEFloat do not initialize member variable exponent. Fix it by initializing exponent with the following values: For NaNs, the `exponent` is `maxExponent+1`. For Infinities, the `exponent` is `maxExponent+1`. For Zeroes, the `exponent` is `maxExponent-1`. Patch by: @nullptr.cpp (Yang Fan) Differential Revision: https://reviews.llvm.org/D86997	2020-09-09 11:38:30 +01:00
Florian Hahn	97c9619e22	[DomTree] Use SmallVector<DomTreeNodeBase *, 4> instead of std::vector. Currently DomTreeNodeBase is using std::vector to store its children. Using SmallVector should be more efficient in terms of compile-time. A size of 4 seems to be the sweet-spot in terms of compile-time, according to http://llvm-compile-time-tracker.com/compare.php?from=9933188c90615c9c264ebb69117f09726e909a25&to=d7a801d027648877b20f0e00e822a7a64c58d976&stat=instructions This results in the following geomean improvements ``` geomean insts max rss O3 -0.31 % +0.02 % ReleaseThinLTO -0.35 % -0.12 % ReleaseLTO -0.28 % -0.12 % O0 -0.06 % -0.02 % NewPM O3 -0.36 % +0.05 % ReleaseThinLTO (link only) -0.44 % -0.10 % ReleaseLTO-g (link only): -0.32 % -0.03 % ``` I am not sure if there are any other benefits of using std::vector over SmallVector. Reviewed By: kuhar, asbirlea Differential Revision: https://reviews.llvm.org/D87319	2020-09-09 11:20:13 +01:00
Sjoerd Meijer	0297a27d72	[ARM] Fixup of a few test cases. NFC. After changing the semantics of get.active.lane.mask, I missed a few tests that should now use the tripcount instead of the backedge taken count.	2020-09-09 11:14:44 +01:00
Mirko Brkusanin	049222554b	[AMDGPU] Workaround for LDS Misalignment bug on GFX10 Add subtarget feature check to avoid using ds_read/write_b96/128 with too low alignment if a bug is present on that specific hardware. Add this "feature" to GFX 10.1.1 as it is also affected. Add global-isel test.	2020-09-09 11:46:09 +02:00
Max Kazantsev	97b38e1002	[Test] Add failing test for pr47457	2020-09-09 15:45:35 +07:00
Florian Hahn	1aaacc4182	[EarlyCSE] Explicitly require AAResultsWrapperPass. The MemorySSAWrapperPass depends on AAResultsWrapperPass and if MemorySSA is preserved but AAResultsWrapperPass is not, this could lead to a crash when updating the last user of the MemorySSAWrapperPass. Alternatively AAResultsWrapperPass could be marked preserved by GVN, but I am not sure if that would be safe. I am not sure what is required in order to preserve AAResultsWrapperPass. At the moment, it seems like a couple of passes that do similar transforms to GVN are preserving it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87137	2020-09-09 09:14:50 +01:00
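
The NaN folds named in commit 5865bd4cc8 above (fminnum(X, NaN) is X, fminimum(X, NaN) is NaN) can be illustrated with a small Python model. This is only a sketch of the two semantics, not LLVM's DAGCombiner code: the function names `fminnum` and `fminimum` are borrowed from the commit message for readability, and the model assumes quiet NaNs and deliberately ignores signed-zero ordering.

```python
import math


def fminnum(x: float, y: float) -> float:
    """minNum-style minimum: if exactly one operand is NaN, return the other.

    Illustrative model only; signed zeros and signaling NaNs are not handled.
    """
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)


def fminimum(x: float, y: float) -> float:
    """NaN-propagating minimum: any NaN operand makes the result NaN.

    Illustrative model only; signed zeros are not ordered specially here.
    """
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return min(x, y)


if __name__ == "__main__":
    x = 3.0
    # With a constant NaN operand the first form folds to X ...
    print(fminnum(x, math.nan) == x)
    # ... while the second folds to NaN regardless of X.
    print(math.isnan(fminimum(x, math.nan)))
```

Because the result no longer depends on the comparison once one operand is a known NaN, either operation can be replaced by a constant or by the other operand, which is the shape of fold the commit message describes.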

203208 Commits