llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-12-12 22:17:47 +00:00

Author	SHA1	Message	Date
Venkataramanan Kumar	36b1d49191	[InstCombine] Transform 1.0/sqrt(X) * X to X/sqrt(X) These transforms will now be performed irrespective of the number of uses for the expression "1.0/sqrt(X)": 1.0/sqrt(X) * X => X/sqrt(X) X * 1.0/sqrt(X) => X/sqrt(X) We already handle more general cases, and we are intentionally not creating extra (and likely expensive) fdiv ops in IR. This pattern is the exception to the rule because we always expect the Backend to reduce X/sqrt(X) to sqrt(X), if it has the necessary (reassoc) fast-math-flags. Ref: DagCombiner optimizes the X/sqrt(X) to sqrt(X). Differential Revision: https://reviews.llvm.org/D86726	2020-09-02 08:23:48 -04:00
Sanjay Patel	4e9822e551	[VectorCombine] allow vector loads with mismatched insert type This is an enhancement to D81766 to allow loading the minimum target vector type into an IR vector with a different number of elements. In one of the motivating tests from PR16739, SLP creates <2 x float> load ops mixed with <4 x float> insert ops, so we want to handle that pattern in addition to potential oversized vectors created by the vectorizers. For now, we are assuming the insert/extract subvector with undef is free because there is no exact corresponding TTI modeling for that. Differential Revision: https://reviews.llvm.org/D86160	2020-09-02 08:11:36 -04:00
Max Kazantsev	260b1e427d	[Test] Simplify test by removing unneeded variable	2020-09-02 18:39:43 +07:00
Paul Walker	aeb7dd335b	[SVE] Don't reorder subvector/binop sequences when the resulting binop is not legal. When lowering fixed length vector operations for SVE the subvector operations are used extensively to marshall data between scalable and fixed-length vectors. This means that sequences like: extract_subvec(binop(insert_subvec(a), insert_subvec(b))) are very common. DAGCombine only checks if the resulting binop is legal or can be custom lowered when undoing such sequences. When it's custom lowering that is introducing them the result is an infinite legalise->combine->legalise loop. This patch extends the isOperationLegalOr... functions to include a "LegalOnly" parameter to restrict the check to legal operations only. Although isOperationLegal could be used it's common for the affected code paths to be visited pre and post legalisation, so the extra parameter keeps the code tidy. Differential Revision: https://reviews.llvm.org/D86450	2020-09-02 11:01:33 +01:00
Jay Foad	02f664086c	[AMDGPU] Fix offset for REL32_HI relocs The addend in a REL32 reloc needs to be adjusted to account for the offset from the PC value returned by the s_getpc instruction to the point where the reloc is applied. This was being done correctly for (GOTPC)REL32_LO but not for (GOTPC)REL32_HI. This will only make a difference if the target symbol happens to get loaded almost exactly a multiple of 4G away from the relocated instructions. Differential Revision: https://reviews.llvm.org/D86938	2020-09-02 10:55:55 +01:00
Sander de Smalen	40471e4fe8	[AArch64][SVE] Preserve full vector regs over EH edge. Unwinders may only preserve the lower 64 bits of Neon and SVE registers, as only the registers in the base ABI are guaranteed to be preserved over the exception edge. The caller will need to preserve additional registers for when the call throws an exception and the unwinder has tried to recover state. For example, in svint32_t bar(svint32_t); svint32_t foo(svint32_t x, bool err) { try { bar(x); } catch (...) { err = true; } return x; } `z0` needs to be spilled before the call to `bar(x)` and reloaded before returning from foo, as the exception handler may have clobbered z0. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84737	2020-09-02 10:54:18 +01:00
Igor Kudrin	7e0c7ec095	[DebugInfo] Emit a 1-byte value as a terminator of entries list in the name index. As stated in section 6.1.1.2, DWARFv5, p. 142: "The last entry for each name is followed by a zero byte that terminates the list. There may be gaps between the lists." The patch changes emitting a 4-byte zero value to a 1-byte one, which effectively removes the gap between entry lists, and thus saves approximately 3 bytes per name; the calculation is not exact because the total size of the table is aligned to 4. Differential Revision: https://reviews.llvm.org/D86927	2020-09-02 16:12:39 +07:00
Igor Kudrin	ab5e60fa50	[DebugInfo] Remove Dwarf5AccelTableWriter::Header::UnitLength. NFC. The member is not in use; the unit length for the table is emitted as a difference between two labels. Moreover, the type of the member might be misleading, because for DWARF64 the field should be 64 bit long. Differential Revision: https://reviews.llvm.org/D86912	2020-09-02 16:11:45 +07:00
Martin Storsjö	a68f07462a	[X86] Remove superfluous trailing semicolons, fixing warnings. NFC.	2020-09-02 11:43:27 +03:00
Simon Pilgrim	da3091b1cb	[X86][SSE] SimplifyDemandedVectorEltsForTargetNode - add general shuffle combining support This patch uses partial DemandedElts masks to further simplify target shuffle chains and finally starts making target shuffle combining part of SimplifyDemandedBits/SimplifyDemandedVectorElts. We already manage this for Depth == 0 cases, where combineX86ShuffleChain would early-out if the shuffle combined to the same op, but the patch generalizes this by manipulating the depth handling of combineX86ShufflesRecursively - calling with a new Depth = 0 and reducing the maximum shuffle combine depth accordingly. Differential Revision: https://reviews.llvm.org/D66004	2020-09-02 09:24:46 +01:00
Shinji Okumura	ad9adda18c	[Attributor] Make use of AANoUndef in AAUndefinedBehavior This patch makes it possible for AAUB to use information from AANoUndef. This is the next patch of D86983 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86984	2020-09-02 16:08:03 +09:00
Shinji Okumura	e194827980	[Attributor] Fix AANoUndef initialization When the associated value is undef, we have so far immediately indicated a pessimistic fixpoint. This patch changes the initialization to first check the attribute given in IR and to indicate an optimistic fixpoint when it is given. This change will enable us to catch, for example, the following case in AAUB. ``` call void @foo(i32 noundef undef) ``` Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86983	2020-09-02 15:40:43 +09:00
Zi Xuan Wu	ed26c15759	[RFC][Target] Add a new triple called Triple::csky Before upstreaming a new target called CSKY, add a new triple for it called Triple::csky. For now, it is a 32-bit little-endian target; details can be found in D86269. This is a part split out of D86269, which adds a new target called CSKY. Differential Revision: https://reviews.llvm.org/D86505	2020-09-02 12:46:09 +08:00
Fangrui Song	a714230414	[CMake] Remove -Wl,-allow-shlib-undefined which was added in rL221530 In GNU ld, gold and LLD, --no-allow-shlib-undefined is the default when linking an executable. The option disallows unresolved symbols in shared objects. (gold and LLD catch fewer cases than GNU ld. See D57385 for details.) See D57569 for why it is a bad idea to use --allow-shlib-undefined for executables [a]. GNU ld traditionally copied DT_NEEDED entries transitively. This was deemed not good, so GNU ld 2.22 defaulted to --no-copy-dt-needed-entries. gold and LLD always behave like --no-copy-dt-needed-entries. rL221530 added -Wl,-allow-shlib-undefined to make --no-copy-dt-needed-entries actually work in some old releases of GNU ld. Due to [a] and [b], this patch drops -Wl,-allow-shlib-undefined. [b]: In a -DBUILD_SHARED_LIBS=on build, `--as-needed --allow-shlib-undefined` can unexpectedly suppress some .dynsym entries. The issue can cause mlir-cpu-runner to fail at runtime. Note that on Debian, gcc newer than (gcc-9-20190125-2) enables --as-needed by default. See https://sourceware.org/bugzilla/show_bug.cgi?id=26551 for a reduced example. Reviewed By: mehdi_amini, echristo Differential Revision: https://reviews.llvm.org/D86839	2020-09-01 21:13:45 -07:00
Lang Hames	65c420c8ff	[ORC] Remove stray debugging output.	2020-09-01 20:53:49 -07:00
Alina Sbirlea	aa76d4b7d6	Fix build-bots. BasicAA can be freed (and it is not recomputed).	2020-09-01 20:24:15 -07:00
Lang Hames	4a706c5ba3	[ORC] Add an early out for MachOPlatform's init-scraper plugin setup. If there's no initializer symbol in the current MaterializationResponsibility then bail out without installing JITLink passes: they're going to be no-ops anyway.	2020-09-01 20:12:23 -07:00
Lang Hames	cbe0e339f3	[ORC] Fix MachOPlatform's synthetic symbol dependence registration. A think-o in the existing code meant that dependencies were never registered. This failure could lead to crashes rather than orderly error propagation if initialization dependencies failed to materialize. No test case: The bug was discovered in out-of-tree code and requires a pathologically misconfigured JIT to generate the original error that led to the crash.	2020-09-01 20:12:23 -07:00
Xing GUO	db9f7d8731	[DebugInfo] Simplify string table dumpers. This patch adds a helper function DumpStrSection to simplify the code. In addition, nonprintable chars in debug_str and debug_str.dwo sections are printed as escaped chars. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86918	2020-09-02 08:41:10 +08:00
Alina Sbirlea	d314cb9743	[MemCpyOptimizer] Preserve analyses and replace use of lambdas to get them. Summary: Analyses are preserved in MemCpyOptimizer. Get analyses before running the pass and store the pointers, instead of using lambdas and getting them every time on demand. Reviewers: lenary, deadalnix, mehdi_amini, nikic, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74494	2020-09-01 17:35:40 -07:00
Jordan Rupprecht	c2af995048	[NFC] Fix unused var in release builds. This was always unused, but the change in D86354 upgraded this to a compiler warning.	2020-09-01 16:38:24 -07:00
Varun Gandhi	99d53ed697	[ADT] Make Optional a literal type. This allows returning Optional values from constexpr contexts. Reviewed By: fhahn, dblaikie, rjmccall Differential Revision: https://reviews.llvm.org/D86354	2020-09-01 16:13:40 -07:00
Amy Kwan	2cc81d44bf	[PowerPC] Implement builtins for xvcvspbf16 and xvcvbf16spn This patch adds the builtin implementation for the xvcvspbf16 and xvcvbf16spn instructions. Differential Revision: https://reviews.llvm.org/D86795	2020-09-01 17:16:43 -05:00
Sergej Jaskiewicz	1842e5f0f3	[llvm] [unittests] Fix failing test 'FileCollectorTest.addDirectory' This fixes a regression in the test suite introduced by fad75598d272b9a5591fb7d9b591cf00cdf5022c	2020-09-02 00:54:37 +03:00
Cameron McInally	7b182e9bab	[SVE] Update INSERT_SUBVECTOR DAGCombine to use getVectorElementCount(). A small piece of the project to replace getVectorNumElements() with getVectorElementCount(). Differential Revision: https://reviews.llvm.org/D86894	2020-09-01 16:51:44 -05:00
Sergej Jaskiewicz	0975a7654a	[llvm] [unittests] Remove temporary files after they're not needed Some LLVM unit tests forget to clean up temporary files and directories. Introduce RAII classes for cleaning them up. Refactor the tests to use those classes. Differential Revision: https://reviews.llvm.org/D83228	2020-09-02 00:34:44 +03:00
Amara Emerson	e10dc0379d	Revert "Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")" This reverts commit 8693ddc74371dedc742c9f3d3e4eda1da72c13ea. Re-committing with the test requiring asserts.	2020-09-01 14:29:04 -07:00
Michael Kruse	292847821f	[LangRef] Fix condition for when a loop is considered parallel. The wording before this patch applies to llvm.mem.parallel_loop_access, not access groups. Reviewed By: mppf, hfinkel Differential Revision: https://reviews.llvm.org/D83781	2020-09-01 15:41:59 -05:00
Jordan Rupprecht	deb09864b8	Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.") This reverts commit 8ad8f484b63ca507417b58c9016d2761f2b1a1a8. It causes crashes when running `ninja check-llvm-codegen-aarch64-globalisel`, e.g. http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132/steps/test-stage1-compiler/logs/stdio. Note that the crash does not seem to reproduce in debug builds. 5ded4442520d3dbb1aa72e6fe03cddef8828c618 depends on this, so revert that too.	2020-09-01 13:31:57 -07:00
Michael Liao	192b060a21	[amdgpu] Run SROA after loop unrolling. Summary: - There are promotable `alloca`s after loop unrolling. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, nikic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84252	2020-09-01 16:09:56 -04:00
Jordan Rupprecht	e72bcb7267	[NFC] Fix unused var in release build	2020-09-01 13:05:56 -07:00
Florian Hahn	2248875209	[Loads] Add canReplacePointersIfEqual helper. This patch adds an initial, incomplete and unsound implementation of canReplacePointersIfEqual to check if a pointer value A can be replaced by another pointer value B, that are deemed to be equivalent through some means (e.g. information from conditions). Note that it is in general not sound to blindly replace pointers based on equality, for example if they are based on different underlying objects. LLVM's memory model is not completely settled as of now; see https://bugs.llvm.org/show_bug.cgi?id=34548 for a more detailed discussion. The initial version of canReplacePointersIfEqual only rejects a very specific case: replacing a pointer with a constant expression that is not dereferenceable. Such a replacement is problematic and can be restricted relatively easily without impacting most code. Using it to limit replacements in GVN/SCCP/CVP only results in small differences in 7 programs out of MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto. This patch is supposed to be an initial step to improve the current situation and the helper should be made stricter in the future. But this will require careful analysis of the impact on performance. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D85524	2020-09-01 20:57:41 +01:00
Aaron Liu	a410101daf	[LV] Interleave to expose ILP for small loops with scalar reductions. Interleave small loops that have reductions inside, which breaks dependencies and exposes ILP. This gives very significant performance improvements for some benchmarks, because small loops could be in very hot functions in real applications. Differential Revision: https://reviews.llvm.org/D81416	2020-09-01 19:47:32 +00:00
Craig Topper	d3e914ec24	[MachineCopyPropagation] In isNopCopy, check the destination registers match in addition to the source registers. Previously, if the sources matched, we asserted that the destinations matched. But GPR <-> mask register copies on X86 can violate this since we use the same K-registers for multiple sizes. Fixes this ISPC issue: https://github.com/ispc/ispc/issues/1851 Differential Revision: https://reviews.llvm.org/D86507	2020-09-01 12:44:32 -07:00
Arthur Eubanks	81eaf47f84	[Bindings] Add LLVMAddInstructionSimplifyPass Reviewed By: sroland Differential Revision: https://reviews.llvm.org/D86764	2020-09-01 12:38:49 -07:00
Lang Hames	4dfbee4abb	[ORC] Add unit test for HasMaterializationSideEffectsOnly failure behavior.	2020-09-01 12:34:34 -07:00
Owen Anderson	f252e214b7	Revert "Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain"" This reverts commit bc9a29b9ee6ade4894252b1470977142c32b4602. The reasoning that this patch was wrong was itself incorrect (see discussion on llvm-commits). This patch does seem to be exposing a latent SVE code generation bug on non-public tests, which should not block a correctness fix for public, non-SVE use cases.	2020-09-01 19:29:03 +00:00
Alina Sbirlea	deb30c80fe	[MemorySSA] Update phi map with replacement value.	2020-09-01 11:56:40 -07:00
Hans Wennborg	df821d52a8	First commit on the release/11.x branch.	2020-09-01 11:44:02 -07:00
LLVM GN Syncbot	cbf0672b28	[gn build] Port 3e1e5f54492	2020-09-01 18:30:13 +00:00
LLVM GN Syncbot	8a4a1f29c0	[gn build] Port 3d90a61cf2e	2020-09-01 18:30:13 +00:00
Nico Weber	5e12eaec49	[gn build] port 5ffd940ac02 a bit more	2020-09-01 14:30:01 -04:00
Sean Fertile	42dd72a403	[PowerPC][AIX] Update save/restore offset for frame and base pointers. General purpose registers 30 and 31 are handled differently when they are reserved as the base-pointer and frame-pointer respectively. This fixes the offset of their fixed-stack objects when there are fpr callee-saved registers. Differential Revision: https://reviews.llvm.org/D85850	2020-09-01 14:13:05 -04:00
Craig Topper	d9375ed652	[Bitstream] Use alignTo to make code more readable. NFC I was recently debugging a similar issue to https://reviews.llvm.org/D86500 only with a large metadata section. Only after I finished debugging it did I discover it was fixed very recently. My version of the fix was going to use alignTo since that uses uint64_t and improves the readability of the code. So I thought I would go ahead and share it. Differential Revision: https://reviews.llvm.org/D86957	2020-09-01 11:06:45 -07:00
Amara Emerson	b1b6d87965	[AArch64][GlobalISel] Optimize away a Not feeding a brcond by using tbz instead of tbnz. Usually brconds are fed by compares, but not always, in which case we would miss this fold. Differential Revision: https://reviews.llvm.org/D86413	2020-09-01 11:06:06 -07:00
Amara Emerson	399486642d	[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _) This is needed for an upcoming change to how we translate conditional branches which might generate these. Differential Revision: https://reviews.llvm.org/D86383	2020-09-01 10:57:17 -07:00
Eric Astor	6aff3ef558	x87 FPU state instructions do not use an f32 memory location These instructions actually use a 512-byte location, where bytes 464-511 are ignored. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86942	2020-09-01 13:50:07 -04:00
Matt Arsenault	5e713f621a	GlobalISel: Implement computeNumSignBits for G_SELECT	2020-09-01 12:50:19 -04:00
Matt Arsenault	731e73be24	GlobalISel: Port smarter known bits for umin/umax from DAG	2020-09-01 12:50:15 -04:00
Matt Arsenault	942cf2892e	GlobalISel: Implement computeKnownBits for G_BSWAP and G_BITREVERSE	2020-09-01 12:49:57 -04:00
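
The [GlobalISel] entry above folds xor(cmp(pred, _, _), 1) into cmp(inverse(pred), _, _). A minimal sketch of why that fold is valid, assuming a small made-up predicate set for signed integers; `Pred`, `inversePred`, `evalCmp` and `foldHolds` are hypothetical names for illustration and are not LLVM's API:

```cpp
#include <cassert>

// Hypothetical predicate set for signed integer comparisons (not LLVM's enum).
enum class Pred { EQ, NE, LT, GE, GT, LE };

// Returns the predicate that is true exactly when P is false.
constexpr Pred inversePred(Pred P) {
  switch (P) {
  case Pred::EQ: return Pred::NE;
  case Pred::NE: return Pred::EQ;
  case Pred::LT: return Pred::GE;
  case Pred::GE: return Pred::LT;
  case Pred::GT: return Pred::LE;
  case Pred::LE: return Pred::GT;
  }
  return P; // not reached for valid enum values
}

// Evaluates the comparison "A pred B".
constexpr bool evalCmp(Pred P, int A, int B) {
  switch (P) {
  case Pred::EQ: return A == B;
  case Pred::NE: return A != B;
  case Pred::LT: return A < B;
  case Pred::GE: return A >= B;
  case Pred::GT: return A > B;
  case Pred::LE: return A <= B;
  }
  return false; // not reached for valid enum values
}

// Checks that xor-ing the 1-bit compare result with 1 gives the same value
// as comparing with the inverse predicate.
constexpr bool foldHolds(Pred P, int A, int B) {
  return (evalCmp(P, A, B) ^ 1) == evalCmp(inversePred(P), A, B);
}
```

Calling `foldHolds` for each predicate over a few operand pairs (less than, equal, greater than) covers every case of this toy model; floating-point predicates with unordered operands would need their own inverse table.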
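
The [Bitstream] entry above mentions switching to alignTo, which works on uint64_t. As a rough illustration of the rounding such a helper performs, here is a stand-alone sketch; `alignToSketch` is a hypothetical name written for this note and is not copied from LLVM's sources:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an alignTo-style helper (not the LLVM source).
// Rounds Value up to the next multiple of Align; Align must be non-zero.
// Doing the arithmetic in uint64_t keeps offsets above 4 GiB intact.
constexpr uint64_t alignToSketch(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}
```

For example, `alignToSketch(5, 4)` rounds up to 8, while a value that is already a multiple of the alignment is returned unchanged.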

202877 Commits