archived-llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2026-01-31 01:35:20 +01:00

Author	SHA1	Message	Date
Petar Avramovic	3c5ea6a039	[MIPatternMatch]: Add matchers for binary instructions Add matchers that support commutative and non-commutative binary opcodes. Differential Revision: https://reviews.llvm.org/D99736	2021-04-27 11:37:42 +02:00
Petar Avramovic	692456b891	[MIPatternMatch]: Add mi_match for MachineInstr This utility allows more efficient start of pattern match. Often MachineInstr(MI) is available and instead of using mi_match(MI.getOperand(0).getReg(), MRI, ...) followed by MRI.getVRegDef(Reg) that gives back MI we now use mi_match(MI, MRI, ...). Differential Revision: https://reviews.llvm.org/D99735	2021-04-27 11:08:16 +02:00
Petar Avramovic	00669818e5	[MIPatternMatch]: Add ICstRegMatch Matches G_CONSTANT and returns its def register. Differential Revision: https://reviews.llvm.org/D99734	2021-04-27 10:53:17 +02:00
Petar Avramovic	94c4ae8b43	[GlobalISel]: Add a getConstantIntVRegVal utility Returns ConstantInt from G_CONSTANT instruction given its def register. Differential Revision: https://reviews.llvm.org/D99733	2021-04-27 10:52:07 +02:00
dfukalov	584872cffa	[TTI] NFC: Change getScalarizationOverhead and getOperandsScalarizationOverhead to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101283	2021-04-27 08:51:48 +03:00
Lang Hames	53ce1eab3b	[ORC] Make LLVMOrcLLJITBuilderSetJITTargetMachineBuilder consume as advertised. This should fix some of the memory leaks seen in the ORC C API test case.	2021-04-26 22:26:38 -07:00
Chen Zheng	611f6b8043	[XCOFF] make .file directive have directory info The .file directive is changed to only have basename in D36018 for ELF. But on AIX, we require the .file directive to also contain the directory info. This aligns with other AIX compiler like XLC and is required by some AIX tool like DBX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D99785	2021-04-27 00:15:23 -04:00
Lang Hames	895d41ab8e	Reapply "[ORC] Add unit tests for parts of the ..." with fixes and improvements. This reapplies 8740360093b, which was reverted in bbddadd46e4 due to buildbot errors. This version checks that a JIT instance can be safely constructed, skipping tests if it can not be. To enable this it introduces new C API to retrieve and set the target triple for a JITTargetMachineBuilder.	2021-04-26 20:44:40 -07:00
William S. Moses	e7084f2810	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 20:12:12 -04:00
Fangrui Song	d0dc7f2344	[ADT] Remove StatisticBase and make NoopStatistic empty In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 mostly unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text). ``` # This means the D69428 optimization on clang is mostly nullified by this patch. HEAD+D69428: size(.bss) = 0x0725a8 HEAD+D101211: size(.bss) = 0x072238 # bloaty - HEAD+D69428 vs HEAD+D101211 # With D101211, we also save a lot of string table space (.rodata). FILE SIZE VM SIZE -------------- -------------- -0.0% -32 -0.0% -24 .eh_frame -0.0% -336 [ = ] 0 .symtab -0.0% -360 [ = ] 0 .strtab [ = ] 0 -0.2% -880 .bss -0.0% -2.11Ki -0.0% -2.11Ki .rodata -0.0% -2.89Ki -0.0% -2.89Ki .text -0.0% -5.71Ki -0.0% -5.88Ki TOTAL ``` Note: LoopFuse is a disabled pass. For now this patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful in LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with `llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings. Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which calls `getName`/`getDesc`/`getValue`. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211	2021-04-26 16:47:32 -07:00
William S. Moses	5b35b95712	Revert "[NVPTX] Enable lowering of atomics on local memory" This reverts commit fede99d386ec9e7bab2762aa16cb10c0513ae464.	2021-04-26 19:33:01 -04:00
William S. Moses	9ac62ee58d	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 19:27:27 -04:00
Lei Zhang	6f2c652025	Revert "[ADT] Remove StatisticBase and make NoopStatistic empty" This reverts commit b5403117814a7c39b944839e10492493f2ceb4ac because it breaks MLIR build: https://buildkite.com/mlir/mlir-core/builds/13299#ad0f8901-dfa4-43cf-81b8-7940e2c6c15b	2021-04-26 18:31:04 -04:00
Fangrui Song	4b2b6e81e1	Revert D76519 "[NFC] Refactor how CFI section types are represented in AsmPrinter" This reverts commit 0ce723cb228bc1d1a0f5718f3862fb836145a333. D76519 was not quite NFC. If we see a CFISection::Debug function before a CFISection::EH one (-fexceptions -fno-asynchronous-unwind-tables), we may incorrectly pick CFISection::Debug and emit a `.cfi_sections .debug_frame`. We should use .eh_frame instead. This scenario is untested.	2021-04-26 15:17:28 -07:00
Lang Hames	0a240e5b31	[ORC] C API updates. Adds support for creating custom MaterializationUnits in the C API with the new LLVMOrcCreateCustomMaterializationUnit function. Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any clients of the LLVMOrcAbsoluteSymbols API. Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to allow clients to get a reference to an LLJIT instance's linking layer, then emit an object file using it. This can be used to support construction of custom materialization units in the common case where those units will generate an object file that needs to be emitted to complete the materialization.	2021-04-26 13:58:37 -07:00
Lang Hames	20ebd8978c	[ORC] Fix type name. Rename JITTargetSymbolFlags to JITSymbolTargetFlags. This matches the convention used for JITSymbolGenericFlags.	2021-04-26 13:58:37 -07:00
Fangrui Song	7fa56f879f	[ADT] Remove StatisticBase and make NoopStatistic empty In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text). ``` # This means the D69428 optimization on clang is mostly nullified by this patch. HEAD+D69428: size(.bss) = 0x0725a8 HEAD+D101211: size(.bss) = 0x072238 # bloaty - HEAD+D69428 vs HEAD+D101211 # With D101211, we also save a lot of string table space (.rodata). FILE SIZE VM SIZE -------------- -------------- -0.0% -32 -0.0% -24 .eh_frame -0.0% -336 [ = ] 0 .symtab -0.0% -360 [ = ] 0 .strtab [ = ] 0 -0.2% -880 .bss -0.0% -2.11Ki -0.0% -2.11Ki .rodata -0.0% -2.89Ki -0.0% -2.89Ki .text -0.0% -5.71Ki -0.0% -5.88Ki TOTAL ``` Note: LoopFuse is a disabled pass. This patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful and not noisy, we can replace `llvm::Statistic` with `llvm::TrackingStatistic` in the future. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211	2021-04-26 13:39:35 -07:00
William S. Moses	98f63bbad6	[Lexer] Allow LLLexer to be used as an API Explose LLVM Lexer for usage externally as an API Differential Revision: https://reviews.llvm.org/D100920	2021-04-26 12:43:14 -04:00
Paul C. Anagnostopoulos	6409c51e4c	[TableGen] Change assertion information from a tuple to a struct [NFC] Differential Revision: https://reviews.llvm.org/D100854	2021-04-26 09:57:16 -04:00
Tim Renouf	79e5f97ca7	[MC][AMDGPU][llvm-objdump] Synthesized local labels in disassembly 1. Add an accessor function to MCSymbolizer to retrieve addresses referenced by a symbolizable operand, but not resolved to a symbol. That way, the caller can synthesize labels at those addresses and then retry disassembling the section. 2. Implement that in AMDGPU -- a failed symbol lookup results in the address being added to a vector returned by the new function. 3. Use that in llvm-objdump when using MCSymbolizer (which only happens on AMDGPU) and SymbolizeOperands is on. Differential Revision: https://reviews.llvm.org/D101145 Change-Id: I19087c3bbfece64bad5a56ee88bcc9110d83989e	2021-04-26 13:56:36 +01:00
David Sherwood	6475ab5a00	[AArch64] Add AArch64TTIImpl::getMaskedMemoryOpCost function When vectorising for AArch64 targets if you specify the SVE attribute we automatically then treat masked loads and stores as legal. Also, since we have no cost model for masked memory ops we believe it's cheap to use the masked load/store intrinsics even for fixed width vectors. This can lead to poor code quality as the intrinsics will currently be scalarised in the backend. This patch adds a basic cost model that marks fixed-width masked memory ops as significantly more expensive than for scalable vectors. Tests for the cost model are added here: Transforms/LoopVectorize/AArch64/masked-op-cost.ll Differential Revision: https://reviews.llvm.org/D100745	2021-04-26 11:00:03 +01:00
Levy Hsu	b3953bd33d	[RISCV] [1/2] Add IR intrinsic for Zbe extension RV32/64: bcompress bdecompress RV64 ONLY: bcompressw bdecompressw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101143	2021-04-25 19:14:34 -07:00
Tomasz Miąsko	d94bcd2889	[demangler] Use standard semantics for StringView::substr The StringView::substr now accepts a substring starting position and its length instead of previous non-standard `from` & `to` positions. All uses of two argument StringView::substr are in MicrosoftDemangler and have 0 as a starting position, so no changes are necessary. This also fixes a bug where attempting to extract a suffix with substr (a `to` position equal to size) would return a substring without the last character. Fixing the issue should not introduce observable changes in the demangler, since as currently used, a second argument to StringView::substr is either: 1) a result of a successful call to StringView::find and so necessarily smaller than size., or 2) in the case of Demangler::demangleCharLiteral potentially equal to size, but with demangler expecting more data to follow later on and failing either way. Reviewed By: #libc_abi, ldionne, erik.pilkington Differential Revision: https://reviews.llvm.org/D100246	2021-04-25 13:56:41 +02:00
Xiang1 Zhang	6da00a5d84	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-04-25 09:45:41 +08:00
Lang Hames	8daeb57ed9	[ORC][C-bindings] Fix missing ')' in comments.	2021-04-24 18:04:57 -07:00
Nikita Popov	da6fbe8578	[PatternMatch] Improve m_Deferred() documentation (NFC) m_Deferred() has nothing to do with commutative matchers, it needs to be used whenever the value to match is determinde as part of the same match expression.	2021-04-24 21:00:24 +02:00
RamNalamothu	065e9bee9e	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-24 23:29:42 +05:30
dfukalov	4398288fa7	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-04-24 14:14:20 +03:00
Teresa Johnson	0f0622320f	Mark type test intrinsics as speculatable to fix inline cost There is already code in InlineCost.cpp to identify and ignore ephemeral values (llvm.assume intrinsics and other side-effect free instructions only feeding the assumes). However, because llvm.type.test intrinsics were not marked speculatable, they and any instructions specifically feeding the type test (typically a bitcast) were being counted towards the instruction cost when inlining. This was causing profile matching issues in some cases when enabling -fwhole-program-vtables for whole program devirtualization. According to the language reference, the speculatable attribute means: "the function does not have any effects besides calculating its result and does not have undefined behavior". I see no reason why type tests cannot be marked with this attribute. There are 2 test changes: llvm/test/Transforms/Inline/ephemeral.ll: I added a type test intrinsic here to verify the fix. Also, I found the test was not actually testing what it originally intended. Many of the existing instructions were optimized away by -Oz, and the cost of inlining was negative due to the benefit of removing the call. So I changed the test to simply invoke the inline pass and check the number of instructions computed by InlineCost. I also fixed an instruction that was not actually used anywhere. llvm/test/Transforms/SimplifyCFG/no-md-sink.ll needed to be made more robust to code changes that reordered the metadata. Differential Revision: https://reviews.llvm.org/D101180	2021-04-23 10:02:31 -07:00
Nemanja Ivanovic	572869bea6	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Sander de Smalen	b71b6a828f	[TTI] NFC: Change getIntImmCost[Inst\|Intrin] to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100565	2021-04-23 16:06:36 +01:00
Sander de Smalen	b447679db3	[TTI] NFC: Change getScalingFactorCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100564	2021-04-23 16:06:36 +01:00
Sander de Smalen	a6cde052ae	[TTI] NFC: Change getMemcpyCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100563	2021-04-23 16:06:35 +01:00
Sander de Smalen	9eda3a0c2e	[TTI] NFC: Change getGEPCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100562	2021-04-23 16:06:35 +01:00
Sander de Smalen	9390d5a48f	[TTI] NFC: Change getAddressComputationCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100561	2021-04-23 16:06:35 +01:00
dfukalov	9779f666df	[TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D101151	2021-04-23 18:02:00 +03:00
Daniil Fukalov	3d4fc1f0f6	[TTI] Fix ScalarizationCost initialization. In cases when ScalarizationCostPassed has no value, UINT_MAX is actually used for cost estimation in `return ScalarCalls * ScalarCost + ScalarizationCost`. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101099	2021-04-23 17:59:59 +03:00
Stephen Tozer	8b8275b1fc	Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" Previous build failures were caused by an error in bitcode reading and writing for DIArgList metadata, which has been fixed in e5d844b587. There were also some unnecessary asserts that were being triggered on certain builds, which have been removed. This reverts commit dad5caa59e6b2bde8d6cf5b64a972c393c526c82.	2021-04-23 10:54:01 +01:00
Jay Foad	899f1c90ad	[GlobalISel] Remove ConstantFoldingMIRBuilder ConstantFoldingMIRBuilder was an experiment which is not used for anything. The constant folding functionality is now part of CSEMIRBuilder. Differential Revision: https://reviews.llvm.org/D101050	2021-04-23 09:13:27 +01:00
Fangrui Song	c83fe04e08	[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all} The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it. Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan) Also document the function attribute "frame-pointer" which is long overdue. Differential Revision: https://reviews.llvm.org/D101016	2021-04-22 18:07:30 -07:00
Levy Hsu	04656c7e3e	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Craig Topper	ae70485f74	[RISCV] Add IR intrinsics for vmsge(u).vv/vx/vi. These instructions don't really exist, but we have ways we can emulate them. .vv will swap operands and use vmsle().vv. .vi will adjust the immediate and use .vmsgt(u).vi when possible. For .vx we need to use some of the multiple instruction sequences from the V extension spec. For unmasked vmsge(u).vx we use: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd For cases where mask and maskedoff are the same value then we have vmsge{u}.vx v0, va, x, v0.t which is the vd==v0 case that requires a temporary so we use: vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt For other masked cases we use this sequence: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0 We trust that register allocation will prevent vd in vmslt{u}.vx from being v0 since v0 is still needed by the vmxor. Differential Revision: https://reviews.llvm.org/D100925	2021-04-22 10:44:38 -07:00
Fangrui Song	dcdc354ce1	Temporarily revert the code part of D100981 "Delete le32/le64 targets" This partially reverts commit 77ac823fd285973cfb3517932c09d82e6a32f46d. Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934). Temporarily brings back the code part to give them some time for migration.	2021-04-22 10:18:44 -07:00
Krzysztof Parzyszek	e644e52d90	[Hexagon] Add HVX intrinsics for conditional vector loads/stores Intrinsics for the following instructions are added. The intrinsic name is "int_hexagon_<inst>[_128B]", e.g. int_hexagon_V6_vL32b_pred_ai for 64-byte version int_hexagon_V6_vL32b_pred_ai_128B for 128-byte version V6_vL32b_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_nt_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vL32b_nt_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vS32b_pred_ai if (Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_pred_pi if (Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_pred_ppu if (Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32b_npred_ai if (!Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_npred_pi if (!Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_npred_ppu if (!Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32Ub_pred_ai if (Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_pred_pi if (Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_pred_ppu if (Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32Ub_npred_ai if (!Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_npred_pi if (!Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_npred_ppu if (!Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32b_nt_pred_ai if (Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_pred_pi if (Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_pred_ppu if (Pv4) vmem(Rx32++Mu2):nt = Vs32 V6_vS32b_nt_npred_ai if (!Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_npred_pi if (!Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_npred_ppu if (!Pv4) vmem(Rx32++Mu2):nt = Vs32	2021-04-22 11:49:29 -05:00
Irina Dobrescu	53c3a79b92	[flang][openmp] Add General Semantic Checks for Allocate Directive This patch adds semantic checks for the General Restrictions of the Allocate Directive. Since the requires directive is not yet implemented in Flang, the restriction: ``` allocate directives that appear in a target region must specify an allocator clause unless a requires directive with the dynamic_allocators clause is present in the same compilation unit ``` will need to be updated at a later time. A different patch will be made with the Fortran specific restrictions of this directive. I have used the code from https://reviews.llvm.org/D89395 for the CheckObjectListStructure function. Co-authored-by: Isaac Perry <isaac.perry@arm.com> Reviewed By: clementval, kiranchandramohan Differential Revision: https://reviews.llvm.org/D91159	2021-04-22 16:15:06 +00:00
Simon Pilgrim	6870b3d5f3	[LTO] Caching.h - remove unused <string> include. NFCI.	2021-04-22 14:07:12 +01:00
Jun Ma	684cf0c2d5	[DAGCombiner] Allow operand of step_vector to be negative. It is proper to relax non-negative limitation of step_vector. Also this patch adds more combines for step_vector: (sub X, step_vector(C)) -> (add X, step_vector(-C)) Differential Revision: https://reviews.llvm.org/D100812	2021-04-22 20:58:03 +08:00
Jay Foad	f7148011b7	Fix typo "beneficiates" in comments	2021-04-22 12:30:16 +01:00
Wenlei He	e734b4c21b	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles The change adds support for triming and merging cold context when mergine CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful. Differential Revision: https://reviews.llvm.org/D100528	2021-04-22 00:42:37 -07:00
Max Kazantsev	9fbf6639f4	[GVN] Introduce loop load PRE This patch allows PRE of the following type of loads: ``` preheader: br label %loop loop: br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p br label %merge merge: ... br i1 ..., label %loop, label %exit ``` Into ``` preheader: %x0 = load %p br label %loop loop: %x.pre = phi(x0, x2) br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p %x1 = load %p br label %merge merge: x2 = phi(x.pre, x1) ... br i1 ..., label %loop, label %exit ``` So instead of loading from %p on every iteration, we load only when the actual clobber happens. The typical pattern which it is trying to address is: hot loop, with all code inlined and provably having no side effects, and some side-effecting calls on cold path. The worst overhead from it is, if we always take clobber block, we make 1 more load overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken at least once, the transform is neutral or profitable. There are several improvements prospect open up: - We can sometimes be smarter in loop-exiting blocks via split of critical edges; - If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that we don't know if their sum is colder than the header. Differential Revision: https://reviews.llvm.org/D99926 Reviewed By: reames	2021-04-22 12:50:38 +07:00

1 2 3 4 5 ...

44737 Commits