archived-llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2026-01-31 01:35:20 +01:00

Author	SHA1	Message	Date
Djordje Todorovic	c793732c01	[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This new MIR pass removes redundant DBG_VALUEs. After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) variables assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having redundant DBG_VALUE. This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with the respect to that. This first patch performs bacward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one: For example: (1) DBG_VALUE $edi, !"var1", ... (2) DBG_VALUE $esi, !"var2", ... (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (1). By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed. Differential Revision: https://reviews.llvm.org/D105279	2021-07-14 04:29:42 -07:00
Matt Arsenault	74be1319be	RegAlloc: Allow targets to split register allocation AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will be needed to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage of this is currently StackSlotColoring is no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQB is not currently supported, so this also prevents using the unhandled allocator.	2021-07-13 18:49:29 -04:00
Stanislav Mekhanoshin	dc43bb3409	[AMDGPU] Disable garbage collection passes Differential Revision: https://reviews.llvm.org/D105593	2021-07-07 15:47:57 -07:00
Rong Xu	f505b894a2	[SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is mainly for ProfileData part of change. It will load FS Profile when such profile is detected. For an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. For other format profiles, the users need to use an internal option (-profile-isfs) to tell the compiler that the profile uses FS discriminators. This patch also simplified the bit API used by FS discriminators. Differential Revision: https://reviews.llvm.org/D103041	2021-06-02 10:32:52 -07:00
Arthur Eubanks	7178161bf7	Remove "Rewrite Symbols" from codegen pipeline It breaks up the function pass manager in the codegen pipeline. With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely not in use. Add a check that we only have one function pass manager in the codegen pipeline. Some tests relied on the fact that we had a module pass somewhere in the codegen pipeline. addr-label.ll crashes on ARM due to this change. This is because a ARMConstantPoolConstant containing a BasicBlock to represent a blockaddress may hold an invalid pointer to a BasicBlock if the blockaddress is invalidated by its BasicBlock getting removed. In that case all referencing blockaddresses are RAUW a constant int. Making ARMConstantPoolConstant::CVal a WeakVH fixes the crash, but I'm not sure that's the right fix. As a workaround, create a barrier right before ISel so that IR optimizations can't happen while a ARMConstantPoolConstant has been created. Reviewed By: rnk, MaskRay, compnerd Differential Revision: https://reviews.llvm.org/D99707	2021-05-31 08:32:36 -07:00
Rong Xu	4e30d875bf	[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This patch implements first part of Flow Sensitive SampleFDO (FSAFDO). It has the following changes: (1) disable current discriminator encoding scheme, (2) new hierarchical discriminator for FSAFDO. For this patch, option "-enable-fs-discriminator=true" turns on the new functionality. Option "-enable-fs-discriminator=false" (the default) keeps the current SampleFDO behavior. When the fs-discriminator is enabled, we insert a flag variable, namely, llvm_fs_discriminator, to the object. This symbol will checked by create_llvm_prof tool, and used to generate a profile with FS-AFDO discriminators enabled. If this happens, for an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. Differential Revision: https://reviews.llvm.org/D102246	2021-05-18 16:23:43 -07:00
Xiang1 Zhang	38ab3a2d9f	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-05-08 14:21:11 +08:00
Xiang1 Zhang	f668ad4ced	Revert "[X86] Support AMX fast register allocation" This reverts commit 77e2e5e07d01fe0b83c39d0c527c0d3d2e659146.	2021-05-08 13:43:32 +08:00
Xiang1 Zhang	fe856bad78	[X86] Support AMX fast register allocation	2021-05-08 13:27:21 +08:00
Simon Moll	b4f3331ca0	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert 02c5ba8679873e878ae7a76fb26808a47940275b Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULE=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
Adrian Prantl	32b1ee25e7	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit 43bc584dc05e24c6d44ece8e07d4bff585adaf6d. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Simon Moll	63ed8031a5	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Benjamin Kramer	0c09b6edfc	Revert "[X86] Support AMX fast register allocation" This reverts commit 3b8ec86fd576b9808dc63da620d9a4f7bbe04372. Revert "[X86] Refine AMX fast register allocation" This reverts commit c3f95e9197643b699b891ca416ce7d72cf89f5fc. This pass breaks using LLVM in a multi-threaded environment by introducing global state.	2021-04-29 18:56:33 +02:00
Xiang1 Zhang	6da00a5d84	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-04-25 09:45:41 +08:00
Arthur Eubanks	3d567a0891	Revert "Remove "Rewrite Symbols" from codegen pipeline" This reverts commit 6210261ecb21c84c9a440a76c0ccbc8ad211bed3. addr-label.ll crashes on armv7.	2021-04-10 23:28:16 -07:00
Arthur Eubanks	ce364576b2	Remove "Rewrite Symbols" from codegen pipeline It breaks up the function pass manager in the codegen pipeline. With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely not in use. Add a check that we only have one function pass manager in the codegen pipeline. This required reverting commit 9583a3f2625818b78c0cf6d473cdedb9f23ad82c: "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()". This was not NFC as initially thought. By coalescing two function psas managers, this exposed the reverted code as necessary. addr-label.ll was crashing due to an emitted blockaddress's block being removed but the label not emitted. Some tests relied on the fact that we had a module pass somewhere in the codegen pipeline. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99707	2021-04-10 22:38:44 -07:00
Arthur Eubanks	9edc5021eb	Move EntryExitInstrumentation pass location This seems to be more of a Clang thing rather than a generic LLVM thing, so this moves it out of LLVM pipelines and as Clang extension hooks into LLVM pipelines. Move the post-inline EEInstrumentation out of the backend pipeline and into a late pass, similar to other sanitizer passes. It doesn't fit into the codegen pipeline. Also fix up EntryExitInstrumentation not running at -O0 under the new PM. PR49143 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D97608	2021-03-01 10:08:10 -08:00
Lukas Sommer	ca692e8e27	[CodeGen] New pass: Replace vector intrinsics with call to vector library This patch adds a pass to replace calls to vector intrinsics (i.e., LLVM intrinsics operating on vector operands) with calls to a vector library. Currently, calls to LLVM intrinsics are only replaced with calls to vector libraries when scalar calls to intrinsics are vectorized by the Loop- or SLP-Vectorizer. With this pass, it is now possible to replace calls to LLVM intrinsics already operating on vector operands, e.g., if such code was generated by MLIR. For the replacement, information from the TargetLibraryInfo, e.g., as specified via -vector-library is used. This is a re-try of the original commit 2303e93e66 that was reverted due to pass manager problems. Other minor changes have also been made. Differential Revision: https://reviews.llvm.org/D95373	2021-02-12 12:53:27 -05:00
Snehasish Kumar	7744abeec1	[CodeGen] Basic block sections should take precendence over splitting. The use of basic block sections should take precedence over the machine function splitting pass. Since they use the same underlying mechanism they are kept exclusive. Updated the tests to check that split machine functions is overridden by all flavours of basic block sections. Differential Revision: https://reviews.llvm.org/D96392	2021-02-11 11:14:10 -08:00
Sanjay Patel	5fe7f130ea	Revert "[Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls to vector library" This reverts commit 2303e93e666e13ebf6d24323729c28f520ecca37. Investigating bot failures.	2021-02-05 15:10:11 -05:00
Lukas Sommer	c379590588	[Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls to vector library This patch adds a pass to replace calls to vector intrinsics (i.e., LLVM intrinsics operating on vector operands) with calls to a vector library. Currently, calls to LLVM intrinsics are only replaced with calls to vector libraries when scalar calls to intrinsics are vectorized by the Loop- or SLP-Vectorizer. With this pass, it is now possible to replace calls to LLVM intrinsics already operating on vector operands, e.g., if such code was generated by MLIR. For the replacement, information from the TargetLibraryInfo, e.g., as specified via -vector-library is used. Differential Revision: https://reviews.llvm.org/D95373	2021-02-05 14:25:19 -05:00
Matt Arsenault	a466bf7dc3	CodeGen: Refactor regallocator command line and target selection Make the sequence of passes to select and rewrite instructions to physical registers be a target callback. This is to prepare to allow targets to split register allocation into multiple phases.	2021-01-07 13:13:25 -05:00
Yuanfang Chen	09aff20364	Reland "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" (again) This reverts commit 16c8f6e91344ec9840d6aa9ec6b8d0c87a104ca3 with fix. -Wswitch catched an unhandled enum value due to recent commits in TargetPassConfig.cpp.	2020-12-29 16:39:55 -08:00
Yuanfang Chen	222d17450a	Revert "Reland "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline"" This reverts commit 21314940c4856e0cb81b664fd2d2117d1b7dc3e3. Build failure in some bots.	2020-12-29 16:29:07 -08:00
Yuanfang Chen	015989c61b	Reland "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit 94427af60c66ffea655a3084825c6c3a9deec1ad (relands 4646de5d75cfce3da4ddeffb6eb8e66e38238800 with fix). Use "return std::move(AsmStreamer);" instead of "return AsmStreamer;" in LVMTargetMachine::createMCStreamer. Unlike Clang, GCC seems having trouble inserting a implicit lvalue->rvalue conversion.	2020-12-29 15:17:23 -08:00
Yuanfang Chen	abf7733625	Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit 4646de5d75cfce3da4ddeffb6eb8e66e38238800. Some bots have build failure.	2020-12-28 17:44:22 -08:00
Yuanfang Chen	90a7aa8203	[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline Following up on D67687. Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html `CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences. - Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`) - `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design. - `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future. Reviewed By: arsenm, aeubanks Differential Revision: https://reviews.llvm.org/D83608	2020-12-28 17:36:36 -08:00
Xiang1 Zhang	7ce08bafea	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, currently, LLVM have debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, But there is no equivalent MIR-level check-debugify pass, So now we support it at "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D91595	2020-12-16 22:17:25 -08:00
Xiang1 Zhang	085e55a045	Revert "[Debugify] Support checking Machine IR debug info" This reverts commit 50aaa8c274910d78d7bf6c929a34fe58b1f45579.	2020-12-16 20:12:33 -08:00
Xiang1 Zhang	d9173ede20	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, currently, LLVM have debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, But there is no equivalent MIR-level check-debugify pass, So now we support it at "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D91595	2020-12-16 18:04:05 -08:00
Nico Weber	f3e89af5f2	Revert "[Debugify] Support checking Machine IR debug info" This reverts commit c4d2d4337d50bed3cafd564daece1a197005b22b. Necessary to revert 2a5675f11d3bc803a245c0e.	2020-12-14 22:14:48 -05:00
Xiang1 Zhang	6d8bb495f3	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, currently, LLVM have debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, But there is no equivalent MIR-level check-debugify pass, So now we support it at "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D91595	2020-12-14 17:53:46 -08:00
Xiang1 Zhang	4721afcdaa	Revert "[Debugify] Support checking Machine IR debug info" This reverts commit 57a3d9ec4a8c1422f07264bed9f12a4ea416707e.	2020-12-14 17:48:49 -08:00
Xiang1 Zhang	437fc18cdb	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, currently, LLVM have debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, But there is no equivalent MIR-level check-debugify pass, So now we support it at "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D95195	2020-12-14 17:38:01 -08:00
Anna Thomas	ea0d3b5f95	[ScalarizeMaskedMemIntrin] Add new PM support This patch adds new PM support for the pass and the pass can be now used during middle-end transforms. The old pass is remamed to ScalarizeMaskedMemIntrinLegacyPass. Reviewed-By: skatkov, aeubanks Differential Revision: https://reviews.llvm.org/D92743	2020-12-08 17:15:22 -05:00
Hongtao Yu	4cefe8e200	[CSSPGO] Pseudo probes for function calls. An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work. One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place , leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution. With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a calliste probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst. To avoid collision with the baseline AutoFDO in various places that handles dwarf discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reversed for pseudo probe use. Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`. Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91756	2020-12-02 13:45:20 -08:00
jasonliu	60c6f78bef	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX do not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality rountine, because the interoperability is not there yet on AIX. 3. AIX do not use eh_frame sections. Instead, it would use a eh_info section (compat unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
Amara Emerson	a4ad952c6e	[GlobalISel] Add translation support for vector reduction intrinsics. In order to prevent the ExpandReductions pass from expanding some intrinsics before they get to codegen, I had to add a -disable-expand-reductions flag for testing purposes. Differential Revision: https://reviews.llvm.org/D89028	2020-10-16 10:17:53 -07:00
Simon Pilgrim	1f062b848e	TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 17:17:11 +01:00
Yuanfang Chen	8df10aca8c	Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit 31ecf8d29d81d196374a562c6d2bd2c25a62861e. This reverts commit 3fdaa8602a086a3fca5f0fc8527536ac659079d0. There is laying violation for Target->CodeGen.	2020-09-11 18:52:32 -07:00
Yuanfang Chen	cfd0162bc3	[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline Following up on D67687. Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html `CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences. - Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`) - `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design. - `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D83608	2020-09-11 16:41:17 -07:00
Snehasish Kumar	69b963173e	[llvm][CodeGen] Machine Function Splitter We introduce a codegen optimization pass which splits functions into hot and cold parts. This pass leverages the basic block sections feature recently introduced in LLVM from the Propeller project. The pass targets functions with profile coverage, identifies cold blocks and moves them to a separate section. The linker groups all cold blocks across functions together, decreasing fragmentation and improving icache and itlb utilization. We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017. For clang bootstrap we observe a mean 2.33% runtime improvement with a ~32% reduction in itlb and stlb misses. Additionally, L1 icache misses reduced by 9.5% while L2 instruction misses reduced by 20%. For SPECInt we report the change in IntRate the C/C++ benchmarks. All benchmarks apart from mcf and x264 improve, on average by 0.6% with the max for deepsjeng at 1.6%. Benchmark % Change 500.perlbench_r 0.78 502.gcc_r 0.82 505.mcf_r -0.30 520.omnetpp_r 0.18 523.xalancbmk_r 0.37 525.x264_r -0.46 531.deepsjeng_r 1.61 541.leela_r 0.83 557.xz_r 0.15 Differential Revision: https://reviews.llvm.org/D85368	2020-08-28 11:10:14 -07:00
Snehasish Kumar	bfbcf062be	[NFC] Rename BBSectionsPrepare -> BasicBlockSections. Rename the BBSectionsPrepare pass as suggested by the review comment in https://reviews.llvm.org/D85368. Differential Revision: https://reviews.llvm.org/D85380	2020-08-06 13:12:06 -07:00
Evgeny Leviant	355534b549	[CodeGen][TargetPassConfig] Add unreachable-mbb-elimination pass explicitly Differential revision: https://reviews.llvm.org/D84228	2020-07-23 18:05:11 +03:00
Yuanfang Chen	bf8086d1c1	[llc] (almost) remove `--print-machineinstrs` Its effect could be achieved by `-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to print MIR after ISel which could not be done with `-print-after`/`-stop-after` since isel pass does not have commandline name. That's the reason `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to new pass manager since isel pass would have a commandline text name to use `print-after` or equivalent switches. The motivation of this patch is to reduce tests dependency on would-be-deprecated feature. Reviewed By: arsenm, dsanders Differential Revision: https://reviews.llvm.org/D83275	2020-07-20 10:43:28 -07:00
Evgeny Leviant	3d264ff874	[CodeGen][TargetPassConfig] Add TargetTransformInfo pass correctly Patch adds tti pass directly enforcing its execution with correctly set TargetTransformInfo. Differential revision: https://reviews.llvm.org/D84047	2020-07-18 14:11:40 +03:00
Yuanfang Chen	cc889d9127	[NFC] change getLimitedCodeGenPipelineReason to static function	2020-07-06 15:39:27 -07:00
Juneyoung Lee	1b21295ff1	[TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR Summary: This patch adds CanonicalizeFreezeInLoops before LSR. Relevant patch: https://reviews.llvm.org/D77523 Reviewers: spatel, efriedma, jdoerfert, fhahn, nikic, reames, xbolva00 Reviewed By: nikic Subscribers: xbolva00, nikic, lebedev.ri, hiraditya, llvm-commits, sanwou01, nlopes Tags: #llvm Differential Revision: https://reviews.llvm.org/D77524	2020-05-28 05:21:12 +09:00
Nikita Popov	9ec2cd3569	[DwarfEHPrepare] Don't prune unreachable resumes at optnone Disable pruning of unreachable resumes in the DwarfEHPrepare pass at optnone. While I expect the pruning itself to be essentially free, this does require a dominator tree calculation, that is not used for anything else. Saving this DT construction makes for a 0.4% O0 compile-time improvement. Differential Revision: https://reviews.llvm.org/D80400	2020-05-23 20:58:01 +02:00
Nikita Popov	be31ff3754	[TargetPassConfig] Don't add alias analysis at optnone When performing codegen at optnone, don't add alias analysis to the pipeline. We don't need it, but it causes an unnecessary dominator tree calculation. I've also moved the module verifier call to the top so that a bunch of disabled-at-optnone passes group more nicely. Differential Revision: https://reviews.llvm.org/D80378	2020-05-23 10:35:03 +02:00

1 2 3 4

187 Commits