This adds a Remark class that allows us to share code when working with
remarks.
The C API has been updated to reflect this. Instead of the parser
generating C structs, it's now using a C++ object that is used through
opaque pointers in C. This gives us much more flexibility in what
changes we can make to the internal state of the object, and it interacts
much better with scenarios where the library is used through dlopen.
* C API updates:
  * move from C structs to opaque pointers and functions
  * the remark type is now an enum instead of a string
* unit tests updates:
  * use mostly the C++ API
  * keep one test for the C API
  * rename to YAMLRemarksParsingTest
* a typo was fixed: AnalysisFPCompute -> AnalysisFPCommute.
* a new error message was added: "expected a remark tag."
* llvm-opt-report has been updated to use the C++ parser instead of the
C API
Differential Revision: https://reviews.llvm.org/D59049
Original llvm-svn: 356491
llvm-svn: 356519
This adds a Remark class that allows us to share code when working with
remarks.
The C API has been updated to reflect this. Instead of the parser
generating C structs, it's now using a C++ object that is used through
opaque pointers in C. This gives us much more flexibility in what
changes we can make to the internal state of the object, and it interacts
much better with scenarios where the library is used through dlopen.
* C API updates:
  * move from C structs to opaque pointers and functions
  * the remark type is now an enum instead of a string
* unit tests updates:
  * use mostly the C++ API
  * keep one test for the C API
  * rename to YAMLRemarksParsingTest
* a typo was fixed: AnalysisFPCompute -> AnalysisFPCommute.
* a new error message was added: "expected a remark tag."
* llvm-opt-report has been updated to use the C++ parser instead of the
C API
Differential Revision: https://reviews.llvm.org/D59049
llvm-svn: 356491
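As an aside on the opaque-pointer C API design described in the two entries above, here is a minimal, hypothetical sketch of the pattern. The names below (RemarkParserRef, RemarkParserCreate, and so on) are made up for illustration and are not the actual LLVM-C remarks API; the point is only that clients hold a pointer to an undefined struct and call plain functions, so the C++ state behind it can change freely.
```
// Hypothetical sketch of the opaque-pointer pattern; these names are made up
// and are not the actual LLVM-C remarks API.
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// C-visible handle: clients never see the layout of RemarkParserImpl, so the
// internal state can change without breaking the ABI, and dlopen users only
// ever deal with a pointer plus plain functions.
typedef struct RemarkParserImpl *RemarkParserRef;

// The real state lives in a C++ object behind the opaque pointer.
struct RemarkParserImpl {
  std::vector<std::string> PassNames;
  size_t Next = 0;
};

extern "C" RemarkParserRef RemarkParserCreate() {
  auto *Parser = new RemarkParserImpl();
  // A real parser would consume a YAML buffer here; this sketch fakes it.
  Parser->PassNames = {"inline", "licm"};
  return Parser;
}

// Returns the next pass name, or nullptr once the parser is exhausted.
extern "C" const char *RemarkParserGetNextPassName(RemarkParserRef Parser) {
  if (Parser->Next >= Parser->PassNames.size())
    return nullptr;
  return Parser->PassNames[Parser->Next++].c_str();
}

extern "C" void RemarkParserDispose(RemarkParserRef Parser) { delete Parser; }

int main() {
  RemarkParserRef Parser = RemarkParserCreate();
  while (const char *Pass = RemarkParserGetNextPassName(Parser))
    std::printf("%s\n", Pass);
  RemarkParserDispose(Parser);
  return 0;
}
```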
Summary:
GNU ar supports the 'N' count modifier for the extract (x) and delete (d) operations. When an archive contains multiple members with the same name, this can be used to extract (or delete) them individually. For example:
```
$ llvm-ar t archive.a
foo
foo
$ llvm-ar x archive.a
-> Writes foo twice, overwriting it the second time :( :(
$ llvm-ar xN 1 archive.a foo && mv foo foo.1
$ llvm-ar xN 2 archive.a foo && mv foo foo.2
-> Writes foo twice, renaming it in between invocations to preserve all versions
```
Reviewers: ruiu, MaskRay
Reviewed By: ruiu, MaskRay
Subscribers: jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59503
llvm-svn: 356466
As discovered in D56774, the command line gets too long, so use a response file
to give the script the libs. This change has been tested and is confirmed
working for me.
Committed on behalf of Jakob Bornecrantz.
Differential Revision: https://reviews.llvm.org/D56781
llvm-svn: 356443
This change makes linking into .build-id atomic and safe to use.
Some users report that, under particular workflows and conditions, this
races more than half the time.
llvm-svn: 356404
This fixes https://bugs.llvm.org/show_bug.cgi?id=40980.
Previously, if string optimization occurred as a result of
StringTableBuilder's finalize() method, the size wasn't updated.
This hopefully also makes the interaction between sections during the
finalization process a bit clearer.
Differential revision: https://reviews.llvm.org/D59488
llvm-svn: 356371
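For context, a minimal sketch using LLVM's StringTableBuilder directly (not the llvm-objcopy code itself): finalize() may reorder and tail-merge strings, so the section size has to be taken after finalize(), not before.
```
// Sketch only: shows why a size captured before finalize() can be stale.
// Assumes LLVM headers and libraries are available.
#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  StringTableBuilder Builder(StringTableBuilder::ELF);
  Builder.add(".debug_info");
  Builder.add("info"); // May be tail-merged into ".debug_info".

  Builder.finalize(); // String optimization happens here.

  // sh_size must reflect the post-finalize() size, otherwise it can disagree
  // with the data that actually gets written out.
  size_t FinalSize = Builder.getSize();
  outs() << "final string table size: " << FinalSize << "\n";
  return 0;
}
```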
Results in much nicer -help output:
```
$ ./bin/llvm-exegesis -help
USAGE: llvm-exegesis [options]
OPTIONS:
Color Options:
-color - Use colors in output (default=autodetect)
General options:
-enable-cse-in-irtranslator - Should enable CSE in irtranslator
-enable-cse-in-legalizer - Should enable CSE in Legalizer
Generic Options:
-help - Display available options (-help-hidden for more)
-help-list - Display list of available options (-help-list-hidden for more)
-version - Display the version of this program
llvm-exegesis analysis options:
-analysis-clustering-epsilon=<number> - dbscan epsilon for benchmark point clustering
-analysis-clusters-output-file=<string> -
-analysis-display-unstable-clusters - if there is more than one benchmark for an opcode, said benchmarks may end up not being clustered into the same cluster if the measured performance characteristics are different. by default all such opcodes are filtered out. this flag will instead show only such unstable opcodes
-analysis-inconsistencies-output-file=<string> -
-analysis-inconsistency-epsilon=<number> - epsilon for detection of when the cluster is different from the LLVM schedule profile values
-analysis-numpoints=<uint> - minimum number of points in an analysis cluster
llvm-exegesis benchmark options:
-ignore-invalid-sched-class - ignore instructions that do not define a sched class
-mode=<value> - the mode to run
=latency - Instruction Latency
=inverse_throughput - Instruction Inverse Throughput
=uops - Uop Decomposition
=analysis - Analysis
-num-repetitions=<uint> - number of time to repeat the asm snippet
-opcode-index=<int> - opcode to measure, by index
-opcode-name=<string> - comma-separated list of opcodes to measure, by name
-snippets-file=<string> - code snippets to measure
llvm-exegesis options:
-benchmarks-file=<string> - File to read (analysis mode) or write (latency/uops/inverse_throughput modes) benchmark results. “-” uses stdin/stdout.
-mcpu=<string> - cpu name to use for pfm counters, leave empty to autodetect
```
llvm-svn: 356364
yaml2obj currently derives the p_filesz, p_memsz, and p_offset values of
program headers from their sections. This makes writing tests for
certain formats more complex, and sometimes impossible. This patch
allows setting these fields explicitly, overriding the default value,
when relevant.
Reviewed by: jakehehrlich, Higuoxing
Differential Revision: https://reviews.llvm.org/D59372
llvm-svn: 356247
For ELF, we accept but ignore --only-keep-debug. Do the same for llvm-strip.
COFF does implement this, so update the test to show that it is supported.
llvm-svn: 356207
Summary:
CoverageExporterJson::renderFiles accounts for most of the execution time given a large profdata file with multiple binaries.
The proposed solution is to generate JSON for each file in parallel and sort at the end to preserve deterministic output. Flags were also added to skip generating parts of the output in order to trim the output size.
Patch by Sajjad Mirza (@sajjadm).
Reviewers: Dor1s, vsk
Reviewed By: Dor1s, vsk
Subscribers: liaoyuke, mgrang, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59277
llvm-svn: 356178
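A minimal sketch of the idea above, with hypothetical names rather than the actual CoverageExporterJson code: render each file's JSON fragment in parallel, then sort before emitting so the output does not depend on task completion order.
```
// Hypothetical sketch: parallel per-file rendering plus a final sort for
// deterministic output. renderFileJson stands in for the real renderer.
#include <algorithm>
#include <future>
#include <iostream>
#include <string>
#include <vector>

// Stand-in for the expensive per-file JSON renderer.
static std::string renderFileJson(const std::string &Filename) {
  return "{\"filename\":\"" + Filename + "\"}";
}

int main() {
  std::vector<std::string> Files = {"b.cc", "a.cc", "c.cc"};

  // Kick off one rendering task per file.
  std::vector<std::future<std::string>> Tasks;
  for (const auto &F : Files)
    Tasks.push_back(std::async(std::launch::async, renderFileJson, F));

  std::vector<std::string> Rendered;
  for (auto &T : Tasks)
    Rendered.push_back(T.get());

  // Sort the fragments so the final output is deterministic regardless of
  // which task finished first.
  std::sort(Rendered.begin(), Rendered.end());

  std::cout << "[";
  for (size_t I = 0; I < Rendered.size(); ++I)
    std::cout << (I ? "," : "") << Rendered[I];
  std::cout << "]\n";
  return 0;
}
```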
This patch changes llvm-objcopy's behaviour to not strip sections that
are in segments, if they otherwise would be due to a stripping operation
(--strip-all, --strip-sections, --strip-non-alloc). This preserves the
segment contents. It does not change the behaviour of --strip-all-gnu
(although we could choose to do so), because GNU objcopy's behaviour in
this case seems to be to strip the section. Nor does it prevent removal
of sections in segments with --remove-section (if a user REALLY wants to
remove a section, we should probably let them, although I could be
persuaded that a warning might be appropriate). Tests have been added to
show this latter behaviour.
This fixes https://bugs.llvm.org/show_bug.cgi?id=41006.
Reviewed by: grimar, rupprecht, jakehehrlich
Differential Revision: https://reviews.llvm.org/D59293
This is a reland of r356129, attempting to fix greendragon failures
due to a suspected compatibility issue with od on the greendragon bots
versus other versions.
llvm-svn: 356136
This patch changes llvm-objcopy's behaviour to not strip sections that
are in segments, if they otherwise would be due to a stripping operation
(--strip-all, --strip-sections, --strip-non-alloc). This preserves the
segment contents. It does not change the behaviour of --strip-all-gnu
(although we could choose to do so), because GNU objcopy's behaviour in
this case seems to be to strip the section. Nor does it prevent removal
of sections in segments with --remove-section (if a user REALLY wants to
remove a section, we should probably let them, although I could be
persuaded that a warning might be appropriate). Tests have been added to
show this latter behaviour.
This fixes https://bugs.llvm.org/show_bug.cgi?id=41006.
Reviewed by: grimar, rupprecht, jakehehrlich
Differential Revision: https://reviews.llvm.org/D59293
llvm-svn: 356129
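A hypothetical sketch of the rule described in the two entries above (not the actual llvm-objcopy code): sections that live inside segments survive --strip-all/--strip-sections/--strip-non-alloc so segment contents are preserved, while an explicit --remove-section request still wins.
```
// Sketch only; Section, strippedByOperation, and shouldRemove are made-up
// stand-ins for the real llvm-objcopy logic.
#include <iostream>
#include <string>
#include <vector>

struct Section {
  std::string Name;
  bool InSegment; // True if some program header covers this section.
};

// Placeholder for "would this stripping operation normally drop the section?".
static bool strippedByOperation(const Section &) {
  return true; // e.g. --strip-sections drops everything by default.
}

static bool shouldRemove(const Section &Sec,
                         const std::vector<std::string> &ExplicitRemovals) {
  // Explicit --remove-section requests are always honoured.
  for (const std::string &Name : ExplicitRemovals)
    if (Name == Sec.Name)
      return true;
  // Otherwise keep sections that live in a segment, even if the stripping
  // operation would normally remove them.
  if (Sec.InSegment)
    return false;
  return strippedByOperation(Sec);
}

int main() {
  Section Text{".text", /*InSegment=*/true};
  Section Comment{".comment", /*InSegment=*/false};
  std::cout << shouldRemove(Text, {}) << " "      // 0: kept, it is in a segment.
            << shouldRemove(Comment, {}) << "\n"; // 1: stripped as usual.
  return 0;
}
```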
error() was previously cleaned up from CopyConfig, but new uses were introduced.
This also tweaks the error message for --add-symbol to report all invalid flags.
llvm-svn: 356105
Summary:
MsgPackDocument is the lighter-weight replacement for MsgPackTypes. This
commit switches AMDGPU HSA metadata processing to use MsgPackDocument
instead of MsgPackTypes.
Differential Revision: https://reviews.llvm.org/D57024
Change-Id: I0751668013abe8c87db01db1170831a76079b3a6
llvm-svn: 356081
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
Original llvm-svn: 355964
llvm-svn: 355984
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
llvm-svn: 355964
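A rough sketch of the filtering idea from the two entries above, using a hypothetical helper and std::regex rather than the actual remark-streaming code (which has its own regex handling): a remark is kept only if its pass name matches the configured filter, and everything is kept when no filter is given.
```
// Hypothetical illustration of pass-name filtering for remarks.
#include <iostream>
#include <optional>
#include <regex>
#include <string>

static bool shouldEmitRemark(const std::optional<std::regex> &PassFilter,
                             const std::string &PassName) {
  if (!PassFilter)
    return true; // No filter given: keep every remark, as before.
  return std::regex_search(PassName, *PassFilter);
}

int main() {
  // Roughly what -foptimization-record-passes=inline would set up.
  std::optional<std::regex> Filter{std::regex("inline")};
  std::cout << shouldEmitRemark(Filter, "inline") << "\n";         // 1
  std::cout << shouldEmitRemark(Filter, "loop-vectorize") << "\n"; // 0
  return 0;
}
```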
Prior to this change, the "Symbol" field of a relocation would always be
assumed to be a symbol name, and if no such symbol existed, the
relocation would reference index 0. This confused me when I tried to use
a literal symbol index in the field: since "0x1" was not a known symbol
name, the symbol index was set to 0. This change falls back to treating
unknown symbol names as integers, and emits an error if the name is not
found and the string is not an integer.
Note that the Symbol field is optional, so if a relocation doesn't
reference a symbol, it shouldn't be specified. The new error required a
number of test updates.
Reviewed by: grimar, ruiu
Differential Revision: https://reviews.llvm.org/D58510
llvm-svn: 355938
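A hypothetical sketch of the fallback described above (not the yaml2obj implementation): look the name up in the symbol table first, then try to parse it as a literal index, and only report an error when both fail.
```
// Sketch only; resolveSymbol and the symbol-table layout are made up.
#include <cstdint>
#include <iostream>
#include <map>
#include <optional>
#include <string>

static std::optional<uint32_t>
resolveSymbol(const std::map<std::string, uint32_t> &SymbolTable,
              const std::string &Field) {
  // 1. A known symbol name wins.
  auto It = SymbolTable.find(Field);
  if (It != SymbolTable.end())
    return It->second;
  // 2. Fall back to treating the string as a literal index (hex or decimal).
  try {
    size_t Pos = 0;
    unsigned long Value = std::stoul(Field, &Pos, /*base=*/0);
    if (Pos == Field.size())
      return static_cast<uint32_t>(Value);
  } catch (...) {
  }
  // 3. Neither a known name nor an integer: the caller reports an error.
  return std::nullopt;
}

int main() {
  std::map<std::string, uint32_t> Symbols = {{"foo", 3}};
  std::cout << resolveSymbol(Symbols, "foo").value_or(~0u) << "\n";  // 3
  std::cout << resolveSymbol(Symbols, "0x1").value_or(~0u) << "\n";  // 1
  std::cout << (resolveSymbol(Symbols, "bar") ? "found" : "error") << "\n";
  return 0;
}
```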
Summary:
Swift now generates PDBs for debugging on Windows. llvm and lldb
need a language enumerator value to properly handle the output
emitted by swiftc.
Subscribers: jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59231
llvm-svn: 355882
When --compress-debug-sections is given,
llvm-objcopy removes the uncompressed sections and adds the compressed ones to the section list.
This leaves all pointers to the old sections outdated.
The code already has logic for replacing the target sections of the relocation
sections, but we also have to update the relocations themselves.
This fixes https://bugs.llvm.org/show_bug.cgi?id=40885.
Differential revision: https://reviews.llvm.org/D58960
llvm-svn: 355821
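A generic sketch of the fix-up described above, with hypothetical types rather than the llvm-objcopy Object model: after swapping a section for its compressed replacement, every stored pointer to the old section has to be remapped, including the references held by individual relocations, not just the relocation section's own target.
```
// Sketch only; Section, Relocation, and RelocationSection are stand-ins.
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Section {
  std::string Name;
};

struct Relocation {
  Section *Target; // e.g. the section the referenced symbol lives in.
};

struct RelocationSection {
  Section *AppliesTo;             // Already remapped before this fix.
  std::vector<Relocation> Relocs; // These need remapping too.
};

static void replaceSectionReferences(
    RelocationSection &RelSec,
    const std::map<Section *, Section *> &FromTo) {
  if (auto It = FromTo.find(RelSec.AppliesTo); It != FromTo.end())
    RelSec.AppliesTo = It->second;
  for (Relocation &R : RelSec.Relocs)
    if (auto It = FromTo.find(R.Target); It != FromTo.end())
      R.Target = It->second;
}

int main() {
  Section Old{".debug_info"}, New{".debug_info (compressed)"};
  RelocationSection RelSec{&Old, {{&Old}}};
  replaceSectionReferences(RelSec, {{&Old, &New}});
  std::cout << RelSec.AppliesTo->Name << " / "
            << RelSec.Relocs[0].Target->Name << "\n";
  return 0;
}
```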
Specifically, compute and print the Type and Section columns.
This is a re-commit of rL354833, after fixing the ASan problem found on a buildbot.
Differential Revision: https://reviews.llvm.org/D59060
llvm-svn: 355742
llvm-readelf prints relocation addends as:
<symbol value>[+-]<absolute addend>
where [+-] is determined by whether the addend is less than zero or not.
However, it does not print the +/- if there is no symbol, which meant
that negative addends were printed as their positive value with no indication
that this had happened. This patch stops the absolute conversion when
addends are negative and there is no associated symbol.
Reviewed by: Higuoxing, mattd, MaskRay
Differential Revision: https://reviews.llvm.org/D59095
llvm-svn: 355696
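A small sketch of the formatting rule above, using a hypothetical helper rather than the llvm-readelf code (which also prints the values in its own radix): with a symbol, print the symbol followed by a sign and the absolute addend; without one, print the addend as-is so negative values stay visibly negative.
```
// Hypothetical illustration of the addend formatting decision.
#include <cstdint>
#include <iostream>
#include <string>

static std::string formatAddend(const std::string &Symbol, int64_t Addend) {
  if (!Symbol.empty()) {
    char Sign = Addend < 0 ? '-' : '+';
    // Negate via uint64_t to avoid overflow on INT64_MIN.
    uint64_t Abs = Addend < 0 ? -static_cast<uint64_t>(Addend)
                              : static_cast<uint64_t>(Addend);
    return Symbol + " " + Sign + " " + std::to_string(Abs);
  }
  // No symbol: keep the sign instead of silently printing the absolute value.
  return std::to_string(Addend);
}

int main() {
  std::cout << formatAddend("foo", -8) << "\n"; // foo - 8
  std::cout << formatAddend("", -8) << "\n";    // -8
  return 0;
}
```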
Summary:
Since bottleneck hints are enabled via user request, it can be
confusing if no bottleneck information is presented. Such is the
case when no bottlenecks are identified. This patch emits a message
in that case.
Reviewers: andreadb
Reviewed By: andreadb
Subscribers: tschuett, gbedwell, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59098
llvm-svn: 355628
I need this to remove a binary from the LLD test suite.
The patch also simplifies the code a bit.
Differential revision: https://reviews.llvm.org/D59082
llvm-svn: 355591
This allows us to store more info about where we're emitting the remarks
without cluttering LLVMContext. This is needed for future support for
the remark section.
Differential Revision: https://reviews.llvm.org/D58996
Original llvm-svn: 355507
llvm-svn: 355514
This allows us to store more info about where we're emitting the remarks
without cluttering LLVMContext. This is needed for future support for
the remark section.
Differential Revision: https://reviews.llvm.org/D58996
llvm-svn: 355507
We should create a CompressedSection only if the section has the SHF_COMPRESSED flag
or its name starts with '.zdebug'.
Currently, we create it if the section's data starts with the ZLIB signature.
Differential revision: https://reviews.llvm.org/D59018
llvm-svn: 355501
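A hypothetical sketch of the check described above (not the llvm-objcopy code): a section is treated as compressed only when the SHF_COMPRESSED flag is set or the name marks it as zlib-gnu, never just because its data happens to start with the "ZLIB" magic bytes.
```
// Sketch only; isCompressedSection is a made-up helper.
#include <cstdint>
#include <iostream>
#include <string>

constexpr uint64_t SHF_COMPRESSED = 0x800; // Value from the ELF spec.

static bool isCompressedSection(const std::string &Name, uint64_t Flags) {
  if (Flags & SHF_COMPRESSED)
    return true;                        // ELF-style compression.
  return Name.rfind(".zdebug", 0) == 0; // zlib-gnu naming convention.
}

int main() {
  std::cout << isCompressedSection(".debug_info", SHF_COMPRESSED) << "\n"; // 1
  std::cout << isCompressedSection(".zdebug_info", 0) << "\n";             // 1
  // Data starting with "ZLIB" is not enough on its own.
  std::cout << isCompressedSection(".debug_str", 0) << "\n";               // 0
  return 0;
}
```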
Partly addresses PR15026.
There are a few tests that passed in invalid architectures, which are fixed in rL355349 and D58931.
Reviewers: echristo, efriedma, rengolin, atrick
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D58933
llvm-svn: 355455
Getting rid of the name "optimization remarks" for anything that
involves handling remarks on the client side.
It's safer to do this now, before we get stuck with that name in all the
APIs and public interfaces we decide to export to users in the future.
This renames llvm/tools/opt-remarks to llvm/tools/remarks-shlib, and now
generates `libRemarks.dylib` instead of `libOptRemarks.dylib`.
Differential Revision: https://reviews.llvm.org/D58535
llvm-svn: 355439
When --compress-debug-sections is given, llvm-objcopy does not compress
sections that have the "ZLIB" header in their data. Normally this signature is used
in the zlib-gnu compression format, but if zlib-gnu is used, the name of the compressed
section should start with .z* (e.g. .zdebug_info). If it does not, then it is not
in zlib-gnu format and the section should be treated as a normal uncompressed section.
Differential revision: https://reviews.llvm.org/D58908
llvm-svn: 355399
If zlib is not available, and --compress-debug-sections is passed,
we want to report an error. Currently, it is only reported for
the --compress_debug_sections= form of the option.
Fixes https://bugs.llvm.org/show_bug.cgi?id=40886.
I do not think there is a way to write a test for this.
Differential revision: https://reviews.llvm.org/D58909
llvm-svn: 355391
This patch adds a new flag named -bottleneck-analysis to print out information
about throughput bottlenecks.
MCA knows how to identify and classify dynamic dispatch stalls. However, it
doesn't know how to analyze and highlight kernel bottlenecks. The goal of this
patch is to teach MCA how to correlate increases in backend pressure to backend
stalls (and therefore, the loss of throughput).
From a Scheduler point of view, backend pressure is a function of the scheduler
buffer usage (i.e. how the number of uOps in the scheduler buffers changes over
time). Backend pressure increases (or decreases) when there is a mismatch
between the number of opcodes dispatched, and the number of opcodes issued in
the same cycle. Since buffer resources are limited, continuous increases in
backend pressure would eventually lead to dispatch stalls. So, there is a
strong correlation between dispatch stalls and how backpressure changes over
time.
This patch teaches how to identify situations where backend pressure increases
due to:
- unavailable pipeline resources.
- data dependencies.
Data dependencies may delay execution of instructions and therefore increase the
time that uOps have to spend in the scheduler buffers. That often translates to
an increase in backend pressure which may eventually lead to a bottleneck.
Contention on pipeline resources may also delay execution of instructions, and
lead to a temporary increase in backend pressure.
Internally, the Scheduler classifies instructions based on whether register /
memory operands are available or not.
An instruction is marked as "ready to execute" only if data dependencies are
fully resolved.
Every cycle, the Scheduler attempts to execute all instructions that are ready
to execute. If an instruction cannot execute because of unavailable pipeline
resources, then the Scheduler internally updates a BusyResourceUnits mask with
the ID of each unavailable resource.
ExecuteStage is responsible for tracking changes in backend pressure. If backend
pressure increases during a cycle because of contention on pipeline resources,
then ExecuteStage sends a "backend pressure" event to the listeners.
That event would contain information about instructions delayed by resource
pressure, as well as the BusyResourceUnits mask.
Note that ExecuteStage also knows how to identify situations where backpressure
increased because of delays introduced by data dependencies.
The SummaryView observes "backend pressure" events and prints out a "bottleneck
report".
Example of bottleneck report:
```
Cycles with backend pressure increase [ 99.89% ]
Throughput Bottlenecks:
Resource Pressure [ 0.00% ]
Data Dependencies: [ 99.89% ]
- Register Dependencies [ 0.00% ]
- Memory Dependencies [ 99.89% ]
```
A bottleneck report is printed out only if increases in backend pressure
eventually caused backend stalls.
About the time complexity:
Time complexity is linear in the number of instructions in the
Scheduler::PendingSet.
The average slowdown tends to be in the range of ~5-6%.
For memory intensive kernels, the slowdown can be significant if flag
-noalias=false is specified. In the worst case scenario I have observed a
slowdown of ~30% when flag -noalias=false was specified.
We can definitely recover part of that slowdown if we optimize class LSUnit (by
doing extra bookkeeping to speedup queries). For now, this new analysis is
disabled by default, and it can be enabled via flag -bottleneck-analysis. Users
of MCA as a library can enable the generation of pressure events through the
constructor of ExecuteStage.
This patch partially addresses https://bugs.llvm.org/show_bug.cgi?id=37494
Differential Revision: https://reviews.llvm.org/D58728
llvm-svn: 355308
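To make the pressure-tracking idea above concrete, here is a deliberately simplified toy model, not the llvm-mca Scheduler/ExecuteStage code: pressure grows in any cycle where more uOps are dispatched into the scheduler buffers than are issued out of them, and the fraction of such cycles is what the bottleneck report summarizes. The trace values below are invented for illustration.
```
// Toy model of backend pressure: dispatched vs. issued uOps per cycle.
#include <cstdint>
#include <iostream>
#include <vector>

struct CycleStats {
  unsigned Dispatched; // uOps entering the scheduler buffers this cycle.
  unsigned Issued;     // uOps leaving the buffers for execution this cycle.
};

int main() {
  // Example trace: issue keeps up for two cycles, then falls behind.
  std::vector<CycleStats> Trace = {{4, 4}, {4, 4}, {4, 2}, {4, 1}, {0, 3}};

  int64_t BufferedUOps = 0;       // Proxy for scheduler buffer usage.
  unsigned PressureIncreases = 0; // Cycles where backpressure grew.

  for (size_t Cycle = 0; Cycle < Trace.size(); ++Cycle) {
    const CycleStats &C = Trace[Cycle];
    BufferedUOps += static_cast<int64_t>(C.Dispatched) - C.Issued;
    if (C.Dispatched > C.Issued) {
      ++PressureIncreases;
      std::cout << "cycle " << Cycle << ": backend pressure increased, "
                << BufferedUOps << " uOps buffered\n";
    }
  }

  std::cout << "cycles with backend pressure increase: "
            << (100.0 * PressureIncreases / Trace.size()) << "%\n";
  return 0;
}
```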