llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-12-11 17:08:42 +00:00

Author	SHA1	Message	Date
David Blaikie	298720d324	DebugInfo: Attribute implicit boolean tests to the expression being tested, not to the outer use of that expression. This is half a fix for a GDB test suite failure that expects to start at 'a' in the following code: void func(int a) if (a && b) ... But instead, without this change, the comparison was assigned to '&&' (well, worse actually - because there was a chained 'a && b && c' and it was assigned to the second '&&' because of a recursive application of this bug) and then the load folded into the comparison so breaking on the function started at '&&' instead of 'a'. The other part of this needs to be fixed in LLVM where it's ignoring the location of the icmp and instead using the location of the branch instruction. The fix to the conditional operator is actually a no-op currently, because the conditional operator's location coincides with 'a' (the start of the conditional expression) but should probably be '?' instead. See the FIXME in the test case that mentions the ARCMigration tool failures when I tried to make that change. llvm-svn: 227356	2015-01-28 19:50:09 +00:00
Sanjay Patel	4058dd9f3f	invert check for less indentation; use local vars to reduce duplication; NFC llvm-svn: 227355	2015-01-28 19:44:21 +00:00
Kostya Serebryany	d2ef30feed	Add clang-format-fuzzer target Summary: This adds clang-format-fuzzer binary, which depends on the Fuzzer lib, see http://reviews.llvm.org/D7184 This fuzzer has found ~15 bugs so far, and I hope to set up a bot for it. Test Plan: run on a bot. Reviewers: samsonov, djasper Reviewed By: djasper Subscribers: curdeius, cfe-commits Differential Revision: http://reviews.llvm.org/D7202 llvm-svn: 227354	2015-01-28 19:39:18 +00:00
Colin LeMahieu	1de7e0d923	[Hexagon] Updating many V4 intrinsic patterns. Adding missing instruction and deleting unused classes. llvm-svn: 227353	2015-01-28 19:39:09 +00:00
Chandler Carruth	be09eb75aa	[LPM] Try to work around a bug with local-dynamic TLS on PowerPC 64. Sadly, this precludes optimizing it down to initial-exec or local-exec when statically linking, and in general makes the code slower on PPC 64, but there's nothing else for it until we can arrange to produce the correct bits for the linker. Lots of thanks to Ulrich for tracking this down and Bill for working on the long-term fix to LLVM so that we can relegate this to old host clang versions. I'll be watching the PPC build bots to make sure this effectively revives them. llvm-svn: 227352	2015-01-28 19:29:22 +00:00
Philip Reames	23cf2e2f97	Remove gc.root's performCustomLowering This is a refactoring to restructure the single user of performCustomLowering as a specific lowering pass and remove the custom lowering hook entirely. Before this change, the LowerIntrinsics pass (note to self: rename!) was essentially acting as a pass manager, but without being structured in terms of passes. Instead, it proxied calls to a set of GCStrategies internally. This adds a lot of conceptual complexity (i.e. GCStrategies are stateful!) for very little benefit. Since there's been interest in keeping the ShadowStackGC working, I extracted its custom lowering into a dedicated pass and just added that to the pass order. It will only run for functions which opt in to that gc. I wasn't able to find an easy way to preserve the runtime registration of custom lowering functionality. Given that no user of this exists that I'm aware of, I made the choice to just remove that. If someone really cares, we can look at restoring it via dynamic pass registration in the future. Note that despite the large diff, none of the lowering code actually changes. I added the framing needed to make it a pass and renamed the class, but that's it. Differential Revision: http://reviews.llvm.org/D7218 llvm-svn: 227351	2015-01-28 19:28:03 +00:00
Enrico Granata	2265acf39e	Harden against the process pointer being null - this seems like it shouldn't happen, except it did - by a user stopping the debugger while the variables view was refreshing Fixes rdar://19599357 llvm-svn: 227350	2015-01-28 19:23:51 +00:00
Chris Bieneman	a3dcc93812	Moving AddLiteralOption's declaration higher up in the header to make gcc happy. llvm-svn: 227348	2015-01-28 19:17:09 +00:00
Colin LeMahieu	94c33218e3	[Hexagon] Adding XTYPE/MPY intrinsic tests and some missing multiply instructions. llvm-svn: 227347	2015-01-28 19:16:17 +00:00
Chris Bieneman	d1d9430a05	Refactoring llvm command line parsing and option registration. Summary: The primary goal of this patch is to remove the need for MarkOptionsChanged(). That goal is accomplished by having addOption and removeOption properly sort the options. This patch puts the new add and remove functionality on a CommandLineParser class that is a placeholder. Some of the functionality in this class will need to be merged into the OptionRegistry, and other bits can hopefully be in a better abstraction. This patch also removes the RegisteredOptionList global, and the need for cl::Option objects to be linked list nodes. The changes in CommandLineTest.cpp are required because these changes shift when we validate that options are not duplicated. Before this change duplicate options were only found during certain cl API calls (like cl::ParseCommandLine). With this change duplicate options are found during option construction. Reviewers: dexonsmith, chandlerc, pete Reviewed By: pete Subscribers: pete, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D7132 llvm-svn: 227345	2015-01-28 19:00:25 +00:00
Filipe Cabecinhas	22e7635dc5	Testcase for PS4 target defaults (from r227215 and r227219) llvm-svn: 227343	2015-01-28 18:49:45 +00:00
Enrico Granata	7e0255c769	As promised, make this more efficient by only doing all the busy work when necessary llvm-svn: 227342	2015-01-28 18:45:28 +00:00
Rui Ueyama	0b55151d3e	Add a unit test for LinkerScript. llvm-svn: 227341	2015-01-28 18:38:50 +00:00
Alex Rosenberg	d558ffaf69	Assume code ownership for the PS4 to ensure patches get reviewed, per the Developer Policy. llvm-svn: 227340	2015-01-28 18:33:39 +00:00
Bjorn Steinbrink	88b2b57cc9	Fix build breakage caused by memory leaks in llvm-c-test. I accidentally introduced those in r227319. llvm-svn: 227339	2015-01-28 18:32:31 +00:00
Colin LeMahieu	19ed07c75a	[Hexagon] Deleting a lot of old variants of intrinsics and updating references. llvm-svn: 227338	2015-01-28 18:29:11 +00:00
Frederic Riss	d34551833f	[dsymutil] Add DwarfLinker class. It's an empty shell for now. Its main method just opens the debug map objects and parses their Dwarf info. Test that we at least do that correctly. llvm-svn: 227337	2015-01-28 18:27:01 +00:00
Alex Rosenberg	286f124da5	Enable pragma comment processing for PS4. Original patch by Yunzhong Gao! llvm-svn: 227336	2015-01-28 18:26:15 +00:00
Colin LeMahieu	39b846ce0f	[Hexagon] Converting XTYPE/BIT intrinsic patterns and adding tests. llvm-svn: 227335	2015-01-28 18:06:23 +00:00
Sanjay Patel	9bb601856e	use SDValue methods directly instead of getNode()->* ; NFCI llvm-svn: 227334	2015-01-28 18:01:31 +00:00
Rafael Espindola	a05b3b73a4	Simplify code. NFC. llvm-svn: 227333	2015-01-28 17:54:19 +00:00
Colin LeMahieu	fe03c9a678	[Hexagon] Replacing XTYPE/SHIFT intrinsic patterns. Adding tests and missing instructions with tests. llvm-svn: 227330	2015-01-28 17:37:59 +00:00
Oleksiy Vyalov	f8ce61c5d8	Launch lldb-gdbserver in same process group when launched remotely using lldb-platform - commit on behalf of flackr. http://reviews.llvm.org/D7211 llvm-svn: 227329	2015-01-28 17:36:59 +00:00
Nico Weber	64a74bf1cf	Fix indents on asan_symbolize.py's argument parsing code. No behavior change. llvm-svn: 227327	2015-01-28 17:29:57 +00:00
Nico Weber	406f640a68	Make asan_symbolize.py not crash on Windows. asan_symbolize.py isn't needed on Windows, but it's nice if asan has a unified UI on all platforms. So rather than have asan_symbolize.py die on startup due to it importing modules that don't exist on Windows, let it just echo the input. llvm-svn: 227326	2015-01-28 17:28:04 +00:00
Jozef Kolek	e10a02ecf0	[mips][microMIPS] Implement LWGP instruction Differential Revision: http://reviews.llvm.org/D6650 llvm-svn: 227325	2015-01-28 17:27:26 +00:00
Colin LeMahieu	fdbc5adbb6	[Hexagon] Replacing intrinsics for halfword adds and max/min word/dword. llvm-svn: 227322	2015-01-28 17:06:40 +00:00
Colin LeMahieu	c17902b89b	[Hexagon] Replacing old intrinsic tests with organized versions that match the reference manual. llvm-svn: 227321	2015-01-28 16:58:05 +00:00
Greg Fitzgerald	df0f5cd474	Remove PPC ELF target Differential Revision: http://reviews.llvm.org/D7225 llvm-svn: 227320	2015-01-28 16:37:43 +00:00
Bjorn Steinbrink	a09ac0085d	Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes Summary: MetadataAsValue uses a canonical format that strips the MDNode if it contains only a single constant value. This triggers an assertion when trying to cast the value to a MDNode. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7165 llvm-svn: 227319	2015-01-28 16:35:59 +00:00
Simon Atanasyan	e13a9624c2	[ELFYAML] Provide explicit value for relocation addendums in the test The `Addend` is an optional field of the `Relocation` YAML record. But we do not provide its default value while reading it from a YAML file and so it might remain uninitialized. I am going to fix the code in a separate commit. We might either make this field mandatory (at least for .rela sections) or specify 0 as a default value explicitly. llvm-svn: 227318	2015-01-28 16:22:50 +00:00
Michael Kuperstein	90e08320c9	[x32] Change the condition from bitness to LP64 for TCRETURNdi64. TCRETURNmi64, which was mistakenly changed in r227307 will wait for another day. llvm-svn: 227317	2015-01-28 16:11:35 +00:00
Tom Stellard	40ce8af4a5	R600: Move DataLayout to AMDGPUTargetMachine This is a follow up to r227113. It is now required to use the amdgcn target for SI and newer GPUs. llvm-svn: 227316	2015-01-28 16:04:26 +00:00
Tom Stellard	d99fb956a3	R600: Use a Southern Islands GPU as the default for the amdgcn target llvm-svn: 227315	2015-01-28 15:38:44 +00:00
Tom Stellard	eba5648ad2	R600: Use a Southern Islands GPU as the default for the amdgcn target llvm-svn: 227314	2015-01-28 15:38:42 +00:00
Hal Finkel	77366097f4	Fix typo (count correctly) in include/clang/AST/TypeNodes.def Patch by Mingjie Xing. llvm-svn: 227313	2015-01-28 15:11:18 +00:00
Nathan Sidwell	0ba1940b8d	PR 20146 make new diagnostic an ExtWarn llvm-svn: 227312	2015-01-28 14:48:39 +00:00
Hal Finkel	34c94d5caa	Correct the AggressiveAntiDepBreaker's handling of subregisters defining super registers As the AggressiveAntiDepBreaker iterated backward through a scheduling region, we must leave super registers live through subregister definitions so that all relevant subregister definitions are renamed together. The problem was that we were also discarding sub-register use locations as the sub-registers are redefined. The result is that we'd rename the super register along with some, but not all, subregister definitions. R0_D = {R0_L, R1_L} R0_L = {R0_S, R1_S} %R0_L<def> = TRLi9 16, pred:8, pred:%noreg %R1_L<def> = LSRLrr %R1_L<kill>, %R0_S, pred:8, pred:%noreg %R0_L<def> = LSRLrr %R2_L, %R0_S, pred:8, pred:%noreg, %R0_L<imp-use,kill> %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg %R4_D<def> = ASRDrr %R0_D<kill>, %R6_S Anti: %R4_D<def> = ASRDrr %R0_D<kill>, %R6_S Def Groups: R4_D=g213->g215(via R4_S)->g214(via R4_L)->g216(via R5_S)->g216(via R4_L)->g217(via R5_L) Use Groups: R0_D=g0->g218(last-use) R1_L->g219(last-use) R6_S=g204->g220(last-use) Anti: %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg Def Groups: R0_L=g208->g209(via R0_S)->g218(via R0_D)->g210(via R1_S)->g210(via R0_D) Antidep reg: R0_L (real dependency) Use Groups: R0_L=g210->g224(last-use) R0_S->g225(last-use) R1_S->g226(last-use) Anti: %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg Def Groups: R1_L=g219->g210(via R0_D) Antidep reg: R1_L (real dependency) Use Groups: R1_L=g210->g229(last-use) Anti: %R0_L<def> = LSRLrr %R2_L, %R0_S, pred:8, pred:%noreg, %R0_L<imp-use,kill> Def Groups: R0_L=g224->g225(via R0_S)->g210(via R0_D)->g226(via R1_S)->g226(via R0_D) Antidep reg: R0_L Use Groups: R2_L=g192 R0_S=g226->g230(last-use) R0_L=g226->g231(last-use) R1_S->g232(last-use) Anti: %R1_L<def> = LSRLrr %R1_L<kill>, %R0_S, pred:8, pred:%noreg Def Groups: R1_L=g229->g226(via R0_D) Antidep reg: R1_L Use Groups: 
R1_L=g226->g233(last-use) R0_S=g230 Anti: %R0_L<def> = TRLi9 16, pred:8, pred:%noreg Def Groups: R0_L=g231->g230(via R0_S)->g226(via R0_D)->g232(via R1_S)->g232(via R0_D) Antidep reg: R0_L Rename Candidates for Group g232: R0_D: elcInt64Regs :: R0_D R1_D R2_D R3_D R4_D R5_D R8_D R9_D R10_D R11_D R12_D R13_D R14_D R15_D R16_D R17_D R18_D R19_D R20_D R21_D R22_D R23_D R24_D R25_D R0_L: elcIntRegs :: R0_L R1_L R2_L R3_L R4_L R5_L R8_L R9_L R10_L R11_L R12_L R13_L R14_L R15_L R16_L R17_L R18_L R19_L R20_L R21_L R22_L R23_L R24_L R25_L R0_S: elcShrtRegs elcShrtRegs :: R0_S R1_S R2_S R3_S R4_S R5_S R8_S R9_S R10_S R11_S R12_S R13_S R14_S R15_S R16_S R17_S R18_S R19_S R20_S R21_S R22_S R23_S R24_S R25_S Find Registers: [R12_D: R12_D R12_L R12_S] Breaking anti-dependence edge on R0_L: R0_D->R12_D(1 refs) R0_L->R12_L(2 refs) R0_S->R12_S(2 refs) Use Groups: ... %R12_L<def> = TRLi9 16, pred:8, pred:%noreg %R1_L<def> = LSRLrr %R1_L<kill>, %R12_S, pred:8, pred:%noreg %R0_L<def> = LSRLrr %R2_L<kill>, %R12_S, pred:8, pred:%noreg, %R12_L<imp-use> %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg %R4_D<def> = ASRDrr %R12_D<kill>, %R6_S With this change, we now produce: Anti: %R4_D<def> = ASRDrr %R0_D<kill>, %R6_S Def Groups: R4_D=g213->g215(via R4_S)->g214(via R4_L)->g216(via R5_S)->g216(via R4_L)->g217(via R5_L) Use Groups: R0_D=g0->g218(last-use) R1_L->g219(last-use) R6_S=g204->g220(last-use) Anti: %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg Def Groups: R0_L=g208->g209(via R0_S)->g218(via R0_D)->g210(via R1_S)->g210(via R0_D) Antidep reg: R0_L (real dependency) Use Groups: R0_L=g210 Anti: %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg Def Groups: R1_L=g219->g210(via R0_D) Antidep reg: R1_L (real dependency) Use Groups: R1_L=g210 Anti: %R0_L<def> = LSRLrr %R2_L, %R0_S, pred:8, pred:%noreg, %R0_L<imp-use,kill> Def Groups: R0_L=g210->g210(via R0_D)->g210(via R0_D) Antidep reg: R0_L Use 
Groups: R2_L=g192 R0_S=g210 R0_L=g210 Anti: %R1_L<def> = LSRLrr %R1_L<kill>, %R0_S, pred:8, pred:%noreg Def Groups: R1_L=g210->g210(via R0_D) Antidep reg: R1_L Use Groups: R1_L=g210 R0_S=g210 Anti: %R0_L<def> = TRLi9 16, pred:8, pred:%noreg Def Groups: R0_L=g210->g210(via R0_D)->g210(via R0_D) Antidep reg: R0_L Rename Candidates for Group g210: R0_D: elcInt64Regs :: R0_D R1_D R2_D R3_D R4_D R5_D R8_D R9_D R10_D R11_D R12_D R13_D R14_D R15_D R16_D R17_D R18_D R19_D R20_D R21_D R22_D R23_D R24_D R25_D R0_L: elcIntRegs elcIntAIRegs elcIntRegs elcIntRegs elcIntRegs elcIntRegs :: R0_L R1_L R2_L R3_L R4_L R5_L R8_L R9_L R10_L R11_L R12_L R13_L R14_L R15_L R16_L R17_L R18_L R19_L R20_L R21_L R22_L R23_L R24_L R25_L R1_L: elcIntRegs elcIntRegs elcIntRegs elcIntRegs elcIntRegs :: R0_L R1_L R2_L R3_L R4_L R5_L R8_L R9_L R10_L R11_L R12_L R13_L R14_L R15_L R16_L R17_L R18_L R19_L R20_L R21_L R22_L R23_L R24_L R25_L R0_S: elcShrtRegs elcShrtRegs :: R0_S R1_S R2_S R3_S R4_S R5_S R8_S R9_S R10_S R11_S R12_S R13_S R14_S R15_S R16_S R17_S R18_S R19_S R20_S R21_S R22_S R23_S R24_S R25_S Find Registers: [R12_D: R12_D R12_L R13_L R12_S] Breaking anti-dependence edge on R0_L: R0_D->R12_D(1 refs) R0_L->R12_L(7 refs) R1_L->R13_L(5 refs) R0_S->R12_S(2 refs) Use Groups: ... %R12_L<def> = TRLi9 16, pred:8, pred:%noreg %R13_L<def> = LSRLrr %R13_L<kill>, %R12_S, pred:8, pred:%noreg %R12_L<def> = LSRLrr %R2_L<kill>, %R12_S<kill>, pred:8, pred:%noreg, %R12_L<imp-use,kill> %R13_L<def> = ANDLri %R13_L<kill>, 2047, pred:8, pred:%noreg %R12_L<def> = ANDLri %R12_L<kill>, 2047, pred:8, pred:%noreg %R4_D<def> = ASRDrr %R12_D, %R6_S, %R12_L<imp-def>, %R12_S<imp-def>, %R13_S<imp-def> As demonstrated by this example, this is also somewhat unfortunate, because there is actually no need to rename the super register in this case (it is fully covered by later subregister definitions), but we don't seem to track enough information here to exploit that either. 
Thanks to Daniil Troshkov for reporting the issue. The debug outputs in this commit message are from Daniil. llvm-svn: 227311	2015-01-28 14:44:14 +00:00
Sean Silva	fdcbb0284e	Avoid testing for a particular choice of resource dir. Without this patch, this test was accidentally testing that CLANG_RESOURCE_DIR, CLANG_LIBDIR_SUFFIX, and CLANG_VERSION_STRING were set to a particular set of values. The test was also getting pretty hairy since it was attempting to craft a regular expression that covered "all" possible combinations of settings for these configure-time constants. Clean it up by directly capturing the resource directory in a FileCheck variable. llvm-svn: 227310	2015-01-28 14:19:08 +00:00
Francisco Lopes da Silva	0c010cddb3	Improves overload completion result chunks. The code building the code completion string for overloads was providing less detail compared to the one building completion strings for function declarations. There was no information about optionals and no information about what's a parameter and what's a function identifier; everything besides ResultType, CurrentParameter and special characters was classified as Text. This makes code completion strings for overload candidates follow a pattern very similar, but not identical, to the one in use for function declarations: - return type chunk: ResultType - function identifier chunk: Text - parameter chunks: Placeholder - optional parameter chunks: Optional - current parameter chunk: CurrentParameter llvm-svn: 227309	2015-01-28 14:17:22 +00:00
Michael Kuperstein	951995821a	[X86] Reduce some 32-bit imuls into lea + shl Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well. Differential Revision: http://reviews.llvm.org/D7196 llvm-svn: 227308	2015-01-28 14:08:22 +00:00
Michael Kuperstein	f387611ac2	[x32] Enable sibcall optimization on x32. This includes two things: 1) Fix TCRETURNdi and TCRETURN64di patterns to check the right thing (LP64 as opposed to target bitness). 2) Allow LEA64_32 in MatchingStackOffset. llvm-svn: 227307	2015-01-28 13:38:48 +00:00
Sean Silva	52c7dcd55d	[docs] Use slightly more proper .rst markup Again, I'd like to emphasize to everyone that this sort of markup change is not what you should be concerned about when writing docs. Focus on content. I applaud Chandler for focusing on the fantastic content of this new section! llvm-svn: 227305	2015-01-28 10:36:41 +00:00
Sean Silva	b1548edf25	[docs] [cleanup] No need for a comment around C++11 override llvm-svn: 227304	2015-01-28 10:26:29 +00:00
Elena Demikhovsky	7b0dd39db6	AVX-512: Added FMA intrinsics with rounding mode By Asaf Badouh and Elena Demikhovsky Added special nodes for rounding: FMADD_RND, FMSUB_RND... This will prevent merging between nodes with rounding and other standard nodes. llvm-svn: 227303	2015-01-28 10:21:27 +00:00
Craig Topper	7d3c6d307a	[X86] Teach disassembler to handle illegal immediates on AVX512 integer compare instructions. llvm-svn: 227302	2015-01-28 10:09:56 +00:00
Craig Topper	6772eac490	[X86] Merge printSSECC and printAVXCC. They only differed by an assertion. llvm-svn: 227301	2015-01-28 10:09:52 +00:00
Chandler Carruth	16b670ec20	[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is substantially faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. 
llvm-svn: 227300	2015-01-28 09:52:14 +00:00
Chandler Carruth	5b0d3e3f3a	[LPM] A targeted but somewhat horrible fix to the legacy pass manager's querying of the pass registry. The pass manager relies on the static registry of PassInfo objects to perform all manner of its functionality. I don't understand why it does much of this. My very vague understanding is that this registry is touched both during static initialization and while each pass is being constructed. As a consequence it is hard to make accessing it not require acquiring some lock. This lock ends up in the hot path of setting up, tearing down, and invalidating analyses in the legacy pass manager. On most systems you can observe this as a non-trivial % of the time spent in 'ninja check-llvm'. However, I haven't really seen it be more than 1% in extreme cases of compiling more real-world software, including LTO. Unfortunately, some of the GPU JITs are seeing this taking essentially all of their time because they have very small IR running through a small pass pipeline very many times (at least, this is the vague understanding I have of it). This patch tries to minimize the cost of looking up PassInfo objects by leveraging the fact that the objects themselves are immutable and they are allocated separately on the heap and so don't have their address change. It also requires a change I made the last time I tried to debug this problem which removed the ability to de-register a pass from the registry. This patch creates a single access path to these objects inside the PMTopLevelManager which memoizes the result of querying the registry. This is somewhat gross as I don't really know if PMTopLevelManager is the right place to put it, and I dislike using a mutable member to memoize things, but it seems to work. For long-lived pass managers this should completely eliminate the cost of acquiring locks to look into the pass registry once the memoized cache is warm. 
For 'ninja check' I measured about 1.5% reduction in CPU time and in total time on a machine with 32 hardware threads. For normal compilation, I don't know how much this will help, sadly. We will still pay the cost while we populate the memoized cache. I don't think it will hurt though, and for LTO or compiles with many small functions it should still be a win. However, for tight loops around a pass manager with many passes and small modules, this will help tremendously. On the AArch64 backend I saw nearly 50% reductions in time to complete 2000 cycles of spinning up and tearing down the pipeline. Measurements from Owen of an actual long-lived pass manager show more along the lines of 10% improvements. Differential Revision: http://reviews.llvm.org/D7213 llvm-svn: 227299	2015-01-28 09:47:21 +00:00
Elena Demikhovsky	45f0448081	Fold fcmp in cases where value is provably non-negative. By Arch Robison. This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where: - the predicate is olt or uge - the first operand is provably not less than zero - the second operand is zero The motivation for handling these cases is optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y). http://reviews.llvm.org/D6972 llvm-svn: 227298	2015-01-28 08:03:58 +00:00
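
The fcmp fold described in the last entry above (r227298) rests on a small fact about floating-point comparisons: if the first operand is -0, +0, positive, or NaN, then `olt` against zero is always false and `uge` against zero is always true. The following is a minimal Python sketch of that reasoning only, not the LLVM implementation. The helper names `fcmp_olt`, `fcmp_uge`, and `fold_results` are hypothetical and model the two predicates on Python floats.

```python
import math


def fcmp_olt(a, b):
    """Ordered less-than: false if either operand is NaN, else a < b."""
    return not (math.isnan(a) or math.isnan(b)) and a < b


def fcmp_uge(a, b):
    """Unordered or greater-or-equal: true if either operand is NaN, else a >= b."""
    return math.isnan(a) or math.isnan(b) or a >= b


# Sample values of the kinds the commit message lists as
# "provably not less than zero": -0, +0, positive, NaN.
NOT_LESS_THAN_ZERO = [-0.0, 0.0, 1.5, math.inf, math.nan]


def fold_results(values):
    """Return (olt, uge) against zero for each value."""
    return [(fcmp_olt(v, 0.0), fcmp_uge(v, 0.0)) for v in values]


if __name__ == "__main__":
    for value, (olt, uge) in zip(NOT_LESS_THAN_ZERO, fold_results(NOT_LESS_THAN_ZERO)):
        print(f"{value!r}: olt={olt} uge={uge}")
```

For every sample value the pair is constant, which is why a compiler that can prove the operand falls in this set may replace the comparison with a constant. A negative input such as `-1.0` gives the opposite pair, so the fold does not apply there.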


191564 Commits