archived-llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2026-01-31 01:35:20 +01:00

Author	SHA1	Message	Date
Nick Desaulniers	1edbe64ce4	[LoopRotate] fix crash encountered with callbr Summary: While implementing inlining support for callbr (https://bugs.llvm.org/show_bug.cgi?id=40722), I hit a crash in Loop Rotation when trying to build the entire x86 Linux kernel (drivers/char/random.c). This is a small fix up to r353563. Test case is drivers/char/random.c (with callbr's inlined), then ran through creduce, then `opt -opt-bisect-limit=<limit>`, then bugpoint. Thanks to Craig Topper for immediately spotting the fix, and teaching me how to fish. Reviewers: craig.topper, jyknight Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D58929 llvm-svn: 355564	2019-03-06 23:04:40 +00:00
Nikita Popov	4018037d9b	[InstCombine] Fold add nsw + sadd.with.overflow Fold `add nsw` and `sadd.with.overflow` with constants if the addition does not overflow. Part of https://bugs.llvm.org/show_bug.cgi?id=38146. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D58881 llvm-svn: 355530	2019-03-06 18:30:00 +00:00
Evgeniy Stepanov	973bc4b579	[msan] Instrument x86 BMI intrinsics. Summary: They simply shuffle bits. MSan needs to do the same with shadow bits, after making sure that the shuffle mask is fully initialized. Reviewers: pcc, vitalybuka Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58858 llvm-svn: 355348	2019-03-04 22:58:20 +00:00
Jordan Rupprecht	6daafbf4be	[NFC] Fix PGO link error in shared libs build llvm-svn: 355346	2019-03-04 22:54:44 +00:00
Sanjay Patel	14a0e6ae7d	[ConstantHoisting] avoid hang/crash from unreachable blocks (PR40930) I'm not too familiar with this pass, so there might be a better solution, but this appears to fix the degenerate: PR40930 PR40931 PR40932 PR40934 ...without affecting any real-world code. As we've seen in several other passes, when we have unreachable blocks, they can contain semi-bogus IR and/or cause unexpected conditions. We would not typically expect these patterns to make it this far, but we have to guard against them anyway. llvm-svn: 355337	2019-03-04 20:57:14 +00:00
Rong Xu	888aa59db9	[PGO] Context sensitive PGO (part 3) Part 3 of CSPGO changes (mostly related to PassMananger). Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355330	2019-03-04 20:21:27 +00:00
Davide Italiano	85cfb00565	[InstCombine] Mark debug values as unavailable after DCE. Fixes PR40838. llvm-svn: 355301	2019-03-04 04:38:58 +00:00
Sanjay Patel	0879ff91d9	[InstCombine] move add after smin/smax Follow-up to rL355221. This isn't specifically called for within PR14613, but we'll get there eventually if it's not already requested in some other bug report. https://rise4fun.com/Alive/5b0 Name: smax Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i8 %x, C0 %cond = icmp sgt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp sgt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nsw i8 %u2, C0 Name: smin Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i32 %x, C0 %cond = icmp slt i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp slt i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nsw i32 %u2, C0 llvm-svn: 355272	2019-03-02 16:45:10 +00:00
Philip Reames	0c96688c0a	[InstCombine] Extend saturating idempotent atomicrmw transform to FP I'm assuming that the nan propogation logic for InstructonSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw. Differential Revision: https://reviews.llvm.org/D58836 llvm-svn: 355222	2019-03-01 19:50:36 +00:00
Sanjay Patel	177b843b47	[InstCombine] move add after umin/umax In the motivating cases from PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 ...moving the add enables us to narrow the min/max which eliminates zext/trunc which enables signficantly better vectorization. But that bug is still not completely fixed. https://rise4fun.com/Alive/5KQ Name: umax Pre: C1 u>= C0 %a = add nuw i8 %x, C0 %cond = icmp ugt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp ugt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nuw i8 %u2, C0 Name: umin Pre: C1 u>= C0 %a = add nuw i32 %x, C0 %cond = icmp ult i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp ult i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nuw i32 %u2, C0 llvm-svn: 355221	2019-03-01 19:42:40 +00:00
Philip Reames	bd51fb8638	[LICM] Infer proper alignment from loads during scalar promotion This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is not to be guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load. For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an incredibly rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we may fail to codegen the operation, or we may simply generate a slow call to a library function. The part that makes this super hard to see in practice is that the memory location actually is well aligned, and instcombine knows that. So, to see a failure, you have to have a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit. Differential Revision: https://reviews.llvm.org/D58809 llvm-svn: 355217	2019-03-01 18:45:05 +00:00
Philip Reames	2a3e3d83a1	[InstCombine] Extend "idempotent" atomicrmw optimizations to floating point An idempotent atomicrmw is one that does not change memory in the process of execution. We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR. Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load. As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future. Differential Revision: https://reviews.llvm.org/D58251 llvm-svn: 355210	2019-03-01 18:00:07 +00:00
Jonas Hahnfeld	065a990766	Hide two unused debugging methods, NFCI. GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205	2019-03-01 17:15:21 +00:00
Manman Ren	3f92e947c8	Try to fix NetBSD buildbot breakage introduced in D57463. By including the header file in the source. llvm-svn: 355202	2019-03-01 15:25:24 +00:00
Fangrui Song	2fb1d0fd03	[ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM Summary: ConstIntInfoVec contains elements extracted from the previous function. In new PM, releaseMemory() is not called and the dangling elements can cause segfault in findConstantInsertionPoint. Rename releaseMemory() to cleanup() to deliver the idea that it is mandatory and call cleanup() in ConstantHoistingPass::runImpl to fix this. Reviewers: ormris, zzheng, dmgreen, wmi Reviewed By: ormris, wmi Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58589 llvm-svn: 355174	2019-03-01 05:27:01 +00:00
Reid Kleckner	3d80173d14	[sancov] Instrument reachable blocks that end in unreachable Summary: These sorts of blocks often contain calls to noreturn functions, like longjmp, throw, or trap. If they don't end the program, they are "interesting" from the perspective of sanitizer coverage, so we should instrument them. This was discussed in https://reviews.llvm.org/D57982. Reviewers: kcc, vitalybuka Subscribers: llvm-commits, craig.topper, efriedma, morehouse, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D58740 llvm-svn: 355152	2019-02-28 22:54:30 +00:00
Manman Ren	cfe7d6c196	Add a module pass for order file instrumentation The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name. In this pass, we add three global variables: (1) an order file buffer: a circular buffer at its own llvm section. (2) a bitmap for each module: one byte for each function to say if the function is already executed. (3) a global index to the order file buffer. At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index. Differential Revision: https://reviews.llvm.org/D57463 llvm-svn: 355133	2019-02-28 20:13:38 +00:00
Rong Xu	7da9d9200c	[PGO] Context sensitive PGO (part 2) Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131	2019-02-28 19:55:07 +00:00
Sanjay Patel	a333b0fd18	[InstCombine] fold adds of constants separated by sext/zext This is part of a transform that may be done in the backend: D13757 ...but it should always be beneficial to fold this sooner in IR for all targets. https://rise4fun.com/Alive/vaiW Name: sext add nsw %add = add nsw i8 %i, C0 %ext = sext i8 %add to i32 %r = add i32 %ext, C1 => %s = sext i8 %i to i32 %r = add i32 %s, sext(C0)+C1 Name: zext add nuw %add = add nuw i8 %i, C0 %ext = zext i8 %add to i16 %r = add i16 %ext, C1 => %s = zext i8 %i to i16 %r = add i16 %s, zext(C0)+C1 llvm-svn: 355118	2019-02-28 19:05:26 +00:00
Chijun Sima	a16394ad24	Make MergeBlockIntoPredecessor conformant to the precondition of calling DTU.applyUpdates Summary: It is mentioned in the document of DTU that "It is illegal to submit any update that has already been submitted, i.e., you are supposed not to insert an existent edge or delete a nonexistent edge." It is dangerous to violet this rule because DomTree and PostDomTree occasionally crash on this scenario. This patch fixes `MergeBlockIntoPredecessor`, making it conformant to this precondition. Reviewers: kuhar, brzycki, chandlerc Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58444 llvm-svn: 355105	2019-02-28 16:47:18 +00:00
Bjorn Pettersson	a490753ab0	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Eric Christopher	df9622f2af	Temporarily revert "ArgumentPromotion should copy all metadata to new Function" and the dependent patch "Refine ArgPromotion metadata handling" as they're causing segfaults in argument promotion. This reverts commits r354032 and r353537. llvm-svn: 355060	2019-02-28 01:11:12 +00:00
Reid Kleckner	eab9f78208	[InstrProf] Use separate comdat group for data and counters Summary: I hadn't realized that instrumentation runs before inlining, so we can't use the function as the comdat group. Doing so can create relocations against discarded sections when references to discarded __profc_ variables are inlined into functions outside the function's comdat group. In the future, perhaps we should consider standardizing the comdat group names that ELF and COFF use. It will save object file size, since __profv_$sym won't appear in the symbol table again. Reviewers: xur, vsk Subscribers: eraman, hiraditya, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58737 llvm-svn: 355044	2019-02-27 23:38:44 +00:00
Alina Sbirlea	8601cb2c88	[MemorySSA] Make insertDef insert corresponding phi nodes. Summary: The original assumption for the insertDef method was that it would not materialize Defs out of no-where, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of no-where". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this usecase does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave corectly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040	2019-02-27 22:20:22 +00:00
Rong Xu	d2acd4c4e2	Recommit r354930 "[PGO] Context sensitive PGO (part 1)" Fixed UBSan failures. llvm-svn: 355005	2019-02-27 17:24:33 +00:00
Vlad Tsyrklevich	3def5ac311	Revert "[PGO] Context sensitive PGO (part 1)" This reverts commit r354930, it was causing UBSan failures. llvm-svn: 354953	2019-02-27 03:45:28 +00:00
Vedant Kumar	2d2c28123a	[HotColdSplit] Disable splitting for sanitized functions Splitting can make sanitizer errors harder to understand, as the trapping instruction may not be in the function where the bug was detected. rdar://48142697 llvm-svn: 354931	2019-02-26 22:55:46 +00:00
Rong Xu	a0b71feeed	[PGO] Context sensitive PGO (part 1) Current PGO profile counts are not context sensitive. The branch probabilities for the inlined functions are kept the same for all call-sites, and they might be very different from the actual branch probabilities. These suboptimal profiles can greatly affect some downstream optimizations, in particular for the machine basic block placement optimization. In this patch, we propose to have a post-inline PGO instrumentation/use pass, which we called Context Sensitive PGO (CSPGO). For the users who want the best possible performance, they can perform a second round of PGO instrument/use on the top of the regular PGO. They will have two sets of profile counts. The first pass profile will be manly for inline, indirect-call promotion, and CGSCC simplification pass optimizations. The second pass profile is for post-inline optimizations and code-gen optimizations. A typical usage: // Regular PGO instrumentation and generate pass1 profile. > clang -O2 -fprofile-generate source.c -o gen > ./gen > llvm-profdata merge default.profraw -o pass1.profdata // CSPGO instrumentation. > clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2 > ./gen2 // Merge two sets of profiles > llvm-profdata merge default.profraw pass1.profdata -o profile.profdata // Use the combined profile. Pass manager will invoke two PGO use passes. > clang -O2 -fprofile-use=profile.profdata -o use This change touches many components in the compiler. The reviewed patch (D54175) will committed in phrases. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 354930	2019-02-26 22:37:46 +00:00
Alina Sbirlea	bdfb172eb0	[MemorySSA & SimpleLoopUnswitch] Update MemorySSA in ReplaceUsesOfWith. SimpleLoopUnswitch must update MemorySSA when removing instructions. Resolves PR39197. llvm-svn: 354919	2019-02-26 19:44:52 +00:00
Sanjay Patel	df3362d6e8	[InstCombine] canonicalize more unsigned saturated add with 'not' Yet another pattern variation suggested by: https://bugs.llvm.org/show_bug.cgi?id=14613 There are 8 more potential commuted patterns here on top of the 8 that were already handled (rL354221, rL354276, rL354393). We have the obvious commute of the 'add' + commute of the cmp predicate/operands (ugt/ult) + commute of the select operands: Name: base %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %x, %y %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %y, %x %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %y, %x %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt + commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %x, %y %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/den llvm-svn: 354887	2019-02-26 15:18:49 +00:00
Simon Pilgrim	bbece4ffe7	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790	2019-02-25 15:42:02 +00:00
Sanjay Patel	ab450c2c65	[InstCombine] canonicalize add/sub with bool add A, sext(B) --> sub A, zext(B) We have to choose 1 of these forms, so I'm opting for the zext because that's easier for value tracking. The backend should be prepared for this change after: D57401 rL353433 This is also a preliminary step towards reducing the amount of bit hackery that we do in IR to optimize icmp/select. That should be waiting to happen at a later optimization stage. The seeming regression in the fuzzer test was discussed in: D58359 We were only managing that fold in instcombine by luck, and other passes should be able to deal with that better anyway. llvm-svn: 354748	2019-02-24 16:57:45 +00:00
Matt Arsenault	74315cc012	BreakCriticalEdges: Update PostDominatorTree llvm-svn: 354673	2019-02-22 15:01:41 +00:00
Roman Tereshin	47dd2adc8b	[LowerSwitch][AMDGPU] Do not handle impossible values This patch adds LazyValueInfo to LowerSwitch to compute the range of the value being switched over and reduce the size of the tree LowerSwitch builds to lower a switch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D58096 llvm-svn: 354670	2019-02-22 14:33:46 +00:00
Chijun Sima	b98fe759b3	[DTU] Refine the interface and logic of applyUpdates Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669	2019-02-22 13:48:38 +00:00
Alina Sbirlea	58c46a9180	[MemorySSA & LoopPassManager] Resolve PR40038. The correct edge being deleted is not to the unswitched exit block, but to the original block before it was split. That's the key in the map, not the value. The insert is correct. The new edge is to the .split block. The splitting turns OriginalBB into: OriginalBB -> OriginalBB.split. Assuming the orignal CFG edge: ParentBB->OriginalBB, we must now delete ParentBB->OriginalBB, not ParentBB->OriginalBB.split. llvm-svn: 354656	2019-02-22 07:18:37 +00:00
Chijun Sima	43cc9261be	[DTU] Deprecate insertEdge/deleteEdge Summary: This patch converts all existing `insertEdge/deleteEdge` to `applyUpdates` and marks `insertEdge/deleteEdge` as deprecated. Reviewers: kuhar, brzycki Reviewed By: kuhar, brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58443 llvm-svn: 354652	2019-02-22 05:41:43 +00:00
Alina Sbirlea	5806e8c89e	[MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks. MemorySSA is now updated when forming dedicated exit blocks. Resolves PR40037. llvm-svn: 354623	2019-02-21 21:13:34 +00:00
Alina Sbirlea	25fcda84f5	[LoopSimplifyCFG] Update MemorySSA after r353911. Summary: MemorySSA is not properly updated in LoopSimplifyCFG after recent changes. Use SplitBlock utility to resolve that and clear all updates once handleDeadExits is finished. All updates that follow are removal of edges which are safe to handle via the removeEdge() API. Also, deleting dead blocks is done correctly as is, i.e. delete from MemorySSA before updating the CFG and DT. Reviewers: mkazantsev, rtereshin Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58524 llvm-svn: 354613	2019-02-21 19:54:05 +00:00
Alina Sbirlea	91b1a14a23	[EarlyCSE] Cleanup deadcode. [NFCI] Summary: Cleanup nop assignments. Reviewers: george.burgess.iv, davide Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58308 llvm-svn: 354612	2019-02-21 19:49:57 +00:00
Joey Gouly	94650fe5f9	[InferAddressSpaces] Fix fallthrough error llvm-svn: 354580	2019-02-21 13:10:37 +00:00
Joey Gouly	600efea245	[InferAddressSpaces] Fix crash on select of non-ptr operands Check the operands of a select are pointers, to determine if it is an address expression or not. https://reviews.llvm.org/D58226 llvm-svn: 354576	2019-02-21 12:31:36 +00:00
Max Kazantsev	0aad8d4c6c	[LoopSimplifyCFG] Add missing MSSA edge deletion When we create fictive switch in preheader, we should take care about MSSA and delete edge between old preheader and header. llvm-svn: 354547	2019-02-21 05:51:29 +00:00
Wei Mi	3b1ad163ed	[Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled is false. Right now for inliner and partial inliner, we always pass the address of a valid ORE object to getInlineCost even if RemarkEnabled is false because of no -Rpass is specified. Since ComputeFullInlineCost will be set to true if ORE is non-null in getInlineCost, this introduces the problem that in getInlineCost we cannot return early even if we already know the cost is definitely higher than the threshold. It is a general problem for compile time. This patch fixes that by pass nullptr as the ORE argument if RemarkEnabled is false. Differential Revision: https://reviews.llvm.org/D58399 llvm-svn: 354542	2019-02-21 02:57:52 +00:00
Philip Reames	6b9bc58980	[GVN] Small tweaks to comments, style, and missed vector handling Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches. The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast. llvm-svn: 354412	2019-02-20 00:31:28 +00:00
Philip Reames	73a5d1b25e	[GVN] Fix last crasher w/non-integral pointers Same case as for memset and memcpy, but this time for clobbering stores and loads. We still can't allow coercion to or from non-integrals, regardless of the transform. Now that I'm done the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers. My appologies, I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues. My chanks to Cherry Zhang for helping debug. llvm-svn: 354407	2019-02-20 00:15:54 +00:00
Philip Reames	ce8442166e	[GVN] Fix a crash bug w/non-integral pointers and memtransfers Problem is very similiar to the one fixed for memsets in r354399, we try to coerce a value to non-integral type, and then crash while try to do so. Since we shouldn't be doing such coercions to start with, easy fix. From inspection, I see two other cases which look to be similiar and will follow up with most test cases and fixes if confirmed. llvm-svn: 354403	2019-02-19 23:49:38 +00:00
Philip Reames	04deb81215	[GVN] Fix a non-integral pointer bug w/vector types GVN generally doesn't forward structs or array types, but it will forward vector types to non-vectors and vice versa. As demonstrated in tests, we need to inhibit the same set of transforms for vector of non-integral pointers as for non-integral pointers themselves. llvm-svn: 354401	2019-02-19 23:19:51 +00:00
Philip Reames	26002d519f	[GVN] Fix a crash bug around non-integral pointers If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed. Such a forward is not legal in general, but we can forward null pointers. Test for both cases are included. llvm-svn: 354399	2019-02-19 23:07:15 +00:00
Sanjay Patel	581944d605	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354393	2019-02-19 22:14:21 +00:00

1 2 3 4 5 ...

21395 Commits