archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
David Green	2c4ca6832f	[InstCombine] Signed saturation patterns This adds an instcombine matcher for code that attempts to perform signed saturating arithmetic by casting to a higher type. Unsigned cases are already matched, this adds extra matches for the more complex signed cases, which involves matching the min(max(add a b)) nodes with proper extends to ensure legality. Differential Revision: https://reviews.llvm.org/D68651 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375505 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-22 15:39:47 +00:00
David Green	6e8533b056	[InstCombine] Signed saturation tests. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375503 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-22 14:49:40 +00:00
Petar Avramovic	e6fb6db568	[MIParser] Set RegClassOrRegBank during instruction parsing MachineRegisterInfo::createGenericVirtualRegister sets RegClassOrRegBank to static_cast<RegisterBank *>(nullptr). MIParser on the other hand doesn't. When we attempt to constrain Register Class on such VReg, additional COPY is generated. This way we avoid COPY instructions showing in test that have MIR input while they are not present with llvm-ir input that was used to create given MIR for a -run-pass test. Differential Revision: https://reviews.llvm.org/D68946 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375502 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-22 14:25:37 +00:00
Petar Avramovic	ee57dd4921	[MIPS GlobalISel] Select MSA vector generic and builtin add Select vector G_ADD for MIPS32 with MSA. We have to set bank for vector operands to fprb and selectImpl will do the rest. __builtin_msa_addv_<format> will be transformed into G_ADD in legalizeIntrinsic and selected in the same way. __builtin_msa_addvi_<format> will be directly selected into ADDVI_<format> in legalizeIntrinsic. MIR tests for it have unnecessary additional copies. Capture current state of tests with run-pass=legalizer with a test in test/CodeGen/MIR/Mips. Differential Revision: https://reviews.llvm.org/D68984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375501 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-22 13:51:57 +00:00
Nemanja Ivanovic	4a0e8977ca	[PowerPC] Turn on CR-Logical reducer pass This re-commits r375152 which was pulled in r375233 because it broke the EXPENSIVE_CHECKS bot on Windows. The reason for the failure was a bug in the pass that the commit turned on by default. This patch fixes that bug and turns the pass back on. This patch has been verified on the buildbot that originally failed thanks to Simon Pilgrim. Differential revision: https://reviews.llvm.org/D52431 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375497 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-22 12:20:38 +00:00
Eugene Leviant	55b4ec9194	[ThinLTO] Don't internalize during promotion Differential revision: https://reviews.llvm.org/D69107 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375493 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-22 09:24:12 +00:00
Simon Pilgrim	1e4c44b0d4	[X86][SSE] Add OR(EXTRACTELT(X,0),OR(EXTRACTELT(X,1))) -> MOVMSK+CMP reduction combine git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375463 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 22:36:31 +00:00
Simon Pilgrim	d8c44a50c7	[X86][SSE] Add OR(EXTRACTELT(X,0),OR(EXTRACTELT(X,1))) movmsk v2X64 tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375462 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 22:36:07 +00:00
Austin Kerbow	34e11f6a0a	AMDGPU/GlobalISel: Legalize fast unsafe FDIV Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375460 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 22:18:26 +00:00
Roman Lebedev	d8ca3c0d57	[CVP] No-wrap deduction for `shl` Summary: This is the last `OverflowingBinaryOperator` for which we don't deduce flags. D69217 taught `ConstantRange::makeGuaranteedNoWrapRegion()` about it. The effect is better than of the `mul` patch (D69203): \| statistic \| old \| new \| delta \| % change \| \| correlated-value-propagation.NumAddNUW \| 7145 \| 7144 \| -1 \| -0.0140% \| \| correlated-value-propagation.NumAddNW \| 12126 \| 12125 \| -1 \| -0.0082% \| \| correlated-value-propagation.NumAnd \| 443 \| 446 \| 3 \| 0.6772% \| \| correlated-value-propagation.NumNSW \| 5986 \| 7158 \| 1172 \| 19.5790% \| \| correlated-value-propagation.NumNUW \| 10512 \| 13304 \| 2792 \| 26.5601% \| \| correlated-value-propagation.NumNW \| 16498 \| 20462 \| 3964 \| 24.0272% \| \| correlated-value-propagation.NumShlNSW \| 0 \| 1172 \| 1172 \| \| \| correlated-value-propagation.NumShlNUW \| 0 \| 2793 \| 2793 \| \| \| correlated-value-propagation.NumShlNW \| 0 \| 3965 \| 3965 \| \| \| instcount.NumAShrInst \| 13824 \| 13790 \| -34 \| -0.2459% \| \| instcount.NumAddInst \| 277584 \| 277586 \| 2 \| 0.0007% \| \| instcount.NumAndInst \| 66061 \| 66056 \| -5 \| -0.0076% \| \| instcount.NumBrInst \| 709153 \| 709147 \| -6 \| -0.0008% \| \| instcount.NumICmpInst \| 483709 \| 483708 \| -1 \| -0.0002% \| \| instcount.NumSExtInst \| 79497 \| 79496 \| -1 \| -0.0013% \| \| instcount.NumShlInst \| 40691 \| 40654 \| -37 \| -0.0909% \| \| instcount.NumSubInst \| 61997 \| 61996 \| -1 \| -0.0016% \| \| instcount.NumZExtInst \| 68208 \| 68211 \| 3 \| 0.0044% \| \| instcount.TotalBlocks \| 843916 \| 843910 \| -6 \| -0.0007% \| \| instcount.TotalInsts \| 7387528 \| 7387448 \| -80 \| -0.0011% \| Reviewers: nikic, reames, sanjoy, timshen Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69277 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375455 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 21:31:19 +00:00
Quentin Colombet	c1a274e767	[GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors Teach the CombinerHelper how to turn shuffle_vectors, that concatenate vectors, into concat_vectors and add this combine to the AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69149 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375452 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 20:39:58 +00:00
Matt Arsenault	208bbb1797	AMDGPU: Use CopyToReg for interp intrinsic lowering This doesn't use the default value, so doesn't benefit from the hack to help optimize it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375450 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 19:53:49 +00:00
Matt Arsenault	fdc698b726	AMDGPU: Erase redundant redefs of m0 in SIFoldOperands Only handle simple inter-block redefs of m0 to the same value. This avoids interference from redefs of m0 in SILoadStoreOptimizer. I was initially teaching that pass to ignore redefs of m0, but having them not exist beforehand is much simpler. This is in preparation for deleting the current special m0 handling in SIFixSGPRCopies to allow the register coalescer to handle the difficult cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375449 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 19:53:46 +00:00
Matt Arsenault	ab2c9f7c79	AMDGPU: Stop adding m0 implicit def to SGPR spills r375293 removed the SGPR spilling with scalar stores path, so this is no longer necessary. This also always had the defect of adding the def even when this path wasn't in use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375448 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 19:42:29 +00:00
Stanislav Mekhanoshin	6e6c5e79c9	[AMDGPU] Select AGPR in PHI operand legalization If a PHI defines an AGPR, legalize its operands to AGPR. At the moment we can get an AGPR PHI with VGPR operands. I am not aware of any problems as it seems to be handled gracefully in RA, but this is not right anyway. It also slightly decreases VGPR pressure in some cases because we do not have to copy via VGPR. Differential Revision: https://reviews.llvm.org/D69206 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375446 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 19:25:27 +00:00
Sander de Smalen	f5e25f84fa	Reverted r375425 as it broke some buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375444 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 19:11:40 +00:00
Roman Lebedev	c0f32bd552	[NFC][CVP] Add `shl` no-wrap deduction test coverage git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375441 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 18:35:26 +00:00
Simon Pilgrim	aa0a5ffb56	[PowerPC] Regenerate test for D52431 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375435 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 17:45:51 +00:00
Sander de Smalen	4548d296c1	[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) Commit message from D66935: This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. This patch fixes the lldb unit tests in `functionalities/thread/concurrent_events/*` Changes after D66935: Ensures AArch64FunctionInfo::getCalleeSavedStackSize does not return the uninitialized CalleeSavedStackSize when running `llc` on a specific pass where the MIR code has already been expected to have gone through PEI. Instead, getCalleeSavedStackSize (when passed the MachineFrameInfo) will try to recalculate the CalleeSavedStackSize from the CalleeSavedInfo. In debug mode, the compiler will assert the recalculated size equals the cached size as calculated through a call to determineCalleeSaves. This fixes two tests: test/DebugInfo/AArch64/asan-stack-vars.mir test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir that otherwise fail when compiled using msan. Reviewed By: omjavaid, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68783 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375425 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 17:12:56 +00:00
Jay Foad	a3253c0261	Pre-commit test cases for D64713. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375418 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 15:01:59 +00:00
David Green	8c479efe73	[ARM] Extra qdadd patterns This adds some new qdadd patterns to go along with the other recently added qadd's. Differential Revision: https://reviews.llvm.org/D68999 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375414 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 14:06:49 +00:00
David Green	68b7d2e092	[ARM] Add qadd lowering from a sadd_sat This lowers a sadd_sat to a qadd by treating it as legal. Also adds qsub at the same time. The qadd instruction sets the q flag, but we already have many cases where we do not model this in llvm. Differential Revision: https://reviews.llvm.org/D68976 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375411 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 12:33:46 +00:00
George Rimar	cc870b19ca	[llvm/Object] - Make ELFObjectFile::getRelocatedSection return Expected<section_iterator> It currently returns just a section_iterator and has a report_fatal_error call inside. This change adds a way to return errors and handle them on caller sides. The patch also changes/improves current users and adds test cases. Differential revision: https://reviews.llvm.org/D69167 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375408 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 11:06:38 +00:00
George Rimar	685ae98920	[obj2yaml] - Fix a comment. NFC. I forgot to address this nit before committing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375405 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 10:40:43 +00:00
George Rimar	63b9649758	[obj2yaml] - Stop triggering UB when dumping corrupted strings. We have the following code to find the quote type: if (isspace(S.front()) || isspace(S.back())) ... The problem is that: "int isspace( int ch ): The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF." (https://en.cppreference.com/w/cpp/string/byte/isspace) This patch shows how this UB can be triggered and fixes the issue. Differential revision: https://reviews.llvm.org/D69160 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375404 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 10:38:03 +00:00
Sam Elliott	7fc698b06a	[MemCpyOpt] Fixing Incorrect Code Motion while Handling Aggregate Type Values Summary: When MemCpyOpt is handling aggregate type values, if an instruction (let's call it P) between the targeting load (L) and store (S) clobbers the source pointer of L, it will try to hoist S before P. This process will also hoist S's data dependency instructions. However, the current implementation has a bug that if one of S's dependency instructions is //also// a user of P, MemCpyOpt will not prevent it from being hoisted above P and cause a use-before-define error. For example, in the newly added test file (i.e. `aggregate-type-crash.ll`), it will try to hoist both `store %my_struct %1, %my_struct* %3` and its dependent, `%3 = bitcast i8* %2 to %my_struct`, above `%2 = call i8 @my_malloc(%my_struct* %0)`. Creating the following BB: ``` entry: %1 = bitcast i8* %4 to %my_struct* %2 = bitcast %my_struct* %1 to i8* %3 = bitcast %my_struct* %0 to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %2, i8* align 4 %3, i64 8, i1 false) %4 = call i8* @my_malloc(%my_struct* %0) ret void ``` Where there is a use-before-define error between `%1` and `%4`. Update: The compiler for the Pony Programming Language [also encounters the same bug](https://github.com/ponylang/ponyc/issues/3140) Patch by Min-Yih Hsu (myhsu) Reviewers: eugenis, pcc, dblaikie, dneilson, t.p.northover, lattner Reviewed By: eugenis Subscribers: lenary, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375403 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 10:00:34 +00:00
David Green	32893f0502	[ARM] Lower sadd_sat to qadd8 and qadd16 Lower the target independent signed saturating intrinsics to qadd8 and qadd16. This custom lowers them from a sadd_sat, catching the node early before it is promoted. It also adds a QADD8b and QADD16b node to mean the bottom "lane" of a qadd8/qadd16, so that we can call demand bits on it to show that it does not use the upper bits. Also handles QSUB8 and QSUB16. Differential Revision: https://reviews.llvm.org/D68974 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375402 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 09:53:38 +00:00
David Green	7f7fc30ded	[ARM] Add and adjust saturation tests for upcoming qadd changes. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375401 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 09:43:37 +00:00
Roman Lebedev	3ddad0b0b1	[CVP] Deduce no-wrap on `mul` Summary: `ConstantRange::makeGuaranteedNoWrapRegion()` knows how to deal with `mul` since rL335646, there is exhaustive test coverage. This is already used by CVP's `processOverflowIntrinsic()`, and by SCEV's `StrengthenNoWrapFlags()` That being said, currently, this doesn't help much in the end: \| statistic \| old \| new \| delta \| percentage \| \| correlated-value-propagation.NumMulNSW \| 4 \| 275 \| 271 \| 6775.00% \| \| correlated-value-propagation.NumMulNUW \| 4 \| 1323 \| 1319 \| 32975.00% \| \| correlated-value-propagation.NumMulNW \| 8 \| 1598 \| 1590 \| 19875.00% \| \| correlated-value-propagation.NumNSW \| 5715 \| 5986 \| 271 \| 4.74% \| \| correlated-value-propagation.NumNUW \| 9193 \| 10512 \| 1319 \| 14.35% \| \| correlated-value-propagation.NumNW \| 14908 \| 16498 \| 1590 \| 10.67% \| \| instcount.NumAddInst \| 275871 \| 275869 \| -2 \| 0.00% \| \| instcount.NumBrInst \| 708234 \| 708232 \| -2 \| 0.00% \| \| instcount.NumMulInst \| 43812 \| 43810 \| -2 \| 0.00% \| \| instcount.NumPHIInst \| 316786 \| 316784 \| -2 \| 0.00% \| \| instcount.NumTruncInst \| 62165 \| 62167 \| 2 \| 0.00% \| \| instcount.NumUDivInst \| 2528 \| 2526 \| -2 \| -0.08% \| \| instcount.TotalBlocks \| 842995 \| 842993 \| -2 \| 0.00% \| \| instcount.TotalInsts \| 7376486 \| 7376478 \| -8 \| 0.00% \| (^ test-suite plain, tests still pass) Reviewers: nikic, reames, luqmana, sanjoy, timshen Reviewed By: reames Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69203 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375396 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 08:21:44 +00:00
Piotr Sobczak	ec10cb25d9	[InstCombine] Allow values with multiple users in SimplifyDemandedVectorElts Summary: Allow for ignoring the check for a single use in SimplifyDemandedVectorElts to be able to simplify operands if DemandedElts is known to contain the union of elements used by all users. It is a responsibility of a caller of SimplifyDemandedVectorElts to supply correct DemandedElts. Simplify a series of extractelement instructions if only a subset of elements is used. Reviewers: reames, arsenm, majnemer, nhaehnle Reviewed By: nhaehnle Subscribers: wdng, jvesely, nhaehnle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67345 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375395 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 08:12:47 +00:00
Yevgeny Rouban	53fb41197c	[IR] Fix mayReadFromMemory() for writeonly calls Current implementation of Instruction::mayReadFromMemory() returns !doesNotAccessMemory() which is !ReadNone. This does not take into account that the writeonly attribute also indicates that the call does not read from memory. The patch changes the predicate to !doesNotReadMemory() that reflects the intended behavior. Differential Revision: https://reviews.llvm.org/D69086 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375389 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 06:52:08 +00:00
Yonghong Song	adc79ba5db	[BPF] fix indirect call assembly code Currently, for indirect call, the assembly code is printed out as callx <imm>. This is not right; it should be callx <reg>. Fixed the issue with proper format. Differential Revision: https://reviews.llvm.org/D69229 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375386 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 03:22:03 +00:00
Johannes Doerfert	463d8212b2	[Attributor] Teach AANoCapture to use information in-flight more aggressively AAReturnedValues, AAMemoryBehavior, and AANoUnwind can provide information that helps during the tracking or even justifies no-capture. We now use this information and enable no-capture in some test cases designed a long while ago for these cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375382 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-21 00:48:42 +00:00
Craig Topper	f5c1edb2b9	[X86] Check Subtarget.hasSSE3() before calling shouldUseHorizontalOp and emitting X86ISD::FHADD in LowerUINT_TO_FP_i64. This was a regression from r375341. Fixes PR43729. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375381 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 23:54:19 +00:00
Philip Reames	9efd72ad02	[IndVars] Eliminate loop exits with equivalent exit counts We can end up with two loop exits whose exit counts are equivalent, but whose textual representation is different and non-obvious. For the sub-case where we have a series of exits which dominate one another (common), eliminate any exits which would iterate after a previous exit on the exiting iteration. As noted in the TODO being removed, I'd always thought this was a good idea, but I've now seen this in a real workload as well. Interestingly, in review, Nikita pointed out there's yet another opportunity to leverage SCEV's reasoning. If we kept track of the min of dominating exits so far, we could discharge exits with EC >= MDE. This is less powerful than the existing transform (since later exits aren't considered), but potentially more powerful for any case where SCEV can prove a >= b, but neither a == b nor a > b. I don't have an example to illustrate that opportunity, but won't be surprised if we find one and return to handle that case as well. Differential Revision: https://reviews.llvm.org/D69009 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375379 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 23:38:02 +00:00
Roman Lebedev	b93a52f5fe	[InstCombine] conditional sign-extend of high-bit-extract: 'or' pattern. In this pattern, all the "magic" bits that we'd `add` are all high sign bits, and in the value we'd be adding to they are all unset, not unexpectedly, so we can have an `or` there: https://rise4fun.com/Alive/ups It is possible that `haveNoCommonBitsSet()` should be taught about this pattern so that we never have an `add` variant, but the reasoning would need to be recursive (because of that `select`), so i'm not really sure that would be worth it just yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375378 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 20:52:06 +00:00
Roman Lebedev	150b0bedb7	[NFC][InstCombine] conditional sign-extend of high-bit-extract: 'and' pat. can be 'or' pattern. In this pattern, all the "magic" bits that we'd add are all high sign bits, and in the value we'd be adding to they are all unset, not unexpectedly, so we can have an `or` there: https://rise4fun.com/Alive/ups git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375377 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 20:51:37 +00:00
Nikita Popov	e568120da3	[InstCombine] Fold uadd.sat(a, b) == 0 and usub.sat(a, b) == 0 This adds folds for comparing uadd.sat/usub.sat with zero: * uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0 * usub.sat(a, b) == 0 => a <= b And inverted forms for !=. Differential Revision: https://reviews.llvm.org/D69224 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375374 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 20:19:42 +00:00
Nikita Popov	3ab0cb15c0	[InstCombine] Add tests for uadd/sub.sat(a, b) == 0; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375372 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 19:50:31 +00:00
Roman Lebedev	32d24a3249	[InstCombine] Shift amount reassociation in shifty sign bit test (PR43595) Summary: This problem consists of several parts: * Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`. This is trivial, and easy to do, we have a fold for it. * Shift amount reassociation - if we have two identical shifts, and we can simplify-add their shift amounts together, then we likely can just perform them as a single shift. But this is finicky, has one-use restrictions, and shift opcodes must be identical. But there is a super-pattern where both of these work together to produce sign bit test from two shifts + comparison. We do indeed already handle this in most cases. But since we get that fold transitively, it has one-use restrictions. And what's worse, in this case the right-shifts aren't required to be identical, and we can't handle that transitively: If the total shift amount is bitwidth-1, only a sign bit will remain in the output value. But if we look at this from the perspective of two shifts, we can't fold - we can't possibly know what bit pattern we'd produce via two shifts, it will be some kind of a mask produced from original sign bit, but we just can't tell its shape: https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN But it will only contain sign bit and zeros. So from the perspective of sign bit test, we're good: https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU Superb! So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a pseudo-analysis mode that will ignore extra-uses, and will only check whether a) those are two right shifts and b) they end up with bitwidth(x)-1 shift amount and return either the original value that we are sign-checking, or null. This does not have any functionality change for the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`. All that being said, as discussed in the review, this yet again increases usage of instsimplify in instcombine as utility. Some day that may need to be reevaluated. https://bugs.llvm.org/show_bug.cgi?id=43595 Reviewers: spatel, efriedma, vsk Reviewed By: spatel Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68930 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375371 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 19:38:50 +00:00
Matt Arsenault	2df5f8ca5d	AMDGPU: Increase vcc liveness scan threshold Avoids a test regression in a future patch. Also add debug printing on this case, so I waste less time debugging folds in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375367 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 17:44:17 +00:00
Matt Arsenault	f154896069	AMDGPU: Split flat offsets that don't fit in DAG We handle it this way for some other address spaces. Since r349196, SILoadStoreOptimizer has been trying to do this. This is after SIFoldOperands runs, which can change the addressing patterns. It's simpler to just split this earlier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375366 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 17:34:44 +00:00
Matt Arsenault	2f75f81688	AMDGPU: Add baseline tests for flat offset splitting git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375364 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 16:33:21 +00:00
George Rimar	38d4353156	[yaml2obj][obj2yaml] - Do not create a symbol table by default. This patch tries to resolve problems faced in D68943 and uses some of the code written by Konrad Wilhelm Kleine in that patch. Previously, yaml2obj tool always created a .symtab section. This patch changes that. With it we only create it when we have a "Symbols:" tag in the YAML document or when we need to create it because it is used by another section(s). obj2yaml follows the new behavior and does not print "Symbols:" anymore when there is no symbol table. Differential revision: https://reviews.llvm.org/D69041 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375361 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 14:47:17 +00:00
Matt Arsenault	b0113baebf	AMDGPU: Don't error on calls to null or undef Calls to constants should probably be generally handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375356 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 07:46:04 +00:00
Philip Reames	c624032000	[SCEV] Simplify umin/max of zext and sext of the same value This is a common idiom which arises after induction variables are widened, and we have two or more exit conditions. Interestingly, we don't have instcombine or instsimplify support for this either. Differential Revision: https://reviews.llvm.org/D69006 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375349 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-19 17:23:02 +00:00
Sanjay Patel	0d04cbb578	[TargetLowering][DAGCombine][MSP430] add/use hook for Shift Amount Threshold (1/2) Provides a TLI hook to allow targets to relax the emission of shifts, thus enabling codegen improvements on targets with no multiple shift instructions and cheap selects or branches. Contributes to a Fix for PR43559: https://bugs.llvm.org/show_bug.cgi?id=43559 Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69116 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375347 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-19 16:57:02 +00:00
Sanjay Patel	896dde46ee	[MSP430] Shift Amount Threshold in DAGCombine (Baseline Tests); NFC Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69099 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375345 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-19 16:29:32 +00:00
Simon Pilgrim	200d59f918	[X86][SSE] lowerV16I8Shuffle - tryToWidenViaDuplication - undef unpack args tryToWidenViaDuplication lowers using the shuffle_v8i16(unpack_v16i8(shuffle_v8i16(x),shuffle_v8i16(x))) pattern, but the unpack only needs the even/odd 16i8 args if the original v16i8 shuffle mask references the even/odd elements - which isn't true for many extension style shuffles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375342 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-19 13:18:02 +00:00
Simon Pilgrim	39580be1ee	[X86][SSE] LowerUINT_TO_FP_i64 - only use HADDPD for size/fast-hops We were always generating a single source HADDPD, but really we should only do this if shouldUseHorizontalOp says it's a good idea. Differential Revision: https://reviews.llvm.org/D69175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375341 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-19 11:53:48 +00:00
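
Note on e568120da3 (uadd.sat/usub.sat compared with zero): the two folds in that commit message can be checked exhaustively on a small bit width. The sketch below is not LLVM code; `uadd_sat8`, `usub_sat8`, and `zero_compare_folds_hold` are hypothetical helper names that model the saturating intrinsics on 8-bit values and test both equivalences over all 65536 input pairs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of llvm.uadd.sat on i8: clamp the sum at 255.
inline std::uint8_t uadd_sat8(std::uint8_t a, std::uint8_t b) {
    unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return static_cast<std::uint8_t>(sum > 0xFFu ? 0xFFu : sum);
}

// Hypothetical model of llvm.usub.sat on i8: clamp the difference at 0.
inline std::uint8_t usub_sat8(std::uint8_t a, std::uint8_t b) {
    return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

// Returns true if both folds from the commit message hold for every i8 pair:
//   uadd.sat(a, b) == 0  <=>  (a | b) == 0
//   usub.sat(a, b) == 0  <=>  a <= b
inline bool zero_compare_folds_hold() {
    for (unsigned i = 0; i <= 0xFFu; ++i) {
        for (unsigned j = 0; j <= 0xFFu; ++j) {
            const auto a = static_cast<std::uint8_t>(i);
            const auto b = static_cast<std::uint8_t>(j);
            if ((uadd_sat8(a, b) == 0) != ((a | b) == 0)) return false;
            if ((usub_sat8(a, b) == 0) != (a <= b)) return false;
        }
    }
    return true;
}
```

The inverted `!=` forms follow by negating both sides of each equivalence, so they need no separate check.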

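Note on 63b9649758 (isspace UB in obj2yaml): the commit message quotes the rule that `isspace` has undefined behavior when its argument is not representable as `unsigned char` and is not EOF. A common way to stay within that rule is to convert the `char` to `unsigned char` before the call. The sketch below illustrates that idiom only; it is not the code from the patch, and `isSpaceSafe` and `hasEdgeWhitespace` are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical wrapper: converting to unsigned char first keeps the argument
// in the range std::isspace is defined for, even when plain char is signed
// and the byte is >= 0x80 (as in a corrupted string).
inline bool isSpaceSafe(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical stand-in for the quoted check
//   isspace(S.front()) || isspace(S.back())
// with the empty-string case guarded explicitly.
inline bool hasEdgeWhitespace(const std::string &s) {
    return !s.empty() && (isSpaceSafe(s.front()) || isSpaceSafe(s.back()));
}
```

With this wrapper, a string consisting of a single 0xFF byte is classified without passing a negative value to `isspace`.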

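Note on 2c4ca6832f (signed saturation patterns): the commit describes source code that performs signed saturating addition by casting to a higher type and clamping, that is, min(max(add a b)) over extended operands. The sketch below shows that source-level idiom on 8-bit values and compares it with a direct saturating add. It is an illustration under assumed names (`sadd_sat_widened`, `sadd_sat_reference`, `widened_matches_reference`), not the InstCombine matcher itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// The idiom the matcher targets: widen, add, clamp to the narrow range, narrow.
inline std::int8_t sadd_sat_widened(std::int8_t a, std::int8_t b) {
    int wide = static_cast<int>(a) + static_cast<int>(b);  // cannot overflow int
    wide = std::min(std::max(wide, static_cast<int>(INT8_MIN)),
                    static_cast<int>(INT8_MAX));
    return static_cast<std::int8_t>(wide);
}

// Direct model of a signed saturating add on i8, written with explicit
// overflow tests instead of a wider intermediate result.
inline std::int8_t sadd_sat_reference(std::int8_t a, std::int8_t b) {
    if (b > 0 && a > INT8_MAX - b) return INT8_MAX;
    if (b < 0 && a < INT8_MIN - b) return INT8_MIN;
    return static_cast<std::int8_t>(a + b);
}

// Returns true if the two formulations agree on every pair of i8 inputs.
inline bool widened_matches_reference() {
    for (int a = INT8_MIN; a <= INT8_MAX; ++a)
        for (int b = INT8_MIN; b <= INT8_MAX; ++b)
            if (sadd_sat_widened(static_cast<std::int8_t>(a),
                                 static_cast<std::int8_t>(b)) !=
                sadd_sat_reference(static_cast<std::int8_t>(a),
                                   static_cast<std::int8_t>(b)))
                return false;
    return true;
}
```

The agreement of the two functions over the full input range is why a compiler may treat the widened form as a single saturating operation.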
65934 Commits