RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-07-01 21:04:04 -04:00

Author	SHA1	Message	Date
Pavel Labath	146f4968ec	MinidumpYAML: Add support for the memory info list stream Summary: The implementation is fairly straight-forward and uses the same patterns as the existing streams. The yaml form does not attempt to preserve the data in the "gaps" that can be created by setting a larger-than-required header or entry size in the stream header, because the existing consumer (lldb) does not make use of the information in the gap in any way, and attempting to preserve that would make the implementation more complicated. Reviewers: amccarth, jhenderson, clayborg Subscribers: llvm-commits, lldb-commits, markmentovai, zturner, JosephTremoulet Tags: #llvm Differential Revision: https://reviews.llvm.org/D68645 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374337 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 13:05:46 +00:00
David Green	18330b6155	[ARM] VQADD instructions This selects MVE VQADD from the vector llvm.sadd.sat or llvm.uadd.sat intrinsics. Differential Revision: https://reviews.llvm.org/D68566 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374336 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 13:05:04 +00:00
Sanjay Patel	467596061a	[AArch64][x86] add tests for (v)select bit magic; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374334 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 12:53:24 +00:00
Mirko Brkusanin	0472427e26	[Mips] Fix 374055 EXPENSIVE_CHECKS build was failing on new test. This is fixed by marking $ra register as undef. Test now has -verify-machineinstrs to check for operand flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374320 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 12:02:14 +00:00
Thomas Preud'homme	434ca7ad29	[test] Use system locale for mri-utf8.test Summary: llvm-ar's mri-utf8.test test relies on the en_US.UTF-8 locale to be installed for its last RUN line to work. If not installed, the unicode string gets encoded (interpreted) as ascii which fails since the most significant byte is non zero. This commit changes the test to only rely on the system being able to encode the pound sign in its default encoding (e.g. UTF-16 for Microsoft Windows) by always opening the file via input/output redirection. This avoids forcing a given locale to be present and supported. A Byte Order Mark is also added to help recognizing the encoding of the file and its endianness. Reviewers: gbreynoo, MaskRay, rupprecht, JamesNagurne, jfb Subscribers: dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68472 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374318 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 11:48:30 +00:00
Oliver Stannard	65b47b2f6b	[IfCvt][ARM] Optimise diamond if-conversion for code size Currently, the heuristics the if-conversion pass uses for diamond if-conversion are based on execution time, with no consideration for code size. This adds a new set of heuristics to be used when optimising for code size. This is mostly target-independent, because the if-conversion pass can see the code size of the instructions which it is removing. For thumb, there are a few passes (insertion of IT instructions, selection of narrow branches, and selection of CBZ instructions) which are run after if conversion and affect these heuristics, so I've added target hooks to better predict the code-size effect of a proposed if-conversion. Differential revision: https://reviews.llvm.org/D67350 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374301 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 09:58:28 +00:00
Matt Arsenault	45cd9c275c	AMDGPU: Use SGPR_128 instead of SReg_128 for vregs SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds the additional non-allocatable TTMP registers. There's no point in allocating SReg_128 vregs. This shrinks the size of the classes regalloc needs to consider, which is usually good. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374284 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 07:11:33 +00:00
Craig Topper	9590c4473f	[X86] Add test case for trunc_packus_v16i32_v16i8 with avx512vl+avx512bw and prefer-vector-width=256 and min-legal-vector-width=256. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374283 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 06:25:00 +00:00
Johannes Doerfert	794c434225	[Attributor] Handle `null` differently in capture and alias logic Summary: `null` in the default address space (=AS 0) cannot be captured nor can it alias anything. We make this clear now as it can be important for callbacks and other cases later on. In addition, this patch improves the debug output for noalias deduction. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68624 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374280 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 05:33:21 +00:00
Chen Zheng	910a8c5515	[PowerPC] add testcase for ppc loop instr form prep - NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374273 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 03:00:15 +00:00
Reid Kleckner	d399c97984	[codeview] Try to avoid emitting .cv_loc with line zero Summary: Visual Studio doesn't like it while stepping. It kicks you out of the source view of the file being stepped through and tries to fall back to the disassembly view. Fixes PR43530 The fix is incomplete, because it's possible to have a basic block with no source locations at all. In this case, we don't emit a .cv_loc, but that will result in wrong stepping behavior in the debugger if the layout predecessor of the location-less BB has an unrelated source location. We could try harder to find a valid location that dominates or post-dominates the current BB, but in general it's a dataflow problem, and one still might not exist. I left a FIXME about this. As an alternative, we might want to consider having the middle-end check if its emitting codeview and get it to stop using line zero. Reviewers: akhuang Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68747 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374267 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-10 01:06:01 +00:00
Thomas Lively	9d0b7a9236	[WebAssembly] Fix tests missed in rL374235 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374259 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 23:06:38 +00:00
Matt Arsenault	41cd9eb045	AMDGPU: Don't fold copies to physregs In a future patch, this will help clean up m0 handling. The register coalescer handles copies from a register that materializes an immediate, but doesn't handle move immediates itself. The virtual register uses will often be allocated to the same register, so there ends up being no real copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374257 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 22:51:42 +00:00
Matt Arsenault	4f540a9a67	AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer This was ignoring the register bank of the input pointer, and isUniformMMO seems overly aggressive. This will now conservatively assume a VGPR in cases where the incoming bank hasn't been determined yet (i.e. is from a loop phi). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374255 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 22:44:49 +00:00
Matt Arsenault	64f5ca7e80	GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374252 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 22:44:43 +00:00
Stanislav Mekhanoshin	7d6a3303df	[AMDGPU] Fixed dpp combine of VOP1 If original instruction did not have source modifiers they were not added to the new DPP instruction as well, even if needed. Differential Revision: https://reviews.llvm.org/D68729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374241 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 22:02:58 +00:00
Cameron McInally	153c5a24ad	[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. Differential Revision: https://reviews.llvm.org/D61675 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374240 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 21:52:15 +00:00
Thomas Lively	44cde01a89	[WebAssembly] Make returns variadic Summary: This is necessary and sufficient to get simple cases of multiple return working with multivalue enabled. More complex cases will require block and loop signatures to be generalized to potentially be type indices as well. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68684 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374235 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 21:42:08 +00:00
Wei Mi	d37615673f	[SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if there is no function showing up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time. CompactBinary format profile has already had the support of loading function profiles on demand. In this patch, we add the support to load profile on demand for ExtBinary format. It will work no matter the sections in ExtBinary format profile are compressed or not. Experiment shows it reduces the time to compile a server benchmark by 30%. When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the loading on demand process will take the name remapping into consideration. It will be addressed in a follow-up patch. Differential Revision: https://reviews.llvm.org/D68601 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374233 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 21:36:03 +00:00
David Blaikie	230cf52e6e	llvm-dwarfdump: Support multiple debug_loclists contributions Also fixing the incorrect "offset" field being computed/printed for each location list. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374232 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 21:25:28 +00:00
Sanjay Patel	f6329e2f94	[ConstProp] add tests for extractelement with undef index; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374210 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 20:14:17 +00:00
Sanjay Patel	3572515419	[InstCombine] add another test for gep inbounds; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374190 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 17:52:26 +00:00
Thomas Lively	8a11fbf479	[WebAssembly] Add builtin and intrinsic for v8x16.swizzle Summary: This clang builtin and corresponding LLVM intrinsic are necessary to expose the exact semantics of the underlying WebAssembly instruction to users. LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. Users who depend on this behavior can safely use this builtin. Depends on D68527. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68531 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374189 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 17:45:47 +00:00
Thomas Lively	4a07687da6	[WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering Summary: Adds the new v8x16.swizzle SIMD instruction as specified at https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#swizzling-using-variable-indices. In addition to adding swizzles as a candidate lowering in LowerBUILD_VECTOR, also rewrites and simplifies the lowering to minimize the number of replace_lanes necessary rather than trying to minimize code size. This leads to more uses of v128.const instead of splats, which is expected to increase performance. The new code will be easier to tune once V8 implements all the vector construction operations, and it will also be easier to add new candidate instructions in the future if necessary. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68527 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374188 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 17:39:19 +00:00
Kevin P. Neal	f62ff89b28	[FPEnv][NFC] Change test to conform to strictfp attribute rules. In particular, the function definition is not marked strictfp despite containing a function marked strictfp. Also, if any function call is marked strictfp then all function calls in that function must be marked. This change to move the one strictfp call to a new properly marked function meets all the new rules. Tested with a stricter version of D68233. Reviewed by: spatel Approved by: spatel Differential Revision: https://reviews.llvm.org/D68713 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374186 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 17:24:56 +00:00
Sanjay Patel	006958b5ff	[SLP] respect target register width for GEP vectorization (PR43578) We failed to account for the target register width (max vector factor) when vectorizing starting from GEPs. This causes vectorization to proceed to obviously illegal widths as in: https://bugs.llvm.org/show_bug.cgi?id=43578 For x86, this also means that SLP can produce rogue AVX or AVX512 code even when the user specifies a narrower vector width. The AArch64 test in ext-trunc.ll appears to be better using the narrower width. I'm not exactly sure what getelementptr.ll is trying to do, but it's testing with "-slp-threshold=-18", so I'm not worried about those diffs. The x86 test is an over-reduction from SPEC h264; this patch appears to restore the perf loss caused by SLP when using -march=haswell. Differential Revision: https://reviews.llvm.org/D68667 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374183 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 16:32:49 +00:00
Momchil Velikov	f4f16f170b	[AArch64] Ensure no tagged memory is left in the unallocated portion of the stack This patch makes sure that if we tag some memory, we untag that memory before the function returns/throws via any exit, reachable from the tag operation. For that we place the untag operation either at: a) the lifetime end call for the alloca, if that call post-dominates the lifetime start call (where the tag operation is placed), or it (the lifetime end call) dominates all reachable exits, otherwise b) at the reachable exits Differential Revision: https://reviews.llvm.org/D68469 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374182 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 16:31:50 +00:00
Jonas Devlieghere	f4d80f04c0	Re-land "[dsymutil] Fix handling of common symbols in multiple object files." The original patch got reverted because it hit a long-standing legacy issue on Windows that prevents files from being named `com`. Thanks Kristina & Jeremy for pointing this out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374178 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 16:19:13 +00:00
Alina Sbirlea	83774c7e99	[MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. Summary: The rule for the moveAllAfterMergeBlocks API is for all instructions from `From` to have been moved to `To`, while keeping the CFG edges (and block terminators) unchanged. Update all the callsites for moveAllAfterMergeBlocks to follow this. Pending follow-up: since the same behavior is needed every time, merge all callsites into one. The common denominator may be the call to `MergeBlockIntoPredecessor`. Resolves PR43569. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68659 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374177 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 15:54:24 +00:00
David Green	4086525278	Add and adjust saturating tests. NFC This adds some extra testing to the existing [su][add/sub]_sat X86 and AArch64 tests and adds equivalent tests for ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374169 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 14:17:38 +00:00
Sjoerd Meijer	0625648970	[LV] Emitting SCEV checks with OptForSize When optimising for size and SCEV runtime checks need to be emitted to check overflow behaviour, the loop vectorizer can run into this assert: LoopVectorize.cpp:2699: void llvm::InnerLoopVectorizer::emitSCEVChecks(llvm::Loop *, llvm::BasicBlock *): Assertion `!BB->getParent()->hasOptSize() && "Cannot SCEV check stride or overflow when opt We should not generate predicates while optimising for size because code will be generated for predicates such as these SCEV overflow runtime checks. This should fix PR43371. Differential Revision: https://reviews.llvm.org/D68082 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374166 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 13:19:41 +00:00
Simon Pilgrim	61fe65b495	[CostModel][X86] Add tests for insertelement to non-immediate vector element indices git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374161 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 12:36:34 +00:00
Simon Pilgrim	3a6057eae9	[CostModel][X86] Add tests for extractelement from non-immediate vector element indices git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374160 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 12:36:22 +00:00
David Green	18faf2fc64	[ARM] Add saturating arithmetic tests for MVE. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374159 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 12:29:51 +00:00
James Molloy	102de6b935	[TableGen] Fix crash when using HwModes in CodeEmitterGen When an instruction has an encoding definition for only a subset of the available HwModes, ensure we just avoid generating an encoding rather than crash. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374150 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 09:15:34 +00:00
Clement Courbet	bf17945077	[llvm-exegesis] Explore LEA addressing modes. Summary: This will help for PR32326. This shows the well-known issue with `RBP` and `R13` as base registers. Reviewers: gchatelet Subscribers: tschuett, llvm-commits, RKSimon, andreadb Tags: #llvm Differential Revision: https://reviews.llvm.org/D68646 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374146 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 08:49:13 +00:00
Jeremy Morse	33c4130e26	Revert r374139, "[dsymutil] Fix handling of common symbols in multiple object files." The added test files ("com", "com1.o", "com2.o") are reserved names on Windows, and make 'git checkout' fail with a filesystem error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374144 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 08:27:48 +00:00
Jonas Devlieghere	74eabf1da0	[dsymutil] Fix handling of common symbols in multiple object files. For common symbols the linker emits only a single symbol entry in the debug map. This caused dsymutil to not relocate common symbols when linking DWARF coming from object files that did not have this entry. This patch fixes that by keeping track of common symbols in the object files and synthesizing a debug map entry for them using the address from the main binary. Differential revision: https://reviews.llvm.org/D68680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374139 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-09 04:16:18 +00:00
Bill Wendling	98dec20e19	[IA] Add tests for a few other edge cases Test with the last eight bits within the range [7F, FF] and with lower-case hex letters. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374124 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 22:06:09 +00:00
Jonas Devlieghere	eaa022e285	[dsymutil] Improve verbose output (NFC) The verbose output for finding relocations assumed that we'd always dump the DIE after (which starts with a newline) and therefore didn't include one itself. However, this isn't always true, leading to garbled output. This patch adds a newline to the verbose output and adds a line that says that the DIE is being kept (which isn't obvious otherwise). It also adds a 0x prefix to the relocations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374123 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 22:03:13 +00:00
Roman Lebedev	43122072b2	[CVP} Replace SExt with ZExt if the input is known-non-negative Summary: zero-extension is far more friendly for further analysis. While this doesn't directly help with the shift-by-signext problem, this is not unrelated. This has the following effect on test-suite (numbers collected after the finish of middle-end module pass manager):

| Statistic | old | new | delta | percent change |
| correlated-value-propagation.NumSExt | 0 | 6026 | 6026 | +100.00% |
| instcount.NumAddInst | 272860 | 271283 | -1577 | -0.58% |
| instcount.NumAllocaInst | 27227 | 27226 | -1 | 0.00% |
| instcount.NumAndInst | 63502 | 63320 | -182 | -0.29% |
| instcount.NumAShrInst | 13498 | 13407 | -91 | -0.67% |
| instcount.NumAtomicCmpXchgInst | 1159 | 1159 | 0 | 0.00% |
| instcount.NumAtomicRMWInst | 5036 | 5036 | 0 | 0.00% |
| instcount.NumBitCastInst | 672482 | 672353 | -129 | -0.02% |
| instcount.NumBrInst | 702768 | 702195 | -573 | -0.08% |
| instcount.NumCallInst | 518285 | 518205 | -80 | -0.02% |
| instcount.NumExtractElementInst | 18481 | 18482 | 1 | 0.01% |
| instcount.NumExtractValueInst | 18290 | 18288 | -2 | -0.01% |
| instcount.NumFAddInst | 139035 | 138963 | -72 | -0.05% |
| instcount.NumFCmpInst | 10358 | 10348 | -10 | -0.10% |
| instcount.NumFDivInst | 30310 | 30302 | -8 | -0.03% |
| instcount.NumFenceInst | 387 | 387 | 0 | 0.00% |
| instcount.NumFMulInst | 93873 | 93806 | -67 | -0.07% |
| instcount.NumFPExtInst | 7148 | 7144 | -4 | -0.06% |
| instcount.NumFPToSIInst | 2823 | 2838 | 15 | 0.53% |
| instcount.NumFPToUIInst | 1251 | 1251 | 0 | 0.00% |
| instcount.NumFPTruncInst | 2195 | 2191 | -4 | -0.18% |
| instcount.NumFSubInst | 92109 | 92103 | -6 | -0.01% |
| instcount.NumGetElementPtrInst | 1221423 | 1219157 | -2266 | -0.19% |
| instcount.NumICmpInst | 479140 | 478929 | -211 | -0.04% |
| instcount.NumIndirectBrInst | 2 | 2 | 0 | 0.00% |
| instcount.NumInsertElementInst | 66089 | 66094 | 5 | 0.01% |
| instcount.NumInsertValueInst | 2032 | 2030 | -2 | -0.10% |
| instcount.NumIntToPtrInst | 19641 | 19641 | 0 | 0.00% |
| instcount.NumInvokeInst | 21789 | 21788 | -1 | 0.00% |
| instcount.NumLandingPadInst | 12051 | 12051 | 0 | 0.00% |
| instcount.NumLoadInst | 880079 | 878673 | -1406 | -0.16% |
| instcount.NumLShrInst | 25919 | 25921 | 2 | 0.01% |
| instcount.NumMulInst | 42416 | 42417 | 1 | 0.00% |
| instcount.NumOrInst | 100826 | 100576 | -250 | -0.25% |
| instcount.NumPHIInst | 315118 | 314092 | -1026 | -0.33% |
| instcount.NumPtrToIntInst | 15933 | 15939 | 6 | 0.04% |
| instcount.NumResumeInst | 2156 | 2156 | 0 | 0.00% |
| instcount.NumRetInst | 84485 | 84484 | -1 | 0.00% |
| instcount.NumSDivInst | 8599 | 8597 | -2 | -0.02% |
| instcount.NumSelectInst | 45577 | 45913 | 336 | 0.74% |
| instcount.NumSExtInst | 84026 | 78278 | -5748 | -6.84% |
| instcount.NumShlInst | 39796 | 39726 | -70 | -0.18% |
| instcount.NumShuffleVectorInst | 100272 | 100292 | 20 | 0.02% |
| instcount.NumSIToFPInst | 29131 | 29113 | -18 | -0.06% |
| instcount.NumSRemInst | 1543 | 1543 | 0 | 0.00% |
| instcount.NumStoreInst | 805394 | 804351 | -1043 | -0.13% |
| instcount.NumSubInst | 61337 | 61414 | 77 | 0.13% |
| instcount.NumSwitchInst | 8527 | 8524 | -3 | -0.04% |
| instcount.NumTruncInst | 60523 | 60484 | -39 | -0.06% |
| instcount.NumUDivInst | 2381 | 2381 | 0 | 0.00% |
| instcount.NumUIToFPInst | 5549 | 5549 | 0 | 0.00% |
| instcount.NumUnreachableInst | 9855 | 9855 | 0 | 0.00% |
| instcount.NumURemInst | 1305 | 1305 | 0 | 0.00% |
| instcount.NumXorInst | 10230 | 10081 | -149 | -1.46% |
| instcount.NumZExtInst | 60353 | 66840 | 6487 | 10.75% |
| instcount.TotalBlocks | 829582 | 829004 | -578 | -0.07% |
| instcount.TotalFuncs | 83818 | 83817 | -1 | 0.00% |
| instcount.TotalInsts | 7316574 | 7308483 | -8091 | -0.11% |

TLDR: we produce -0.11% less instructions, -6.84% less `sext`, +10.75% more `zext`. To be noted, clearly, not all new `zext`'s are produced by this fold. (And now i guess it might have been interesting to measure this for D68103 :S) Reviewers: nikic, spatel, reames, dberlin Reviewed By: nikic Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68654 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374112 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 20:29:48 +00:00
Roman Lebedev	e2796a6d06	[CVP][NFC] Revisit sext vs. zext test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374111 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 20:29:36 +00:00
Yonghong Song	7be5366d0d	[BPF] do compile-once run-everywhere relocation for bitfields

A bpf specific clang intrinsic is introduced:

    u32 __builtin_preserve_field_info(member_access, info_kind)

Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations.

The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are

    enum {
      FIELD_BYTE_OFFSET = 0,
      FIELD_BYTE_SIZE,
      FIELD_EXISTENCE,
      FIELD_SIGNEDNESS,
      FIELD_LSHIFT_U64,
      FIELD_RSHIFT_U64,
    };

for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0.

An example:

    struct s {
      int a;
      int b1:9;
      int b2:4;
    };
    enum {
      FIELD_BYTE_OFFSET = 0,
      FIELD_BYTE_SIZE,
      FIELD_EXISTENCE,
      FIELD_SIGNEDNESS,
      FIELD_LSHIFT_U64,
      FIELD_RSHIFT_U64,
    };

    void bpf_probe_read(void *, unsigned, const void *);
    int field_read(struct s *arg) {
      unsigned long long ull = 0;
      unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET);
      unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE);
    #ifdef USE_PROBE_READ
      bpf_probe_read(&ull, size, (const void *)arg + offset);
      unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
    #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
      lshift = lshift + (size << 3) - 64;
    #endif
    #else
      switch(size) {
      case 1:
        ull = *(unsigned char *)((void *)arg + offset); break;
      case 2:
        ull = *(unsigned short *)((void *)arg + offset); break;
      case 4:
        ull = *(unsigned int *)((void *)arg + offset); break;
      case 8:
        ull = *(unsigned long long *)((void *)arg + offset); break;
      }
      unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
    #endif
      ull <<= lshift;
      if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS))
        return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
      return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
    }

There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode:

    r3 = r1
    r1 = 0
    r1 = 4 <=== relocation (FIELD_BYTE_OFFSET)
    r3 += r1
    r1 = r10
    r1 += -8
    r2 = 4 <=== relocation (FIELD_BYTE_SIZE)
    call bpf_probe_read
    r2 = 51 <=== relocation (FIELD_LSHIFT_U64)
    r1 = *(u64 *)(r10 - 8)
    r1 <<= r2
    r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
    r0 = r1
    r0 >>= r2
    r3 = 1 <=== relocation (FIELD_SIGNEDNESS)
    if r3 == 0 goto LBB0_2
    r1 s>>= r2
    r0 = r1
    LBB0_2:
    exit

Compared to the above code between relocations FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64, the code with big endian mode has four more instructions.

    r1 = 41 <=== relocation (FIELD_LSHIFT_U64)
    r6 += r1
    r6 += -64
    r6 <<= 32
    r6 >>= 32
    r1 = *(u64 *)(r10 - 8)
    r1 <<= r6
    r2 = 60 <=== relocation (FIELD_RSHIFT_U64)

The code and relocation generated when using direct load.

    r2 = 0
    r3 = 4
    r4 = 4
    if r4 s> 3 goto LBB0_3
    if r4 == 1 goto LBB0_5
    if r4 == 2 goto LBB0_6
    goto LBB0_9
    LBB0_6: # %sw.bb1
    r1 += r3
    r2 = *(u16 *)(r1 + 0)
    goto LBB0_9
    LBB0_3: # %entry
    if r4 == 4 goto LBB0_7
    if r4 == 8 goto LBB0_8
    goto LBB0_9
    LBB0_8: # %sw.bb9
    r1 += r3
    r2 = *(u64 *)(r1 + 0)
    goto LBB0_9
    LBB0_5: # %sw.bb
    r1 += r3
    r2 = *(u8 *)(r1 + 0)
    goto LBB0_9
    LBB0_7: # %sw.bb5
    r1 += r3
    r2 = *(u32 *)(r1 + 0)
    LBB0_9: # %sw.epilog
    r1 = 51
    r2 <<= r1
    r1 = 60
    r0 = r2
    r0 >>= r1
    r3 = 1
    if r3 == 0 goto LBB0_11
    r2 s>>= r1
    r0 = r2
    LBB0_11: # %sw.epilog
    exit

Considering that the verifier is able to do limited constant propagation following branches, the following is the code actually traversed.

    r2 = 0
    r3 = 4 <=== relocation
    r4 = 4 <=== relocation
    if r4 s> 3 goto LBB0_3
    LBB0_3: # %entry
    if r4 == 4 goto LBB0_7
    LBB0_7: # %sw.bb5
    r1 += r3
    r2 = *(u32 *)(r1 + 0)
    LBB0_9: # %sw.epilog
    r1 = 51 <=== relocation
    r2 <<= r1
    r1 = 60 <=== relocation
    r0 = r2
    r0 >>= r1
    r3 = 1
    if r3 == 0 goto LBB0_11
    r2 s>>= r1
    r0 = r2
    LBB0_11: # %sw.epilog
    exit

For native load case, the load size is calculated to be the same as the size of load width LLVM otherwise used to load the value which is then used to extract the bitfield value. Differential Revision: https://reviews.llvm.org/D67980 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374099 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 18:23:17 +00:00
Matt Arsenault	53208ba3c0	AMDGPU: Fix i16 arithmetic pattern redundancy There were 2 problems here. First, these patterns were duplicated to handle the inverted shift operands instead of using the commuted PatFrags. Second, the point of the zext folding patterns doesn't apply to the non-0ing high subtargets. They should be skipped instead of inserting the extension. The zeroing high code would be emitted when necessary anyway. This was also emitting unnecessary zexts in cases where the high bits were undefined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374092 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 17:36:38 +00:00
Jinsong Ji	cf65f7210c	Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374091 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 17:32:56 +00:00
Sanjay Patel	d698704619	[SLP] add test with prefer-vector-width function attribute; NFC (PR43578) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374090 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 17:18:32 +00:00
Tom Stellard	80fd3bf4c9	AMDGPU: Add offsets to MMO when lowering buffer intrinsics Summary: Without offsets on the MachineMemOperands (MMOs), MachineInstr::mayAlias() will return true for all reads and writes to the same resource descriptor. This leads to O(N^2) complexity in the MachineScheduler when analyzing dependencies of buffer loads and stores. It also limits the SILoadStoreOptimizer from merging more instructions. This patch reduces the compile time of one pathological compute shader from 12 seconds to 1 second. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65097 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374087 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 17:04:51 +00:00
Hideto Ueno	03c92d2e97	[Attributor][Fix] Temporary fix for windows build bot failure D65402 causes test failure related to attributor-max-iterations. This commit removes attributor-max-iterations-verify for now. I'll examine the factor and the flag should be reverted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374086 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 17:01:56 +00:00
Roman Lebedev	1f32e45f41	[NFC][CVP] Add tests where we can replace sext with zext If the sign bit of the value that is being sign-extended is not set, i.e. the value is non-negative (s>= 0), then zero-extension will suffice, and is better for analysis: https://rise4fun.com/Alive/a8PD git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374075 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 16:21:13 +00:00
Amaury Sechet	265327e7cf	(Re)generate various tests. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374074 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-08 16:16:26 +00:00
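
The CVP commits listed above (43122072b2 and 1f32e45f41) rest on the observation that, when the sign bit of the narrow value is clear, sign-extension and zero-extension produce the same result. The following is a minimal Python sketch of that equivalence only, not of the LLVM pass itself; the helper names `sign_extend`, `zero_extend`, and `extensions_agree` are hypothetical and exist purely for illustration.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Model of sext: widen a from_bits pattern by replicating its sign bit."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):
        # Sign bit set: fill every bit above from_bits with ones.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Model of zext: widen a from_bits pattern by filling upper bits with zeros."""
    return value & ((1 << from_bits) - 1)


def extensions_agree(value: int, from_bits: int = 8, to_bits: int = 32) -> bool:
    """True when sext and zext give the same widened bit pattern."""
    return sign_extend(value, from_bits, to_bits) == zero_extend(value, from_bits, to_bits)


# Exhaustive check over all 8-bit patterns: the two extensions agree
# exactly for the patterns whose sign bit is clear (0..127).
agreeing = [v for v in range(256) if extensions_agree(v)]
assert agreeing == list(range(128))
```

The exhaustive loop is only practical because the sketch uses an 8-bit source width; the commit message itself points to an Alive proof (https://rise4fun.com/Alive/a8PD) for the general case.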

65934 Commits