llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-28 00:07:22 +00:00

Author	SHA1	Message	Date
Igor Kudrin	2e7956d059	[DWARF] Add support for 64-bit DWARF in .debug_names. Differential Revision: https://reviews.llvm.org/D72900	2020-01-31 16:12:35 +07:00
Sebastian Neubauer	c93b68ebf4	Fix typo Summary: Fix typo Subscribers: jvesely, nhaehnle, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73458	2020-01-31 08:48:22 +01:00
Jonas Devlieghere	b9974cc091	[SmallString] Use data() instead of begin() (NFC) Both begin() and data() do the same thing for the SmallString case, but the std::string and llvm::StringRef constructors that are being called are defined as taking a pointer and size. Addresses Craig Topper's feedback in https://reviews.llvm.org/D73640	2020-01-30 20:15:38 -08:00
Quentin Colombet	90b879f9d1	[GISel][KnownBits] Fix a bug where we could run out of stack space One of the exit criteria of computeKnownBits is whether we reach the max recursive call depth. Before this patch we would check that the depth is exactly equal to max depth to exit. Depth may get bigger than max depth if it gets passed to a different GISelKnownBits object. This may happen when, say, a generic part uses a GISelKnownBits object with some max depth, but then we hit TL.computeKnownBitsForTargetInstr, which creates a new GISelKnownBits object with a different and smaller depth. In that situation, when we hit the max depth check for the first time in the target-specific GISelKnownBits object, depth may already be bigger than the current max depth. Hence we would continue to compute the known bits until we ran through the full depth of the chain of computation or ran out of stack space. For instance, let's say we have: GISelKnownBits Info(/*MaxDepth*/ = 10); Info.getKnownBits(Foo) // 9 recursive calls to computeKnownBitsImpl. // Then we hit a target specific instruction. // The target specific GISelKnownBits does this: GISelKnownBits TargetSpecificInfo(/*MaxDepth*/ = 6) TargetSpecificInfo.computeKnownBitsImpl() // <-- next max depth checks would // always return false. This commit does not have any test case because none of the in-tree targets use computeKnownBitsForTargetInstr.	2020-01-30 19:30:39 -08:00
Fangrui Song	7e59988a3f	[llvm-objcopy][test] Fix tests when path contains "bar" Differential Revision: https://reviews.llvm.org/D72358	2020-01-30 17:56:12 -08:00
Fangrui Song	3a5d1f8cce	[X86][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local For a MC_GlobalAddress reference to a dso_local external GlobalValue with a definition, emit .Lfoo$local to avoid a relocation. -fno-pic and -fpie can infer dso_local but -fpic cannot. In the future, we can explore the possibility of inferring dso_local with -fpic. As the description of D73228 says, LLVM's existing IPO optimization behaviors (like -fno-semantic-interposition) and a previous assembly behavior give us enough license to be aggressive here. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D73230	2020-01-30 17:52:35 -08:00
Leonard Chan	0458fa3960	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. This is an attempt to reland with a few fixes for buildbot since I haven't merged from master in a bit. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 17:09:42 -08:00
Amara Emerson	15108f4c5b	[GlobalISel][IRTranslator] When translating vector geps, splat the base pointer if required. We can have geps that have a scalar base pointer, and a vector index value, which means that the base pointer must be splatted into a vector of pointers. This fixes crashes on arm64 GlobalISel with optimizations enabled.	2020-01-30 16:27:27 -08:00
Leonard Chan	383dbddb2a	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit fff6a1b0f1fe57b46379001db75952d2a06eab1f. This was breaking a bunch of buildbots.	2020-01-30 16:18:41 -08:00
Mehdi Amini	6f3d452cd5	MSVC Buggy version detection: turn pre-processor error into CMake configuration time check This allows consumer to override in a cleaner way while still prevent them from hitting bug without knowing they run an unsupported configuration. Recommit after fix by Christopher Tetreault to add parens and ${} to cmake check to work around CMake configure time "unknown arguments specified" issue Differential Revision: https://reviews.llvm.org/D73677 Differential Revision: https://reviews.llvm.org/D73751	2020-01-31 00:11:55 +00:00
Leonard Chan	4199a00ff4	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 15:58:37 -08:00
Matt Arsenault	15841df48b	Revert "AMDGPU: Cleanup and fix SMRD offset handling" This reverts commit 17dbc6611df9044d779d85b3d545bd37e5dd5200. A test is failing on some bots	2020-01-30 15:39:51 -08:00
Mehdi Amini	3117224788	Revert "MSVC Buggy version detection: turn pre-processor error into CMake configuration time check" This reverts commit b4fac782462c26baa94798e5fdb58e6810bd336b. It broke the MSVC bot	2020-01-30 23:38:36 +00:00
Matt Arsenault	d86fd3ab4c	AMDGPU: Cleanup and fix SMRD offset handling I believe this also fixes bugs with CI 32-bit handling, which was incorrectly skipping offsets that look like signed 32-bit values. Also validate the offsets are dword aligned before folding.	2020-01-30 15:04:21 -08:00
Matt Arsenault	30934ef72a	CodeGen: Use Register	2020-01-30 15:01:56 -08:00
Jessica Paquette	14d9aae6eb	[AArch64][GlobalISel] Fold in G_ANYEXT/G_ZEXT into TB(N)Z This is similar to the code in getTestBitOperand in AArch64ISelLowering. Instead of implementing all of the TB(N)Z optimizations at once, this patch implements the simplest case first. The way that this is set up should make it fairly easy to add the rest as we go along. The idea here is that after determining that we can use a TB(N)Z, we can continue looking through instructions and perform further folding. In this case, when we have a G_ZEXT or G_ANYEXT where the extended bits are not used, we can fold it into the TB(N)Z. Differential Revision: https://reviews.llvm.org/D73673	2020-01-30 14:51:26 -08:00
Amara Emerson	3ef82d3c0d	[AArch64][GlobalISel] Disallow vectors in convertPtrAddToAdd. Found by inspection, but there's no test for this yet because G_PTR_ADD is currently illegal for vectors. I'll add the test at a later time when the legalizer support has landed.	2020-01-30 14:50:44 -08:00
Nikita Popov	f657486e36	[InstCombine] Remove unnecessary worklist add; NFCI Again, this will already be added by IRBuilder.	2020-01-30 23:24:59 +01:00
David Tenty	6eec8f4ef3	[NFC] Fix check prefix add in fcanonicalize-elimination.ll The test fix added by "D39306: Fix CodeGen/AMDGPU/fcanonicalize-elimination.ll on FreeBSD 11.0" uses a test prefix which is not actually used in the FileCheck stanza. Thus the problem originally encountered still exists and the tests fails for host triples that contain "1.0", including AIX 7.1.0.	2020-01-30 17:19:49 -05:00
Mehdi Amini	1d01960a71	MSVC Buggy version detection: turn pre-processor error into CMake configuration time check This allows consumer to override in a cleaner way while still prevent them from hitting bug without knowing they run an unsupported configuration. Differential Revision: https://reviews.llvm.org/D73677	2020-01-30 22:17:21 +00:00
Matt Arsenault	d10e273920	AMDGPU: Replace subtarget check with an assert This is already checked by the pattern subtarget predicate.	2020-01-30 14:15:26 -08:00
Matt Arsenault	598ce52712	AMDGPU: Don't use separate cache arguments for s_buffer_load node There's not much value to this separate node from the intrinsic. Make the operand structure the same as the intrinsic, so we can reuse the same pattern for GlobalISel.	2020-01-30 14:15:26 -08:00
Nikita Popov	9706f49b77	[InstCombine] Remove unnecessary worklist add; NFCI The IRBuilder will automatically add instructions to the worklist. Adding it manually is unnecessary, but may mess up worklist order.	2020-01-30 23:06:28 +01:00
Nikita Popov	9d1434bbf2	[InstCombine] Create new insts in foldICmpEqIntrinsicWithConstant; NFCI In line with current conventions, create new instructions rather than modify two operands in place and performing manual worklist management. This should be NFC apart from possible worklist order changes.	2020-01-30 23:03:16 +01:00
hsmahesha	7806a32f60	[AMDGPU] Add file headers for few files where it is missing. Summary: Added file headers for files which implement iterative lightweight scheduling strategies. Which is basically an exercise which I undertook in order to get used to LLVM development process. Reviewers: arsenm, vpykhtin, cdevadas Reviewed By: vpykhtin Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73417	2020-01-31 02:06:41 +05:30
Sean Fertile	967d7e85f5	[AIX] Minor cleanup in AsmPrinter. [NFC] - Extends the comments related to function descriptors, noting how they are only used on AIX. - Changes the condition used to gate the creation of the current function symbol in AsmPrinter::SetupMachineFunction to reflect being AIX specific. The creation of the symbol is different because of AIXs linkage conventions, not because AIX uses function descriptors. Differential Revision: https://reviews.llvm.org/D73115	2020-01-30 14:15:02 -05:00
Fangrui Song	d99d69ab22	[AArch64] -fpatchable-function-entry=N,0: place patch label after BTI Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after 9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after the initial BTI. ``` .Lfunc_begin0: bti c nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lfunc_begin0 ``` This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label: ``` .Lfunc_begin0: bti c .Lpatch0: nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lpatch0 ``` This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680	2020-01-30 11:11:52 -08:00
Huihui Zhang	ac4a5d6e8a	[ConstantFold][SVE][NFC] Add test for select instruction in scalable vector. Side notes from D73669, no need to guard the iteration on vectors, as it is explicitly looking for a ConstantVector/ConstantDataVector, which is not expected to be scalable at the moment. So, add the test only.	2020-01-30 10:56:12 -08:00
Huihui Zhang	8898d47dbb	[ConstantFold][SVE] Fix constant folding for scalable vector unary operations. Summary: Similar to issue D71445. Scalable vector should not be evaluated element by element. Add support to handle scalable vector UndefValue. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73678	2020-01-30 10:45:15 -08:00
Danilo Carvalho Grael	95d6d6606f	[AArch64][SVE] Add remaining SVE2 mla indexed intrinsics. Summary: Add remaining SVE2 mla indexed intrinsics: - sqdmlalb, sqdmlalt, sqdmlslb, sqdmlslt Add suffix _lanes and switch immediate types to i32 for all mla indexed intrinsics to align with ACLE builtin definitions. Reviewers: efriedma, sdesmalen, cameron.mcinally, c-rhodes, rengolin, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, arphaman, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D73633	2020-01-30 13:32:11 -05:00
Teresa Johnson	c96a29c643	[ThinLTO] Disable "Always import constants" due to compile time issues Summary: Disable the always importing of constants introduced in D70404 by default under a new internal option, since it is causing order of magnitude compile time regressions during the thin link. Will continue investigating why the regressions occur. Reviewers: evgeny777, wmi Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73724	2020-01-30 10:12:48 -08:00
Whitney Tsang	7242e653d2	[LoopFusion] Move instructions from FC1.GuardBlock to FC0.GuardBlock and from FC0.ExitBlock to FC1.ExitBlock when proven safe. Summary: Currently LoopFusion gives up when the second loop nest guard block or the first loop nest exit block is not empty. For example: if (0 < N) { for (int i = 0; i < N; ++i) {} x+=1; } y+=1; if (0 < N) { for (int i = 0; i < N; ++i) {} } The above example should be safe to fuse. This PR moves instructions in FC1 guard block (e.g. y+=1;) to FC0 guard block, or instructions in FC0 exit block (e.g. x+=1;) to FC1 exit block, after which LoopFusion is able to fuse them. Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D73641	2020-01-30 18:02:22 +00:00
Nikita Popov	402abf7a14	[AArch64][ARM] Always expand ordered vector reductions (PR44600) fadd/fmul reductions without reassoc are lowered to VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization support. Until that is in place, expand these intrinsics on ARM and AArch64. Other targets always expand the vector reduction intrinsics. Additionally expand fmax/fmin reductions without nonan flag on AArch64, as the backend asserts that the flag is present when lowering VECREDUCE_FMIN/FMAX. This fixes https://bugs.llvm.org/show_bug.cgi?id=44600. Differential Revision: https://reviews.llvm.org/D73135	2020-01-30 18:40:24 +01:00
Roman Lebedev	9f93b68733	[NFC][IndVarSimplify] Autogenerate exit_value_test2.ll check lines	2020-01-30 20:11:02 +03:00
Yonghong Song	3b5deb36b6	[BPF] fix a bug in BPFMISimplifyPatchable pass with -O0 The recommended optimization level for BPF programs is O2 since (1). BPF is running inside the kernel and linux kernel won't work at -O0 level, and (2). Verifier is not able to handle O0 code properly, e.g., potential large stack size and a lot of spills. But we should keep -O0 at least compiling. This patch fixed a bug in BPFMISimplifyPatchable phase where with -O0, a segmentation fault will happen for a simple program like: int test(int a, int b) { return a + b; } A test case is added to capture such a case. Differential Revision: https://reviews.llvm.org/D73681	2020-01-30 08:28:39 -08:00
jasonliu	f77176c4ec	[XCOFF][AIX] Support basic relocation type on AIX Summary: This patch intends to support three most common relocation type on AIX: R_POS, R_TOC, R_RBR. These three relocation type will be needed for object file generation on AIX for small code model. We will have follow up patches to bring relocation support for large code model on AIX. Reviewers: hubert.reinterpretcast, daltenty, DiggerLin Differential Revision: https://reviews.llvm.org/D72027	2020-01-30 15:59:09 +00:00
LLVM GN Syncbot	0ef9aaaf35	[gn build] Port 601687bf731	2020-01-30 15:06:10 +00:00
Alex Richardson	a939b77799	Bring back the tests for update_cc_tests_checks.py The tests were removed in 287307a0c60b68099d5f9dd22ac1db2a42593533 to avoid a dependency on python3. update_cc_tests_checks.py also works with python2 so restore the tests without the python3 dependency.	2020-01-30 14:58:25 +00:00
Stefan Pintilie	d06999c3fa	[PowerPC][Future] Branch Distance Estimation For Prefixed Instructions By adding the prefixed instructions the branch distances are no longer computed correctly. Since prefixed instructions cannot cross a 64 byte boundary we have to assume that a prefixed instruction may have a nop prepended to it. This patch tries to take that nop into consideration when computing the size of basic blocks. Differential Revision: https://reviews.llvm.org/D72572	2020-01-30 08:54:33 -06:00
David Stenberg	35ae52256a	[InstCombine][DebugInfo] Fold constants wrapped in metadata Summary: When constant folding, constants that are wrapped in metadata were not folded. This could lead to dbg.values being the only user of a constant expression, due to the non-dbg uses having been rewritten, resulting in the constant later on being removed by some other pass. This occurred with the attached test case, in which the non-rewritten GEP in the dbg.value intrinsic was later on removed by globalopt. This patch makes the code look through metadata and fold such constants. I guess that we in the future may want to allow dbg.values using GEPs and other constant expressions to be emittable even if there are no non-dbg uses, but for example SelectionDAG does not support that. Reviewers: jmorse, aprantl, vsk, davide Reviewed By: aprantl, vsk, davide Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73630	2020-01-30 15:50:16 +01:00
Matt Arsenault	3e74a9ce5e	AMDGPU/GlobalISel: Don't use pointless getConstantVRegVal This is always a G_CONSTANT already	2020-01-30 09:38:43 -05:00
Nemanja Ivanovic	4962abfeb6	Fix helptext for opt/llc after 14fc20ca6 The commit https://reviews.llvm.org/rG14fc20ca6 added some options to the X86 back end that cause the help text for opt/llc to become much harder to read. The issue is that the cl::value_desc is part of the option name and is used to compute the indentation of the description text (i.e. the maximum length option name is what everything aligns to). Since the commit puts a large number of characters into that text, everything is aligned to that width. This patch just reformats the option so that the description is contained in the description and the list of possible values is within the angle brackets. Note: the readability issue of the helptext was fixed in commit 70cbf8c71c510077baadcad305fea6f62e830b06, but the re-formatting wasn't added on that commit so I am still committing this. Differential revision: https://reviews.llvm.org/D73267	2020-01-30 08:35:55 -06:00
Hans Wennborg	8196b233b2	Drop arm triple from test/CodeGen/AArch64/global-merge-hidden-minsize.ll Because it's in the AArch64/ directory, it runs in cases where the arm target may not be available, see comment on D73235.	2020-01-30 15:02:38 +01:00
John Brawn	c5104e8c6c	[FPEnv][AArch64] Add lowering and instruction selection for strict conversions Strict fp-to-int and int-to-fp conversions can be handled in the same way that the non-strict versions are (by using the appropriate instruction or converting to a function call when we have no instruction). Differential Revision: https://reviews.llvm.org/D73625	2020-01-30 13:50:06 +00:00
Matt Arsenault	186987890d	GlobalISel: Implement s32->s64 G_FPTOSI lowering Port directly from DAG version. The lowering for G_FPTOUI used to fail on AMDGPU because it uses G_FPTOSI.	2020-01-30 08:47:07 -05:00
Matt Arsenault	71624e0f4c	AMDGPU/GlobalISel: Handle s64->s64 G_FPTOSI/G_FPTOUI	2020-01-30 08:46:37 -05:00
Matt Arsenault	99132ddff0	AMDGPU/GlobalISel: Custom lower G_LOG/G_LOG10 I'm pretty sure this is wrong and we should expand these in a correct way, but this matches the existing behavior.	2020-01-30 08:38:50 -05:00
Matt Arsenault	18e2339bcb	AMDGPU/GlobalISel: Legalize unpacked d16 image operations On targets that don't have the normal packed f16 layout, handle these during legalization. Directly modify the register types. We can infer this was a d16 load based on the mem operand size during selection. A16 operands should possibly be handled here as well, but don't worry about that yet.	2020-01-30 08:36:11 -05:00
Matt Arsenault	e6a65d5ecb	AMDGPU/GlobalISel: Only map VOP operands to VGPRs This trivially avoids violating the constant bus restriction. Previously this was allowing one SGPR in the first source operand, which technically also avoided violating this for most operations (but not for special cases reading vcc). We do need to write some new, smarter operand folds to pick the optimal SGPR to use in some kind of post-isel fold, but that's purely an optimization. I was originally thinking we would pick which operands should be SGPRs in RegBankSelect, but I think this isn't really manageable. There would be additional complexity to handle every G_* instruction, and then any nontrivial instruction patterns would need to know when to avoid violating it, which is likely to be very error prone. I think having all inputs being canonically copies to VGPRs will simplify the operand folding logic. The current folding we do is backwards, and only considers one operand at a time, relative to operands it already has. It therefore poorly handles the case where there is already a constant bus operand user. If all operands are copies, it's somewhat simpler to consider all input operands at once to choose the optimal constant bus user. Since the failure mode for constant bus violations is now a verifier error and not an selection failure, this moves towards a place where we can turn on the fallback mode. The SGPR copy folding optimizations can be left for later.	2020-01-30 08:32:35 -05:00
Dominik Montada	eaccf3c98c	[GlobalISel] (fix) Use pointer type size for offset constant when lowering stores Commit 9965b12fd1b was supposed to change the offset constant when lowering load/stores, but only introduced this change for loads. This patch adds the same fix for stores.	2020-01-30 08:32:35 -05:00
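
The depth-check problem described in commit 90b879f9d1 can be illustrated with a small stand-alone sketch. This is not the LLVM code: levelsVisited and its parameters are hypothetical names, the recursion is modeled as a loop over a chain of chainLength nodes, and the assumption that the patched check is a "greater than or equal" comparison is inferred from the commit message rather than taken from the patch itself.

```cpp
#include <cassert>

// Hypothetical model of the recursive walk (not the GISelKnownBits API).
// Starting at `depth`, it counts how many further levels are visited
// before the exit check fires or the chain of `chainLength` nodes ends.
// `strictEquality` selects the pre-patch style check (depth == maxDepth);
// otherwise the assumed post-patch check (depth >= maxDepth) is used.
unsigned levelsVisited(unsigned depth, unsigned maxDepth,
                       unsigned chainLength, bool strictEquality) {
  assert(maxDepth > 0 && "a zero max depth is not meaningful in this model");
  unsigned visited = 0;
  while (depth < chainLength) {
    bool stop = strictEquality ? (depth == maxDepth) : (depth >= maxDepth);
    if (stop)
      break;
    ++visited;
    ++depth;
  }
  return visited;
}
```

With the numbers from the commit message (entering a helper whose max depth is 6 when the depth is already 9), the equality check never fires and the walk only ends when the chain does, whereas the inequality check stops immediately. When the walk starts at depth 0, both checks behave the same.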


191245 Commits