llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-12-11 17:08:42 +00:00

Author	SHA1	Message	Date
Matthias Springer	49e3700069	[mlir][tensor] Move BufferizableOpInterface impl to tensor dialect This is in preparation of unifying the existing bufferization with One-Shot bufferization. A subsequent commit will replace `tensor-bufferize`'s implementation with the BufferizableOpInterface-based implementation and move over missing test cases. Differential Revision: https://reviews.llvm.org/D117984	2022-01-24 23:54:49 +09:00
Matt Arsenault	18aabae8e2	AMDGPU: Fix assertion on fixed stack objects with VGPR->AGPR spills These have negative / out of bounds frame index values and would assert when trying to set the BitVector. Fixed stack objects can't be colored away so ignore them.	2022-01-24 09:45:41 -05:00
Bjorn Pettersson	354b2c36ee	Pre-commit test cases for (sra (load)) -> (sextload) folds. NFC Add test case to show missing folds for (sra (load)) -> (sextload). Differential Revision: https://reviews.llvm.org/D116929	2022-01-24 15:30:55 +01:00
Matt Arsenault	99e8e17313	Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction" This reverts commit `a97e20a3a8`.	2022-01-24 09:26:52 -05:00
Matt Arsenault	7a5b0a2934	Reapply "IR: Make getRetAlign check callee function attributes" Reapply `3d2d208f6a`, reverted in `a97e20a3a8`	2022-01-24 09:26:51 -05:00
ksyx	5e5efd8a91	[clang-format] Fix SeparateDefinitionBlocks issues - Fixes https://github.com/llvm/llvm-project/issues/53227 that wrongly indents multiline comments - Fixes wrong detection of single-line opening braces when used along with those only opening scopes, causing crashes due to duplicated replacements on the same token: void foo() { { int x; } } - Fixes wrong recognition of first line of definition when the line starts with block comment, causing crashes due to duplicated replacements on the same token for this leads toward skipping the line starting with inline block comment: /* Some descriptions about function / /inline*/ void bar() { } - Fixes wrong recognition of enum when used as a type name rather than starting definition block, causing crashes due to duplicated replacements on the same token since both actions for enum and for definition blocks were taken place: void foobar(const enum EnumType e) { } - Change to use function keyword for JavaScript instead of comparing strings - Resolves formatting conflict with options EmptyLineAfterAccessModifier and EmptyLineBeforeAccessModifier (prompts with --dry-run (-n) or --output-replacement-xml but no observable change) - Recognize long (len>=5) uppercased name taking a single line as return type and fix the problem of adding newline below it, with adding new token type FunctionLikeOrFreestandingMacro and marking tokens in UnwrappedLineParser: void afunc(int x) { return; } TYPENAME func(int x, int y) { // ... } - Remove redundant and repeated initialization - Do no change to newlines before EOF Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks Differential Revision: https://reviews.llvm.org/D117520	2022-01-24 14:23:20 +00:00
Paul Walker	34aedbe90d	[AArch64] Regenerate CHECK lines for llvm/test/CodeGen/AArch64/sve2-int-mul.ll	2022-01-24 14:11:19 +00:00
Simon Pilgrim	0553f5e61a	[X86] Add cmp-equality bool reductions PR53379 test coverage	2022-01-24 14:05:10 +00:00
Simon Pilgrim	4436d4cd7c	[X86] Rename cmp-with-zero bool reductions Explicitly name them icmp0_* - I'm intending to add PR53379 test coverage shortly	2022-01-24 14:05:10 +00:00
Simon Pilgrim	f7079bf9ee	[X86] Fix v8i8 -> v8i16 typo in bool reductions We were supposed to be testing <8 x i16> reductions	2022-01-24 14:05:09 +00:00
serge-sans-paille	25e8f5f827	Add missing STLExtras.h include from lldb/unittests/TestingSupport/MockTildeExpressionResolver.cpp	2022-01-24 15:03:11 +01:00
Fraser Cormack	d42678b453	[RISCV] Add side-effect-free vsetvli intrinsics This patch introduces new intrinsics that enable the use of vsetvli in contexts where only the returned vector length is of interest. The pre-existing intrinsics are marked with side-effects, which prevents even trivial optimizations on/across them. These intrinsics are intended to be used in situations where the vector length is fed in turn to RVV intrinsics or to vector-predication intrinsics during loop vectorization, for example. Those codegen paths ensure that instructions are generated with their own implicit vsetvli, so the vector length and vtype can be relied upon to be correct. No corresponding C builtins are planned at this stage, though that is a possibility for the future if the need arises. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117910	2022-01-24 13:52:08 +00:00
Sjoerd Meijer	ada6d78a78	[LoopFlatten] Address FIXME about getTripCountFromExitCount. NFC. Together with the previous commit which mainly documents better LoopFlatten's overall strategy, this addresses a concern added as a FIXME comment in D110587; the code refactoring (NFC) introduces functions (also for the SCEV usage) to make this clearer.	2022-01-24 13:46:19 +00:00
Sjoerd Meijer	f6ac8088b0	[LoopFlatten] Added comments about usage of various Loop APIs. NFC.	2022-01-24 13:46:19 +00:00
serge-sans-paille	a0d5e938fe	Add missing include llvm/ADT/STLExtras	2022-01-24 14:41:24 +01:00
Evgeny Shulgin	589a939072	Add `isConstinit` matcher Support C++20 constinit variables for AST Matchers.	2022-01-24 08:35:42 -05:00
Nathan Sidwell	6184e565ad	[demangler][NFC] Refactor some parsing There's some unnecessary code duplication in the parser. This refactors that and deploys boolean variables to avoid the duplication. These also happen to help adding module demangling (with an updated mangling scheme). 1a) The grammar requires some lookahead concerning <template-args>. We may discover an <unscoped-name> is actually <unscoped-template-name> <template-args>. (When <unscoped-name> was a substitution, there must be a following <template-args>.) Refactor parseName to only have one code path looking for the 'I' indicating <template-args>. 1b) While there I altered the control flow to hold the result in a variable, rather than tail call. Made it easier to debug (and of course an optimizer will DTRT here anyway). 2a) An <unscoped-name> can have an St or StL prefix. No need for completely separate code paths handling the following unqualified-name though. 2b) Also no need to look for both 'St' and 'StL' separately. Look for 'St' and then conditionally swallow an 'L'. 3) We get a similar issue as #1a when parsing a typeName. Here I just change the control flow slightly to bring the 'break' out to the end of the 'S' block and embed the early return inside an if. That's more in keeping with the code style. 4) Although NFC, there's a new testcase as that's not covered by the existing demangler tests and is significant in the #1a case above. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D117879	2022-01-24 05:28:38 -08:00
Nathan Sidwell	897d1bb659	[demangler] write-protect non-canonical source To try and avoid undesired changes to the non-canonical demangler sources, change the cp-to-llvm script to (a) write-protect the target files and (b) prepend 'do not edit' comments that are significant to emacs[*], and hopefully humans. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D118008	2022-01-24 05:28:38 -08:00
Nathan Sidwell	38ffea9b4c	[demangler] Resync demangler sources Recent commits changed llvm/include/llvm/Demangle without also changing libcxxabi/src/Demangle, which is the canonical source location. This resyncs those commits to the libcxxabi directory. Commits: * `1f9e18b656` * `f53d359816` * `065044c443` Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D117990.diff	2022-01-24 05:28:38 -08:00
Valentin Clement	853e79d8d8	[flang] Update tco tool pipline and add translation to LLVM IR tco is a tool to test the FIR to LLVM IR pipeline of the Flang compiler. This patch update tco pipelines and adds the translation to LLVM IR. A simple test is added to make sure the tool is working with a simple FIR program. More tests will be upstream in follow up patch from the fir-dev branch. This patch is part of the upstreaming effort from fir-dev branch. Reviewed By: kiranchandramohan, awarzynski, schweitz, mehdi_amini Differential Revision: https://reviews.llvm.org/D117781 Co-authored-by: Eric Schweitz <eschweitz@nvidia.com> Co-authored-by: Jean Perier <jperier@nvidia.com> Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>	2022-01-24 14:16:27 +01:00
serge-sans-paille	5f290c090a	Move STLFunctionalExtras out of STLExtras Only using that change in StringRef already decreases the number of preoprocessed lines from 7837621 to 7776151 for LLVMSupport Perhaps more interestingly, it shows that many files were relying on the inclusion of StringRef.h to have the declaration from STLExtras.h. This patch tries hard to patch relevant part of llvm-project impacted by this hidden dependency removal. Potential impact: - "llvm/ADT/StringRef.h" no longer includes <memory>, "llvm/ADT/Optional.h" nor "llvm/ADT/STLExtras.h" Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831	2022-01-24 14:13:21 +01:00
Florian Hahn	b2a8eff45c	[LV] Make some tests more robust by adding missing users.	2022-01-24 13:04:09 +00:00
Simon Pilgrim	0e70dd858e	[X86] Add PR46249 test case showing poorly widened select predicate mask	2022-01-24 12:59:30 +00:00
Sebastian Neubauer	f1e36474b9	[AMDGPU][NFC] Fix debug prints Print the instructions instead of pointers.	2022-01-24 13:55:00 +01:00
Evgeniy Brevnov	b4b6d6374e	[NFC] New test case for BasicAA and memcy/memmove with deopt New test checks results of BasicAA for llvm.memcpy./llvm.memmove. intrinsics in presence of deopt bundle. By specification expected result for unrelated global memory should be Ref. Currently this is not the case and will be fixed in upcoming patches. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D118031	2022-01-24 19:53:29 +07:00
Groverkss	b754d09fde	[MLIR][Presburger] Refactor duplicate division merging to Utils This patch moves merging of duplicate divisions to presburger utility functions. This is required to support division merging in structures other than IntegerPolyhedron. Reviewed By: arjunp Differential Revision: https://reviews.llvm.org/D118001	2022-01-24 18:12:13 +05:30
SForeKeeper	70f83f3084	[RISCV] add support for zbkx subextension in MC layer. This patch adds support for zbkx extension from K extension(v1.0.0) in MC layer. Instructions with same functionality and same encoding is defined in the bitmanip extension. It defines {Xperm8, Xperm4} as instruction aliases for xperm.* in Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format. [[ https://reviews.llvm.org/D94999 \| D94999 ]] this is the patch that introduces xperm.* instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117889	2022-01-24 20:38:46 +08:00
Florian Hahn	b7f69b8d46	[LV] Name values and blocks in same induction tests (NFC). This reduces the churn in the test in future updates due to numbering changes.	2022-01-24 12:28:43 +00:00
LLVM GN Syncbot	54f1d95066	[gn build] Port `3696c70e67`	2022-01-24 12:06:49 +00:00
Kerry McLaughlin	8082ab2fc3	[LoopVectorize] Support epilogue vectorisation of loops with reductions isCandidateForEpilogueVectorization will currently return false for loops which contain reductions. This patch removes this restriction and makes the following changes to support epilogue vectorisation with reductions: - `fixReduction`: If fixReduction is being called during vectorisation of the epilogue, the phi node it creates will need to additionally carry incoming values from the middle block of the main loop. - `createEpilogueVectorizedLoopSkeleton`: The incoming values of the phi created by fixReduction are updated after the vec.epilog.iter.check block is added. The phi is also moved to the preheader of the epilogue. - `processLoop`: The start value of any VPReductionPHIRecipes are updated before vectorising the epilogue loop. The getResumeInstr function added to the ILV will return the resume instruction associated with the recurrence descriptor. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D116928	2022-01-24 12:03:31 +00:00
Simon Pilgrim	e4074432d5	[X86] Remove avx512f integer and/or/xor/min/max reduction intrinsics and use generic equivalents None of these have any reordering issues, and they still emit the same reduction intrinsics without any change in the existing test coverage: llvm-project\clang\test\CodeGen\X86\avx512-reduceIntrin.c llvm-project\clang\test\CodeGen\X86\avx512-reduceMinMaxIntrin.c Differential Revision: https://reviews.llvm.org/D117881	2022-01-24 11:57:53 +00:00
Adrian Vogelsgesang	3696c70e67	[clang-tidy] Add `readability-container-contains` check This commit introduces a new check `readability-container-contains` which finds usages of `container.count()` and `container.find() != container.end()` and instead recommends the `container.contains()` method introduced in C++20. For containers which permit multiple entries per key (`multimap`, `multiset`, ...), `contains` is more efficient than `count` because `count` has to do unnecessary additional work. While this this performance difference does not exist for containers with only a single entry per key (`map`, `unordered_map`, ...), `contains` still conveys the intent better. Reviewed By: xazax.hun, whisperity Differential Revision: http://reviews.llvm.org/D112646	2022-01-24 12:57:18 +01:00
Simon Pilgrim	3e50593b18	[X86] Remove `__builtin_ia32_pmax/min` intrinsics and use generic `__builtin_elementwise_max/min` D111985 added the generic `__builtin_elementwise_max` and `__builtin_elementwise_min` intrinsics with the same integer behaviour as the SSE/AVX instructions This patch removes the `__builtin_ia32_pmax/min` intrinsics and just uses `__builtin_elementwise_max/min` - the existing tests see no changes: ``` __m256i test_mm256_max_epu32(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_max_epu32 // CHECK: call <8 x i32> @llvm.umax.v8i32(<8 x i32> %{{.}}, <8 x i32> %{{.}}) return _mm256_max_epu32(a, b); } ``` This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`). Sibling patch to D117791 Differential Revision: https://reviews.llvm.org/D117798	2022-01-24 11:40:29 +00:00
Matthias Springer	b2499bf3e8	[mlir][bufferize][NFC] Refactor createAlloc function signature Pass a ValueRange instead of an ArrayRef<Value> for better compatibility. Also provide an additional function overload that automatically deallocates the buffer if specified. Differential Revision: https://reviews.llvm.org/D118025	2022-01-24 20:25:35 +09:00
Simon Pilgrim	e5147f82e1	[X86] Remove __builtin_ia32_pabs intrinsics and use generic __builtin_elementwise_abs D111986 added the generic `__builtin_elementwise_abs()` intrinsic with the same integer absolute behaviour as the SSE/AVX instructions (abs(INT_MIN) == INT_MIN) This patch removes the `__builtin_ia32_pabs` intrinsics and just uses `__builtin_elementwise_abs` - the existing tests see no changes: ``` __m256i test_mm256_abs_epi8(__m256i a) { // CHECK-LABEL: test_mm256_abs_epi8 // CHECK: [[ABS:%.]] = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %{{.*}}, i1 false) return _mm256_abs_epi8(a); } ``` This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`). Differential Revision: https://reviews.llvm.org/D117791	2022-01-24 11:25:21 +00:00
Bjorn Pettersson	46cacdbb21	[DAGCombiner] Adjust some checks in DAGCombiner::reduceLoadWidth In code review for D117104 two slightly weird checks were found in DAGCombiner::reduceLoadWidth. They were typically checking if BitsA was a mulitple of BitsB by looking at (BitsA & (BitsB - 1)), but such a comparison actually only make sense if BitsB is a power of two. The checks were related to the code that attempted to shrink a load based on the fact that the loaded value would be right shifted. Afaict the legality of the value types is checked later (typically in isLegalNarrowLdSt), so the existing checks were both overly conservative as well as being wrong whenever ExtVTBits wasn't a power of two. The latter was a situation triggered by a number of lit tests so we could not just assert on ExtVTBIts being a power of two). When attempting to simply remove the checks I found some problems, that seems to have been guarded by the checks (maybe just out of luck). A typical example would be a pattern like this: t1 = load i96* ptr t2 = srl t1, 64 t3 = truncate t2 to i64 When DAGCombine is visiting the truncate reduceLoadWidth is called attempting to narrow the load to 64 bits (ExtVT := MVT::i64). Then the SRL is detected and we set ShAmt to 64. In the past we've bailed out due to i96 not being a multiple of 64. If we simply remove that check then we would end up replacing the load with a new load that would read 64 bits but with a base pointer adjusted by 64 bits. So we would read 32 bits the wasn't accessed by the original load. This patch will instead utilize the fact that the logical left shift can be folded away by using a zextload. Thus, the pattern above will now be combined into t3 = load i32* ptr+offset, zext to i64 Another case is shown in the X86/shift-folding.ll test case: t1 = load i32* ptr t2 = srl i32 t1, 8 t3 = truncate t2 to i16 In the past we bailed out due to the shift count (8) not being a multiple of 16. Now the narrowing kicks in and we get t3 = load i16* ptr+offset Differential Revision: https://reviews.llvm.org/D117406	2022-01-24 12:22:04 +01:00
Bjorn Pettersson	12a499eb00	Pre-commit test case for trunc+lshr+load folds This is a pre-commit of test cases relevant for D117406. @srl_load_narrowing1 is showing a pattern that could be folded into a more narrow load. @srl_load_narrowing2 is showing a similar pattern that happens to be optimized already, but that happens in two steps (first triggering a combine based on SRL and later another combine based on TRUNCATE). Differential Revision: https://reviews.llvm.org/D117588	2022-01-24 12:22:03 +01:00
David Spickett	3e6be0241b	[lldb] Update release notes with non-address bit handling changes This adds the "memory find" (https://reviews.llvm.org/D117299) and "memory tag" (https://reviews.llvm.org/D117672) commands and puts them all in one list.	2022-01-24 11:19:49 +00:00
Fraser Cormack	af773a1818	[RISCV][VP] Lower VP_MERGE to RVV instructions This patch adds lowering of the llvm.vp.merge.* intrinsic (ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a special pseudo form of vmerge which allows a tied merge operand, allowing us to specify the tail elements as being equal to the "on false" operand, using a tied-def constraint and a "tail undisturbed" policy. While this strategy allows us to often lower the intrinsic to just one instruction, it may be less efficient in fixed-vector types as the number of tail elements may extend far beyond the length of the fixed vector. Another strategy could be to use a vmerge/vfmerge instruction with an AVL equal to the length of the vector type, and manipulate the condition operand such that mask elements greater than the operation's EVL are false. I've also observed inefficient codegen in which our 'VF' patterns don't match raw floating-point SPLAT_VECTORs, which occur in scalable-vector code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117561	2022-01-24 11:05:05 +00:00
Fraser Cormack	e7926e8d97	[RISCV] Match VF variants for masked VFRDIV/VFRSUB This patch follows up on D117697 to help the simple binary operations behave similarly in the presence of masks. It also enables CGP sinking support for vp.fdiv and vp.fsub intrinsics, now that VFRDIV and VFRSUB are consistently matched with a LHS splat for masked and unmasked variants. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117783	2022-01-24 10:59:43 +00:00
Simon Pilgrim	577a6dc9a1	[X86] getVectorMaskingNode - fix indentation. NFC. clang-format	2022-01-24 11:08:41 +00:00
Abinav Puthan Purayil	912af6b570	[AMDGPU][GlobalISel] Remove the post ':' part of vreg operands in fsh combine tests.	2022-01-24 16:30:40 +05:30
Andrzej Warzynski	022600334d	[flang] Update the description of `!fir.coordinate_of` This change was suggested in one of the comments for https://reviews.llvm.org/D115333. Basically, the following usage is valid, but the current wording suggests otherwise: ``` %1 = fir.coordinate_of %a, %k : (!fir.ref<!fir.array<10 x 10 x i32>>, index) -> !fir.ref<!fir.array<10 x i32>> ``` A test is also added to better document this particular case. Differential revision: https://reviews.llvm.org/D115929	2022-01-24 10:45:52 +00:00
David Spickett	7d19566c3b	[lldb] Ignore non-address bits in "memory find" arguments This removes the non-address bits before we try to use the addresses. Meaning that when results are shown, those results won't show non-address bits either. This follows what "memory read" has done. On the grounds that non-address bits are a property of a pointer, not the memory pointed to. I've added testing and merged the find and read tests into one file. Note that there are no API side changes because "memory find" does not have an equivalent API call. Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D117299	2022-01-24 10:42:49 +00:00
Jay Foad	aa50b93e7c	[AMDGPU][GlobalISel] Add more sign/zero/any-extension tests Add s1 to s16 cases, and for sgprs s1 to s64 and s32 to s64.	2022-01-24 10:16:51 +00:00
Jay Foad	906ebd5830	[AMDGPU][GlobalISel] Regenerate checks in inst-select-*ext.mir	2022-01-24 10:16:51 +00:00
Peter Smith	a08447d0de	[LLD][ELF][AArch64] Update test with incorrect REQUIRES line [NFC] D54759 introduced aarch64-combined-dynrel.s and aarch64-combined-dynrel-ifunc.s . Unfortunately the requires line at the top was AArch64 instead of aarch64 which means they were never run. Update the tests to use aarch64 and fix to match current lld output. Differential Revision: https://reviews.llvm.org/D117896	2022-01-24 10:04:28 +00:00
Nikita Popov	0d1308a7b7	[AArch64][GlobalISel] Support returned argument with multiple registers The call lowering code assumed that a returned argument could only consist of one register. Pass an ArrayRef<Register> instead of Register to make sure that all parts get assigned. Fixes https://github.com/llvm/llvm-project/issues/53315. Differential Revision: https://reviews.llvm.org/D117866	2022-01-24 10:55:28 +01:00
Nikita Popov	e7c9a6cae0	[SDAG] Don't move DBG_VALUE instructions after insertion point during scheduling (PR53243) EmitSchedule() shouldn't be touching instructions after the provided insertion point. The change introduced in D83561 performs a scan to the end of the block, and thus may move unrelated instructions. In particular, this ends up moving instructions that have been produced by FastISel and will later be deleted. Moving them means that more instructions than intended are removed. Fix this by stopping the iteration when the insertion point is reached. Fixes https://github.com/llvm/llvm-project/issues/53243. Differential Revision: https://reviews.llvm.org/D117489	2022-01-24 10:50:49 +01:00
Sander de Smalen	4f8fdf7827	[ISEL] Canonicalise constant splats to RHS. SelectionDAG::getNode() canonicalises constants to the RHS if the operation is commutative, but it doesn't do so for constant splat vectors. Doing this early helps making certain folds on vector types, simplifying the code required for target DAGCombines that are enabled before Type legalization. Somewhat to my surprise, DAGCombine doesn't seem to traverse the DAG in a post-order DFS, so at the time of doing some custom fold where the input is a MUL, DAGCombiner::visitMUL hasn't yet reordered the constant splat to the RHS. This patch leads to a few improvements, but also a few minor regressions, which I traced down to D46492. When I tried reverting this change to see if the changes were still necessary, I ran into some segfaults. Not sure if there is some latent bug there. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117794	2022-01-24 09:38:36 +00:00

1 2 3 4 5 ...

412116 Commits