llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2025-03-01 14:58:18 +00:00

Author	SHA1	Message	Date
Chris Bieneman	0c3f51c042	Re-land [DX] Add support for PSV signature elements The pipeline state data captured in the PSV0 section of the DXContainer file encodes signature elements which are read by the runtime to map inputs and outputs from the GPU program. This change adds support for generating and parsing signature elements with testing driven through the ObjectYAML tooling. Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D157671 Initially landed as 8c567e64f808f7a818965c6bc123fedf7db7336f, and reverted in 4d800633b2683304a5431d002d8ffc40a1815520. ../llvm/include/llvm/BinaryFormat/DXContainerConstants.def ../llvm/test/ObjectYAML/DXContainer/PSVv1-amplification.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv1-compute.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv1-domain.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv1-geometry.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv1-vertex.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv2-amplification.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv2-compute.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv2-domain.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv2-geometry.yaml ../llvm/test/ObjectYAML/DXContainer/PSVv2-vertex.yaml	2023-08-16 14:26:13 -05:00
Jim Ingham	d268ba3808	Test follow-up to 2e7aa2ee34eb53347396731dc8a3b2dbc6a3df45 The TestEvents.py test I added for ShadowListeners fails on Windows. Since there's no reason to believe the ShadowListeners feature has different behavior from the other event-based tests here, I copied the skips & expected_flakey's from the other tests in that file to this one.	2023-08-16 12:19:07 -07:00
Lei Zhang	73ddc4474b	[mlir][vector] Enable distribution over multiple dimensions This commit starts enabling vector distribution over multiple dimensions. It requires delinearizing the lane ID to match the expected rank. shape_cast and transfer_read now can properly handle multiple dimensions. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D157931	2023-08-16 12:08:43 -07:00
Craig Topper	42dad521e3	[RISCV] Add RISCVII::getRoundModeOpNum to reduce code duplication. NFC	2023-08-16 12:00:02 -07:00
Chris Bieneman	4d800633b2	Revert "[DX] Add support for PSV signature elements" This reverts commit 8c567e64f808f7a818965c6bc123fedf7db7336f.	2023-08-16 13:52:26 -05:00
Chris Bieneman	8c567e64f8	[DX] Add support for PSV signature elements The pipeline state data captured in the PSV0 section of the DXContainer file encodes signature elements which are read by the runtime to map inputs and outputs from the GPU program. This change adds support for generating and parsing signature elements with testing driven through the ObjectYAML tooling. Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D157671	2023-08-16 13:38:20 -05:00
Blue Gaston	b5c2075081	[Sanitizers][Driverkit] Stop using Sanitizer Allocator64 on Driverkit Before refactoring this code, all arm64 were set to use the 32bit allocator. This patch reverts back that behavior for DriverKit. Because we target DriverKit as the target OS, rather than a specific platform, reverting back to the previous behavior is preferred to fix a failure we are seeing on embedded platforms. Though it may be more correct in the future to match the allocator to the platform being used. rdar://113649286 Differential Revision: https://reviews.llvm.org/D158028	2023-08-16 11:29:36 -07:00
Valentin Clement	1640b80d6f	[flang][openacc] Lower gang, vector, worker, seq and nohost for acc routine Lower clauses to the routine info op. Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D158007	2023-08-16 11:22:40 -07:00
Daniel Hoekwater	2c43d591c6	[CodeGen] Move function splitting tests from X86 to Generic (NFC) Machine function splitting will become available for AArch64; since MFS is no longer X86-only, the tests for generic behavior should live somewhere other than tests/CodeGen/X86. MFS implementation doesn't vary much across platforms, and most tests should be identical between X86 and AArch64 besides instruction selection, so the tests can live together in tests/CodeGen/Generic. Differential Revision: https://reviews.llvm.org/D157563	2023-08-16 18:11:23 +00:00
Valentin Clement	0e7649698a	[flang][openacc] Fix post deallocate suffix The wrong suffix was applied Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D158098	2023-08-16 11:09:42 -07:00
Hanhan Wang	8b68cec9c0	[mlir][tensor] Add producer fusion for tensor.pack op. We are able to fuse the pack op only if inner tiles are not tiled or they are fully used. Otherwise, it could generate a sequence of non-trivial ops. Differential Revision: https://reviews.llvm.org/D157932	2023-08-16 11:02:59 -07:00
Owen Pan	063c42e919	[clang-format] Handle NamespaceMacro string arg for FixNamespaceComments Fixes #63795. Differential Revision: https://reviews.llvm.org/D157568	2023-08-16 10:45:54 -07:00
Matt Arsenault	c9d0d15e69	AMDGPU: Refine some rsq formation tests Drop unnecessary flags and metadata, add contract flags that should be necessary.	2023-08-16 13:37:03 -04:00
Jim Ingham	2e7aa2ee34	Replace the singleton "ShadowListener" with a primary and N secondary Listeners Before the addition of the process "Shadow Listener" you could only have one Listener observing the Process Broadcaster. That was necessary because fetching the Process event is what switches the public process state, and for the execution control logic to be manageable you needed to keep other listeners from causing this to happen before the main process control engine was ready. Ismail added the notion of a "ShadowListener" - which allowed you ONE extra process listener. This patch inverts that setup by designating the first listener as primary - and giving it priority in fetching events. Differential Revision: https://reviews.llvm.org/D157556	2023-08-16 10:35:32 -07:00
LLVM GN Syncbot	329979cf37	[gn build] Port 2459ed67805c	2023-08-16 17:25:23 +00:00
Nico Weber	e87d68ce8f	[gn] port 23d1b6577a50	2023-08-16 13:25:02 -04:00
Dhruv Chawla	de059a2ea2	[NFC][ValueTracking] Remove calls to computeKnownBits for non-intrinsic CallInsts in isKnownNonZeroFromOperator For non-intrinsic CallInsts, computeKnownBits only handles range metadata and checking getReturnedArgOperand(). Both of these are now handled in isKnownNonZero, so there is no need to fall through to a call to computeKnownBits anymore. Differential Revision: https://reviews.llvm.org/D158095	2023-08-16 22:52:13 +05:30
Kazushi (Jam) Marukawa	922ac64b04	[VE] Avoid vectorizing store/load in scalar mode Avoid vectorizing store and load instructions in scalar mode. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D158049	2023-08-17 02:15:54 +09:00
V Donaldson	1fd72321a4	[flang] Runtime assigned format errors Generate a runtime error message for a reference to an invalid assigned format such as: if (.true.) print n end	2023-08-16 10:14:34 -07:00
Craig Topper	0805310b50	[RISCV] Fix spelling Ctypto->Crypto. NFC	2023-08-16 10:11:05 -07:00
Kazu Hirata	6e6014a260	[Analysis] Fix an unused variable warning This patch fixes: llvm/lib/Analysis/LoopAccessAnalysis.cpp:2001:12: error: unused variable 'MinDepDistBytesOld' [-Werror,-Wunused-variable]	2023-08-16 10:09:40 -07:00
V Donaldson	04e6129d32	[flang] Separate module procedure variant Accept "module procedure" (as well as module function/subroutine) in a separate module procedure definition, such as "bb1" in: module mm interface module subroutine mm1 end subroutine end interface end module submodule(mm) bb interface module subroutine bb1 end subroutine end interface contains module procedure mm1 call bb1 end procedure module procedure bb1 print*, 'bb1' end procedure end submodule use mm call mm1 end	2023-08-16 10:07:07 -07:00
Michael Maitland	87ddd3a191	[LAA] Rename and fix semantics of MaxSafeDepDistBytes to MinDepDistBytes `MaxSafeDepDistBytes` was not correct based on its name and semantics in instances when there was a non-unit stride loop. For example, ``` for (int k = 0; k < len; k+=3) { a[k] = a[k+4]; a[k+2] = a[k+6]; } ``` Here, the smallest dependence distance is 24 bytes, but only vectorizing 8 bytes is safe. `MaxSafeVectorWidthInBits` reported the correct number of bits that could be vectorized as 64 bits. The semantics of `MaxSafeDepDistBytes` should be: The smallest dependence distance in bytes in the loop. This may not be the same as the maximum number of bytes that are safe to operate on simultaneously. The name of this variable should reflect those semantics and its docstring should be updated accordingly, `MinDepDistBytes`. A debug message that used `MaxSafeDepDistBytes` to signify to the user how many bytes could be accessed in parallel is updated to use `MaxSafeVectorWidthInBits` instead. That way, the same message is communicated to the user, just in different units. This patch makes sure that when `MinDepDistBytes` is modified in a way that should impact `MaxSafeVectorWidthInBits`, that we update the latter accordingly. This patch also clarifies why `MaxSafeVectorWidthInBits` does not need to be updated when `MinDepDistBytes` is (i.e. in the case of a forward dependency). Differential Revision: https://reviews.llvm.org/D156158	2023-08-16 09:53:35 -07:00
Nicholas Guy	d65feccb12	[ARM] Set preferred function alignment Aligning functions yields small performance gains on embedded cores, moreso with numerous small function calls. Similar to aligning loops, if the function can fit within a single cache line then the performance overhead of fetching more instructions can be limited. Differential Revision: https://reviews.llvm.org/D157514	2023-08-16 17:31:21 +01:00
Ingo Müller	d7e26b5620	[mlir][linalg][transform][python] Fix mix-in for MaskedVectorize. Fix forward bug in dac19b457e2cfd139e0e5cc29872ba3c65b7510f, which uses the vertical bar operator for type hints, which is only supported by Python 3.10 and later, and thus breaks the builds on Python 3.8.	2023-08-16 16:27:46 +00:00
Siu Chi Chan	d40fd9e1d9	Fix typo in module inliner priority flag Change-Id: If4a830fdacf1b0e7b7634f48f648427d5ec7ea21 Reviewed By: kazu, arsenm Differential Revision: https://reviews.llvm.org/D158013	2023-08-16 12:26:06 -04:00
Jonas Devlieghere	5afa519c1a	[lldb] Print better error message when sphinx_automodapi is not installed Print an error message with instructions on how to install sphinx_automodapi. Differential revision: https://reviews.llvm.org/D158022	2023-08-16 09:14:42 -07:00
David Green	a047dfe0d5	[AArch64][GISel] Lower EXT of 0 to a COPY This allows us to select G_SHUFFLE_VECTOR with identity masks (possibly including undef elements), but avoid the actual EXT instruction if the shift amount is 0.	2023-08-16 17:12:15 +01:00
Dhruv Chawla	d53b3df570	[InstCombine] Remove unneeded isa<PHINode> check in foldOpIntoPhi This check is redundant as it is covered by the call to isPotentiallyReachable. Depends on D155726. Differential Revision: https://reviews.llvm.org/D155718	2023-08-16 21:09:08 +05:30
Dhruv Chawla	e549d578cc	[InstCombine] Test cases for D155718 Differential Revision: https://reviews.llvm.org/D155726	2023-08-16 21:09:04 +05:30
Akash Banerjee	5d9ccd7a96	[OpenMP] Migrate dispatch related utility functions from Clang codegen to OMPIRBuilder Migrate createForStaticInitFunction, createDispatchInitFunction, createDispatchNextFunction and createDispatchFiniFunction from Clang CodeGen to OMPIRBuilder. Differential Revision: https://reviews.llvm.org/D157994	2023-08-16 16:35:28 +01:00
Joseph Huber	5717329f1a	[Libomptarget] Disable deadlocking bug49334.cpp test on AMDGPU This test hangs on AMDGPU sporadically, disable it for the time being. Fixes: https://github.com/llvm/llvm-project/issues/64733 Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D158082	2023-08-16 10:24:00 -05:00
Benjamin Maxwell	0d3abdc263	[mlir][Linalg] Fix formatting of generated docs markdown This patch prevents `mlir-linalg-ods-yaml-gen` from adding extra whitespace around the summary and description fields. This broke the _italics_ of the summary as _ this _ is not recognised by markdown. It also meant the first line of the description was in a code block as it was indented two spaces. The separator between summary and description has also been updated to two newlines. This was already followed and prevents line wrapping the summary putting part of it in the description. These issues can be currently seen at: https://mlir.llvm.org/docs/Dialects/Linalg/ Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D157853	2023-08-16 15:08:51 +00:00
Ingo Müller	67c092c8c8	[mlir][transform][python] Add test for AnyValueType binding. I had forgotten to commit that test as part of https://reviews.llvm.org/D157638. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D158074	2023-08-16 15:07:48 +00:00
Ingo Müller	dac19b457e	[mlir][linalg][transform][python] Add mix-in for MaskedVectorize. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D157735	2023-08-16 15:07:46 +00:00
Ingo Müller	2d3dcd4aec	[mlir][linalg][transform][python] Add mix-in for BufferizeToAllocOp. Re-apply https://reviews.llvm.org/D157704. The original patch broke the tests on Python 3.8 and got reverted by 0c4aad050c23254c3c612e860e1278961d161aef. This patch replaces the usage of the vertical bar operator for type hints with `Union`. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D158075	2023-08-16 15:07:43 +00:00
Felix	a94c44cc0a	[clang-tidy] Added a new option to lambda-function-name to ignore warnings in macro expansion Improved check lambda-function-name with option IgnoreMacros to ignore warnings in macro expansion. Relates to #62857 (https://github.com/llvm/llvm-project/issues/62857) Reviewed By: PiotrZSL Differential Revision: https://reviews.llvm.org/D157829	2023-08-16 15:02:56 +00:00
Eduard Zingerman	8f28e8069c	[BPF] support for BPF_ST instruction in codegen Generate store immediate instruction when CPUv4 is enabled. For example: $ cat test.c struct foo { unsigned char b; unsigned short h; unsigned int w; unsigned long d; }; void bar(volatile struct foo *p) { p->b = 1; p->h = 2; p->w = 3; p->d = 4; } $ clang -O2 --target=bpf -mcpu=v4 test.c -c -o - | llvm-objdump -d - ... 0000000000000000 <bar>: 0: 72 01 00 00 01 00 00 00 *(u8 *)(r1 + 0x0) = 0x1 1: 6a 01 02 00 02 00 00 00 *(u16 *)(r1 + 0x2) = 0x2 2: 62 01 04 00 03 00 00 00 *(u32 *)(r1 + 0x4) = 0x3 3: 7a 01 08 00 04 00 00 00 *(u64 *)(r1 + 0x8) = 0x4 4: 95 00 00 00 00 00 00 00 exit Take special care to: - apply `BPFMISimplifyPatchable::checkADDrr` rewrite for BPF_ST - validate immediate value when BPF_ST write is 64-bit: BPF interprets `(BPF_ST | BPF_MEM | BPF_DW)` writes as writes with sign extension. Thus it is fine to generate such write when immediate is -1, but it is incorrect to generate such write when immediate is +0xffff_ffff. This commit was previously reverted in e66affa17e32. The reason for revert was an unrelated bug in BPF backend, triggered by test case added in this commit if LLVM is built with LLVM_ENABLE_EXPENSIVE_CHECKS. The bug was fixed in D157806. Differential Revision: https://reviews.llvm.org/D140804	2023-08-16 17:51:28 +03:00
Philip Reames	3c2a66973e	[RISCVInsertVSETVLI] Generalize scalar extract (vmv.x.s, and vmv.f.s) handling vmv.x.s and vmv.f.s are unconditional. They read the low element of a vector register (not vector group), and function even when VL=0 or VSTART>0. As such, they are don't care with respect to both VL and LMUL. We'd previously had handling in the forward pass only via the NoRegister mechanism. (The only instructions with SEW but without VL are these extracts.) This patch moves that handling into getDemanded so that the backwards pass benefits as well. Differential Revision: https://reviews.llvm.org/D157991	2023-08-16 07:50:59 -07:00
Soumi Manna	bd1ddc5850	[NFC][OpenMP] Initialize pointer field Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D157989	2023-08-16 07:47:24 -07:00
Philip Reames	b06e52c32f	[RISCVInsertVSETVLI] Default to VL=1 for scalar extracts We were defaulting to VL=0 when we didn't otherwise have a vsetv nearby. Instead, let's use VL=1. VL=0 is very much a cornercase in hardware, and let's avoid if we can. Differential Revision: https://reviews.llvm.org/D158015	2023-08-16 07:35:00 -07:00
Joseph Huber	0f386e693b	[libc][fix] Fix test after changing logic for generic stdio Summary: The previous patch accidentally broke the logic for adding the `generic` subdirectory. Fix this so the CPU build works properly.	2023-08-16 09:29:29 -05:00
Joseph Huber	1e573f378c	[libc] Implement fopen, fclose, and fread on the GPU This patch implements the `fopen`, `fclose`, and `fread` functions on the GPU. These are pretty much re-implemented from what existed but using the new interface. Having this subset allows us to test the interface a bit more strenuously since we can write and read to a file. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D157622	2023-08-16 09:14:38 -05:00
Matt Arsenault	7c4aa3b37e	AMDGPU: InstCombine amdgcn.rcp(amdgcn.sqrt) -> amdgcn.rsq We currently have some wrong combines in the backend that approximately do this. https://reviews.llvm.org/D158002	2023-08-16 10:04:13 -04:00
Matt Arsenault	f19ee76f35	AMDGPU: Add baseline tests for rcp to rsq fold	2023-08-16 10:03:49 -04:00
Florian Hahn	5816d2ab28	[SimplifyCFG] Add tests for sinking load/store with swifterror operand. Add test coverage for sinking/hoisting loads/stores with swifterror pointers. Currently this isn't handled correctly by SimplifyCFG and causes a verifier error.	2023-08-16 14:51:29 +01:00
Matt Arsenault	66ee794064	AMDGPU: Fix verifier error on splatted opencl fmin/fmax and ldexp calls Apparently the spec has overloads for fmin/fmax and ldexp with one of the operands as scalar. We need to broadcast the scalars to the vector type. https://reviews.llvm.org/D158077	2023-08-16 09:42:26 -04:00
Bjorn Pettersson	0c4c961008	[LinkAllPasses] Remove unused header includes. NFCI This patch removes some includes from LinkAllPasses.h, that appears to be unused. Those should have been removed earlier when the corresponding legacy PM passes were removed. InstSimplifyPass is a bit special since the legacy PM version of the pass still exists. But since createInstSimplifyLegacyPass is defined in Scalar.h and not in InstSimplifyPass.h that particular include isn't needed anyway.	2023-08-16 15:24:19 +02:00
Timm Bäder	871ee94141	[clang][ExprConst] Use call source range for 'in call to' diags Differential Revision: https://reviews.llvm.org/D156604	2023-08-16 15:22:29 +02:00
Matthias Springer	878950b82c	[mlir][bufferization] Simplify `getBufferType` `getBufferType` computes the bufferized type of an SSA value without bufferizing any IR. This is useful for predicting the bufferized type of iter_args of a loop. To avoid endless recursion (e.g., in the case of "scf.for", the type of the iter_arg depends on the type of init_arg and the type of the yielded value; the type of the yielded value depends on the type of the iter_arg again), `fixedTypes` was used to fall back to "fixed" type. A simpler way is to maintain an "invocation stack". `getBufferType` implementations can then inspect the invocation stack to detect repetitive computations (typically when computing the bufferized type of a block argument). Also improve error messages in case of inconsistent memory spaces inside of a loop. Differential Revision: https://reviews.llvm.org/D158060	2023-08-16 15:02:07 +02:00
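
The BPF_ST entry above (8f28e8069c) says that a 64-bit store immediate is sign-extended, so -1 is acceptable while +0xffff_ffff is not. A minimal sketch of that validity check follows, assuming the immediate field is 32 bits wide. The function name `fits_sign_extended_imm32` is hypothetical and is not taken from the LLVM BPF backend.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
MASK32 = 0xFFFF_FFFF


def fits_sign_extended_imm32(value: int) -> bool:
    """Return True if the 64-bit pattern of `value` survives a round trip
    through a 32-bit immediate that is sign-extended to 64 bits.

    Hypothetical helper for illustration only.
    """
    pattern = value & MASK64           # 64-bit two's-complement pattern
    imm32 = pattern & MASK32           # what a 32-bit immediate field can hold
    if imm32 & 0x8000_0000:            # sign bit set: upper 32 bits become ones
        extended = imm32 | 0xFFFF_FFFF_0000_0000
    else:                              # sign bit clear: upper 32 bits stay zero
        extended = imm32
    return extended == pattern


if __name__ == "__main__":
    for candidate in (-1, 4, 0xFFFF_FFFF):
        print(hex(candidate & MASK64), fits_sign_extended_imm32(candidate))
```

Under this assumption, -1 passes because its 64-bit pattern is all ones, which is exactly what sign extension of 0xffff_ffff produces, while +0xffff_ffff fails because its upper 32 bits must stay zero.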


471499 Commits
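
Two of the MLIR Python entries above (d7e26b5620 and 2d3dcd4aec) replace the vertical bar operator in type hints with `Union` so that the bindings still load on Python 3.8. A minimal sketch of that pattern follows. The function `normalize_sizes` and its parameter are hypothetical and do not come from the MLIR sources.

```python
from typing import List, Optional, Sequence, Union


# On Python 3.8 an annotation such as `int | Sequence[int]` is evaluated when
# the function is defined and raises TypeError, because `|` between types is
# only supported from Python 3.10 onward. `Union` spells the same hint
# portably.
def normalize_sizes(
    sizes: Optional[Union[int, Sequence[int]]] = None,
) -> List[int]:
    """Accept nothing, one int, or a sequence of ints; return a list.

    Hypothetical helper for illustration only.
    """
    if sizes is None:
        return []
    if isinstance(sizes, int):
        return [sizes]
    return list(sizes)


if __name__ == "__main__":
    print(normalize_sizes(4))
    print(normalize_sizes((1, 2)))
```

`List` from `typing` is used for the return type for the same reason: the built-in `list[int]` form is only available from Python 3.9.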