llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2025-01-19 06:38:28 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	8e58fdd1e3	[X86] Fix masked store scheduler ports for skylake models Only uses port2+3 for agen, and was missing port4 for the actual store Noticed while investigating the skylake vs icelake diffs for Issue #62602	2023-06-01 11:59:31 +01:00
Nimish Mishra	e5aa6eeb4c	[flang][OpenMP] Verify support for private/firstprivate on unstructured sections Verification of support for lowering private/firstprivate clauses on unstructured sections. Differential Revision: https://reviews.llvm.org/D145352 Reviewed By: TIFitis	2023-06-01 16:05:23 +05:30
Ritanya B Bharadwaj	453e02ca09	[OpenMP] Add support for declare target initializer expressions Initial support for OpenMP 5.0 declare target "as if" behavior for "initializer expressions". OpenMP 5.0, 2.12.7 declare target. Reviewed By: Alexey Differential Revision: https://reviews.llvm.org/D146418	2023-06-01 05:27:23 -05:00
David Green	eb764a7f38	[AArch64] Increase the cost of i1 inserts / extracts i1 inserts will need an extra cset, and i1 extracts need a cmp (or tst) in order to be used. This increases their cost a little to account for those extra instructions. https://godbolt.org/z/3c5z4G7Mh Differential Revision: https://reviews.llvm.org/D151189	2023-06-01 10:54:53 +01:00
Nikita Popov	b9e328fd91	[InstCombine] Fix worklist management in rewriteGEPAsOffset() more thoroughly We need to add the replaced instruction itself to the worklist as well. We want to remove the old instructions, but can't easily do so directly, as the icmp is also one of the users and we need to retain it until the fold has finished.	2023-06-01 11:00:49 +02:00
Antonio Abbatangelo	b7e110fcfe	[X86] Align stack to 16-bytes on 32-bit with X86_INTR call convention Adds a dynamic stack alignment to functions under the interrupt call convention on x86-32. This fixes the issue where the stack can be misaligned on entry, since x86-32 makes no guarantees about the stack pointer position when the interrupt service routine is called. The alignment is done by overriding X86RegisterInfo::shouldRealignStack, and by setting the correct alignment in X86FrameLowering::calculateMaxStackAlign. This forces the interrupt handler to be dynamically aligned, generating the appropriate `and` instruction in the prologue and `lea` in the epilogue. The `no-realign-stack` attribute can be used as an opt-out. Fixes #26851 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D151400	2023-06-01 17:00:34 +08:00
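The `and` instruction mentioned in the commit above rounds the stack pointer down to a 16-byte boundary. The following is a minimal Python sketch of that arithmetic only, not LLVM code; `align_down` is a hypothetical helper name, and the 32-bit mask assumes an x86-32 stack pointer.

```python
def align_down(sp: int, alignment: int = 16) -> int:
    """Round a 32-bit stack pointer down to a multiple of `alignment`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Clearing the low bits is the effect of an `and` with -alignment;
    # the final mask keeps the result within 32 bits (x86-32 assumption).
    return sp & ~(alignment - 1) & 0xFFFFFFFF


if __name__ == "__main__":
    # A misaligned stack pointer on entry to the handler (illustrative value).
    print(hex(align_down(0x7FFC)))
```

Because the stack grows downward, rounding down never overlaps data already pushed above the incoming stack pointer.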
Timm Bäder	710749f786	[clang][Interp] Optionally cast comparison result to non-bool Our comparison opcodes always produce a Boolean value and push it on the stack. However, the result of such a comparison in C is int, so the later code expects an integer value on the stack. Work around this problem by casting the boolean value to int in those cases. This is not ideal for C however. The comparison is usually wrapped in a IntegerToBool cast anyway. Differential Revision: https://reviews.llvm.org/D149645	2023-06-01 10:36:33 +02:00
Nikita Popov	63babf54c2	[InstCombine] Fix worklist management in transformToIndexedCompare() Use replaceInstUsesWith() rather than plain RAUW to make sure the old instructions are added back to the worklist for DCE.	2023-06-01 10:35:13 +02:00
zhuna	53a483cee8	[DWP] add overflow check for llvm-dwp tools if offset overflow Currently, if an offset overflow happens, we silently ignore it. We then generate a bad dwp file, which can crash gdb or cause undefined behavior, and makes the root cause hard to find. So, we need to produce some messages if overflow happens. Reviewed By: ayermolo, dblaikie, steven.zhang Differential Revision: https://reviews.llvm.org/D144565	2023-06-01 16:32:52 +08:00
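The kind of check described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the llvm-dwp implementation: it assumes section offsets are stored in 32-bit fields, and `add_contribution` is a hypothetical name.

```python
UINT32_MAX = 0xFFFFFFFF  # assumption: offsets are stored in 32-bit fields


def add_contribution(offset: int, size: int) -> int:
    """Return the next section offset, raising instead of silently wrapping."""
    new_offset = offset + size
    if new_offset > UINT32_MAX:
        raise OverflowError(
            f"section offset {new_offset:#x} does not fit in 32 bits"
        )
    return new_offset


if __name__ == "__main__":
    offset = 0
    for size in (0x1000, 0x2000):  # illustrative contribution sizes
        offset = add_contribution(offset, size)
    print(hex(offset))
```

Raising at the point where the sum is computed reports the problem where it occurs, rather than leaving a wrapped offset in the output.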
David Green	e79fac2968	[AArch64] Adjust costs of i1 and/or/xor reductions This expands the reduction cost of i1 and/or/xor, so that larger type sizes get handled by the existing code. For i1 reductions - and will use maxv, or will use minv and xor will use addv, plus the cost of legalizing the type for larger vectors using and/or/xor. The i1 vectors will be legalized to higher width integers (say v16i8), which this overrides the cost of. As with all i1 vectors there is a chance that the types the i1 vector is created with and how it is used will not match, introducing extra extends that are not necessarily costmodelled. https://godbolt.org/z/6Gc9K6b7T Differential Revision: https://reviews.llvm.org/D151184	2023-06-01 09:28:48 +01:00
Andrzej Warzynski	a5b3677ddc	[mlir][transform] Add support for expressing scalable tile sizes This patch enables specifying scalable tile sizes when using the Transform dialect to drive tiling, e.g.: ``` %1, %loop = transform.structured.tile %0 [[4]] ``` This is implemented by extending the TileOp with a dedicated attribute for "scalability" and by updating various parsing hooks. At the moment, only the trailing tile size can be scalable. The following is not yet supported: ``` %1, %loop = transform.structured.tile %0 [[4], [4]] ``` This change is a part of larger effort to enable scalable vectorisation in Linalg. See this RFC for more context: * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/ Differential Revision: https://reviews.llvm.org/D150944	2023-06-01 09:28:03 +01:00
Nikita Popov	cd888e6ffe	[InstCombine] Fix worklist management in foldPHIArgIntToPtrToPHI() Make sure the old operand is added back to the worklist for DCE.	2023-06-01 10:20:12 +02:00
Petr Hosek	1d6a2c5357	Revert "[BOLT][CMake] Redo the build and install targets" This reverts commit f99a7d3e38095cfdaf7e729289a8894dd31c7efa since it broke the bolt-aarch64-ubuntu-clang-shared bot.	2023-06-01 08:03:50 +00:00
Balázs Kéri	4f0436dd15	[clang][analyzer] Merge apiModeling.StdCLibraryFunctions and StdCLibraryFunctionArgs checkers into one. Main reason for this change is that these checkers were implemented in the same class but had different dependency ordering. (NonNullParamChecker should run before StdCLibraryFunctionArgs to get more special warning about null arguments, but the apiModeling.StdCLibraryFunctions was a modeling checker that should run before other non-modeling checkers. The modeling checker changes state in a way that makes it impossible to detect a null argument by NonNullParamChecker.) To make it more simple, the modeling part is removed as separate checker and can be only used if checker StdCLibraryFunctions is turned on, that produces the warnings too. Modeling the functions without bug detection (for invalid argument) is not possible. The modeling of standard functions does not happen by default from this change on. Reviewed By: Szelethus Differential Revision: https://reviews.llvm.org/D151225	2023-06-01 09:54:35 +02:00
Nikita Popov	dfb369399d	[ValueTracking] Directly use KnownBits shift functions Make ValueTracking directly call the KnownBits shift helpers, which provides more precise results. Unfortunately, ValueTracking has a special case where sometimes we determine non-zero shift amounts using isKnownNonZero(). I have my doubts about the usefulness of that special-case (it is only tested in a single unit test), but I've reproduced the special-case via an extra parameter to the KnownBits methods. Differential Revision: https://reviews.llvm.org/D151816	2023-06-01 09:46:16 +02:00
Matthias Springer	34cf67aef5	[mlir][tensor] TrackingListener: Find replacement ops through cast-like ExtractSliceOps Certain ExtractSliceOps, that do extract all elements from the destination, are treated like casts when looking for replacement ops. Such ExtractSliceOps are typically rank expansions. Differential Revision: https://reviews.llvm.org/D151804	2023-06-01 09:00:56 +02:00
Matthias Springer	26864d8fb4	[mlir][tensor] Add pattern to drop redundant insert_slice rank expansion Drop insert_slice rank expansions if they are directly followed by an inverse rank reduction. Differential Revision: https://reviews.llvm.org/D151800	2023-06-01 08:47:53 +02:00
Manas	69db592f76	[mlir][arith] Disallow zero ranked tensors for select's condition Zero ranked tensor (say tensor<i1>) when used for arith.select's condition, crashes optimizer during bufferization. This patch puts a constraint on condition to be either scalar or of matching shape as to its result. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D151270	2023-06-01 12:12:46 +05:30
Petr Hosek	2a9e6363ef	Revert "[Fuchsia] Pass through LLVM_ENABLE_HTTPLIB to stage 2" This reverts commit 80614e162222e857d8767174284701aec69381c4.	2023-06-01 06:04:16 +00:00
Petr Hosek	f99a7d3e38	[BOLT][CMake] Redo the build and install targets The existing BOLT install targets are broken on Windows because they don't properly handle the output extension. We cannot use the existing LLVM macros since those make assumptions that don't hold for BOLT. This change instead implements custom macros following the approach used by Clang and LLD. Differential Revision: https://reviews.llvm.org/D151595	2023-06-01 06:01:39 +00:00
Phoebe Wang	801dd8870f	[X86][BF16] Fix 2 crashes with vector broadcast Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D151808	2023-06-01 13:38:47 +08:00
Guillaume Chatelet	ae5c472410	[libc] Reduce math tests runtime Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D151798	2023-06-01 05:01:56 +00:00
wangpc	78a2240172	[RISCV][NFC] Add isF argument to SchedSEWSet So that we can remove `SchedSEWSetF` and simplify some code. Reviewed By: michaelmaitland Differential Revision: https://reviews.llvm.org/D151790	2023-06-01 12:44:49 +08:00
Piyou Chen	eabf1d367f	[RISCV] check pointer before dereference Encountered an ASAN crash and found that the pointer was dereferenced without a check. Reviewed By: kito-cheng, eklepilkina Differential Revision: https://reviews.llvm.org/D151716	2023-05-31 21:35:07 -07:00
Joshua Cao	6ed152aff4	[SCEV] Compute AddRec range computations using different type BECount Before this patch, we can only use the MaxBECount for an AddRec's range computation if the MaxBECount has <= bit width of the AddRec. This patch reasons that if a MaxBECount has > bit width, and is <= the max value of AddRec's bit width, we can still use the MaxBECount. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151698	2023-05-31 21:05:17 -07:00
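The rule stated in the commit above can be written out as a small decision function. This is a sketch of the reasoning only, not the SCEV code; `usable_be_count` is a hypothetical name, and reading "max value of AddRec's bit width" as the unsigned maximum is an assumption.

```python
from typing import Optional


def usable_be_count(max_be_count: int, be_bits: int, addrec_bits: int) -> Optional[int]:
    """Return the backedge-taken count if it can be used for an AddRec of
    width `addrec_bits`, or None if it cannot."""
    if not 0 <= max_be_count < (1 << be_bits):
        raise ValueError("max_be_count does not fit in be_bits")
    if be_bits <= addrec_bits:
        # Rule before the patch: the count is no wider than the AddRec.
        return max_be_count
    if max_be_count <= (1 << addrec_bits) - 1:
        # Rule added by the patch: the type is wider, but the value still
        # fits in the AddRec's width (unsigned maximum assumed here).
        return max_be_count
    return None


if __name__ == "__main__":
    print(usable_be_count(100, 64, 32))      # wider type, small value
    print(usable_be_count(1 << 40, 64, 32))  # wider type, value too large
```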
Joshua Cao	46c59a55e7	[SCEV][NFC] Refactor range computation for AddRec to pass around APInt	2023-05-31 21:03:20 -07:00
Joshua Cao	ff471dcf76	[SCEV] Fix verification of SCEV multiples.	2023-05-31 21:00:22 -07:00
Craig Topper	490cd1164c	[RISCV] Update some tests that used "interrupt"="user". NFC Support for this was removed previously. Change them to "supervisor" since they were testing generic "interrupt" things.	2023-05-31 20:31:24 -07:00
zhanglimin	fe6716a498	[Analysis][LoongArch] Add sign extension for i32 parameters and returns In LoongArch ABI spec, we can see that in the LP64D ABI, unsigned 32-bit types, such as unsigned int, are stored in general-purpose registers as proper sign extensions of their 32-bit values. Reference: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_abi_lp64d Reviewed By: SixWeining, xen0n Differential Revision: https://reviews.llvm.org/D151794	2023-06-01 11:14:17 +08:00
Kevin Gleason	0ee4875ddf	[mlir][bytecode] Error if requested bytecode version is unsupported Currently desired bytecode version is clamped to the maximum. This allows requesting bytecode versions that do not exist. We have added callsite validation for this in StableHLO to ensure we don't pass an invalid version number, probably better if this is managed upstream. If a user wants to use the current version, then omitting `setDesiredBytecodeVersion` is the best way to do that (as opposed to providing a large number). Adding this check will also properly error on older version numbers as we increment the minimum supported version. Silently clamping on minimum version would likely lead to unintentional forward incompatibilities. Separately, due to bytecode version being `int64_t` and using methods to read/write uints, we can generate payloads with invalid version numbers: ``` mlir-opt file.mlir --emit-bytecode --emit-bytecode-version=-1 | mlir-opt <stdin>:0:0: error: bytecode version 18446744073709551615 is newer than the current version 5 ``` This is fixed with version bounds checking as well. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D151838	2023-05-31 19:20:42 -07:00
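The large number in the error message above is what `-1` becomes when a signed 64-bit value is written and read back as an unsigned one. A minimal Python sketch of that arithmetic and of a bounds check follows; `as_uint64` and `check_version` are hypothetical names, and the minimum version passed in the example is an assumption, not a value taken from MLIR.

```python
def as_uint64(value: int) -> int:
    """Reinterpret a signed 64-bit integer as unsigned (two's complement)."""
    return value & 0xFFFFFFFFFFFFFFFF


def check_version(requested: int, minimum: int, current: int) -> int:
    """Reject versions outside [minimum, current] instead of clamping them."""
    if not minimum <= requested <= current:
        raise ValueError(
            f"unsupported bytecode version {requested}; "
            f"supported range is {minimum}..{current}"
        )
    return requested


if __name__ == "__main__":
    print(as_uint64(-1))            # 18446744073709551615
    print(check_version(5, 0, 5))   # minimum of 0 is an assumption
```

Checking the signed value before it is written catches both negative requests and requests above the current version.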
Jason Molenda	21dfaf60a7	Setting to control addressable bits in high memory On AArch64, it is possible to have a program that accesses both low (0x000...) and high (0xfff...) memory, and with pointer authentication, you can have different numbers of bits used for pointer authentication depending on whether the address is in high or low memory. This adds a new target.process.highmem-virtual-addressable-bits setting which the AArch64 Mac ABI plugin will use, when set, to always set those unaddressable high bits for high memory addresses, and will use the existing target.process.virtual-addressable-bits setting for low memory addresses. This patch does not change the existing behavior when only target.process.virtual-addressable-bits is set. In that case, the value will apply to all addresses. Not yet done is recognizing metadata in a live process connection (gdb-remote qHostInfo) or a Mach-O corefile LC_NOTE to set the correct number of addressing bits for both memory ranges. That will be a future change. Differential Revision: https://reviews.llvm.org/D151292 rdar://109746900	2023-05-31 18:38:34 -07:00
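The behavior described in the commit above can be sketched as follows. This is an illustration, not the lldb ABI plugin code: `fix_address` is a hypothetical name, and using bit 55 to tell high memory from low memory is an assumption of this sketch.

```python
from typing import Optional

MASK64 = (1 << 64) - 1


def fix_address(addr: int, low_bits: int, high_bits: Optional[int] = None) -> int:
    """Strip non-address bits from a 64-bit pointer.

    `low_bits` plays the role of target.process.virtual-addressable-bits and
    `high_bits` the role of target.process.highmem-virtual-addressable-bits.
    """
    is_high = bool(addr & (1 << 55))  # assumption: bit 55 selects high memory
    # With only the low setting present, it applies to all addresses.
    bits = high_bits if (is_high and high_bits is not None) else low_bits
    unaddressable = MASK64 & ~((1 << bits) - 1)
    if is_high:
        return addr | unaddressable       # set the unaddressable high bits
    return addr & ~unaddressable          # clear them for low memory


if __name__ == "__main__":
    print(hex(fix_address(0x0030000012345678, low_bits=47, high_bits=52)))
    print(hex(fix_address(0x00FFFFFF12345678, low_bits=47, high_bits=52)))
```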
Ellis Hoag	bf8fe1c38f	Fix clang driver tests for cspgo in lld The tests introduced by https://reviews.llvm.org/D151589 were failing because I guess some test platforms don't have `lld`. Similar tests add `-B%S/Inputs/lld` to the clang commands so let's try this here to fix the tests. ``` clang: error: invalid linker name in argument '-fuse-ld=lld' ```	2023-05-31 18:21:41 -07:00
LLVM GN Syncbot	7eebfddffc	[gn build] Port dc124cda7c78	2023-06-01 01:15:34 +00:00
Nikolas Klauser	b1dc43aa3a	[libc++] Optimize for_each for segmented iterators ``` --------------------------------------------------- Benchmark old new --------------------------------------------------- bm_for_each/1 3.00 ns 2.98 ns bm_for_each/2 4.53 ns 4.57 ns bm_for_each/3 5.82 ns 5.82 ns bm_for_each/4 6.94 ns 6.91 ns bm_for_each/5 7.55 ns 7.75 ns bm_for_each/6 7.06 ns 7.45 ns bm_for_each/7 6.69 ns 7.14 ns bm_for_each/8 6.86 ns 4.06 ns bm_for_each/16 11.5 ns 5.73 ns bm_for_each/64 43.7 ns 4.06 ns bm_for_each/512 356 ns 7.98 ns bm_for_each/4096 2787 ns 53.6 ns bm_for_each/32768 20836 ns 438 ns bm_for_each/262144 195362 ns 4945 ns bm_for_each/1048576 685482 ns 19822 ns ``` Reviewed By: ldionne, Mordante, #libc Spies: arichardson, libcxx-commits Differential Revision: https://reviews.llvm.org/D151274	2023-05-31 18:15:25 -07:00
Nikolas Klauser	dc124cda7c	[libc++] Introduce __for_each_segment and use it in copy/move This simplifies the code inside copy/move and makes it easier to apply the optimization to other algorithms. Reviewed By: ldionne, Mordante, #libc Spies: arichardson, libcxx-commits Differential Revision: https://reviews.llvm.org/D151265	2023-05-31 18:15:20 -07:00
Ellis Hoag	85af42df5d	[lld] add context-sensitive PGO options for MachO Enable support for CSPGO for lld MachO targets. Since lld MachO does not support `-plugin-opt=`, we need to create the `--cs-profile-generate` and `--cs-profile-path=` options and propagate them in `Darwin.cpp`. These flags are not supported by ld64. Also outline code into `getLastCSProfileGenerateArg()` to share between `CommonArgs.cpp` and `Darwin.cpp`. CSPGO is already implemented for ELF (https://reviews.llvm.org/D56675) and COFF (https://reviews.llvm.org/D98763). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D151589	2023-05-31 17:53:46 -07:00
David Blaikie	ed7be0d4d1	lldb: Fix cross-cu-reference test to explicitly request that feature	2023-06-01 00:35:39 +00:00
David Blaikie	e731a2678c	[DebugInfo][Split DWARF][LTO]: Ensure only a single CU is emitted Split DWARF doesn't handle LTO of any form (roughly there's an assumption that each dwo file will have one CU - it's not explicitly documented, nor explicitly handled, so the ecosystem isn't really well understood/tested/etc). This had previously been handled by implementing (& disabling by default) the `-split-dwarf-cross-cu-references` flag, which would disable use of ref_addr across two dwo CUs. This worked for a while, at least in LTO (it didn't address Split DWARF+Full LTO, but that's an unlikely combination, as the benefits of Split DWARF are more limited in a full LTO build) - because the only source of cross-CU references was inlined functions, so by making those non-cross-CU (by moving the referenced inlined function DWARF description into the referencing CU) the result was one CU per dwo. But recently the Function Specialization pass was added to the ThinLTO pipeline, which caused imported functions that may not be inlined to be emitted by a backend compile. This meant foreign CU entities (not just abstract origins/cross-CU referenced entities)/standalone foreign CUs could be emitted by a backend compile. The end result was, due to a bug* in binutils dwp (I think basically it saw two CUs in a single dwo and reprocessed the offsets in the shared debug_str_offsets.dwo section) this situation led to corrupted strings. So to make this more robust, I've generalized the definition of the `-split-dwarf-cross-cu-references` flag (perhaps it should be renamed at this point, but it's /really/ niche, doubt anyone's using it - more or less there for experimentation when we get around to figuring out spec'ing LTO+Split DWARF) to mean "single CU in a dwo file" and added more general handling for this.
There's certainly some weird corner cases that could come up in terms of "how do we choose which CU to put everything in" - for now it's "first come, first served" which is probably going to be OK for ThinLTO - the base module will have the first functions and first CU, imported fragments will come after that. For LTO the choice will be fairly arbitrary - but, again, essentially whichever module comes first. * Arguably a bug in binutils dwp, but since the feature isn't well specified, I'd rather avoid dabbling in this uncertain area and ensure LLVM doesn't produce especially novel DWARF (dwos with multiple CUs) regardless of whether binutils dwp would/should be fixed. I'm not confident debuggers could read such a dwo file well, etc.	2023-06-01 00:21:00 +00:00
Jan Svoboda	0038d6c7fe	[clang] NFCI: Use `DirectoryEntryRef` in framework lookup This removes one use of the deprecated `DirectoryEntry::getName()`.	2023-05-31 16:04:46 -07:00
Jan Svoboda	2d817d0368	[clang] NFCI: Use the `*Ref()` variant on search paths This removes some uses of the deprecated `DirectoryEntry::getName()`.	2023-05-31 16:04:45 -07:00
Jan Svoboda	5be0e83635	[clang] NFCI: Use `FileEntryRef` in `PPLexerChange` This removes some uses of the deprecated `FileEntry::getName()`.	2023-05-31 16:04:45 -07:00
Daniel Thornburgh	80614e1622	[Fuchsia] Pass through LLVM_ENABLE_HTTPLIB to stage 2	2023-05-31 15:56:20 -07:00
Mike Rostecki	2addaeda18	[docs] Use ExecutorAddr::toPtr() in ORC documentation. The partial move from JITTargetAddress to ExecutorAddr in 8b1771bd9f30 did not update the ORC or Kaleidoscope documents. This patch fixes the inconsistency. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D150458	2023-05-31 15:15:55 -07:00
Joseph Huber	349c0aacb3	[OpenMP] Remove 'keep_alive' functionality from the device RTL The OpenMP DeviceRTL uses a hacky workaround to keep certain runtime calls alive. This used a function that prevented them from being optimized out. We needed this hack because the 'OpenMPOpt' pass likes to introduce new runtime calls into the TU. This then interacted badly with the method of linking the bitcode file per-TU like we do with Nvidia. The OpenMPOpt pass would then generate a runtime call to a function that was never linked in. This should not be a problem anymore because we unconditionally link in the `libomptarget.devicertl.a` runtime library. This should thus only extract symbols that are undefined. So, if we do end up with an unresolved reference it will be resolved by the static library. The downside to this is that if we are doing non-LTO NVPTX compilation that introduces one of these calls it will be linked outside the module and therefore provide the overhead of an external function call. However, removing this flag should make optimizing things easier. We will need to see if that performance is a problem. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D151324	2023-05-31 17:12:43 -05:00
Jan Svoboda	54e47724cf	[clang] NFCI: Use `DirectoryEntryRef` in `PrecompiledPreamble` This removes some uses of the deprecated `DirectoryEntry::getName()`.	2023-05-31 14:54:42 -07:00
Jan Svoboda	6587d9d87a	[clang] NFCI: Use `DirectoryEntryRef` for `ModuleMap::BuiltinIncludeDir` This removes some uses of the deprecated `DirectoryEntry::getName()`.	2023-05-31 14:54:42 -07:00
Jan Svoboda	20edfaeef7	[clang] NFCI: Use `DirectoryEntryRef` in `ASTWriter` This removes the call to deprecated `DirectoryEntry::getName()`.	2023-05-31 14:54:42 -07:00
Adrian Prantl	7de43526e3	HostInfoMacOS: Add a utility function for finding an SDK-specific tool This is an API needed by swift-lldb. https://reviews.llvm.org/D151591	2023-05-31 14:46:35 -07:00
Adrian Prantl	a5e9f2c81e	Factor out xcrun into a function (NFC) https://reviews.llvm.org/D151588	2023-05-31 14:46:35 -07:00
Peter Klausler	f513bd8088	[flang] CUDA Fortran - part 4/5: definability and characteristics Extend the definability and procedure characteristics checking infrastructure in semantics to check for context-dependent CUDA object definability violations and problems with CUDA attribute incompatibility in procedure interfaces. Depends on https://reviews.llvm.org/D150159, https://reviews.llvm.org/D150161, & https://reviews.llvm.org/D150162. Differential Revision: https://reviews.llvm.org/D150163	2023-05-31 14:25:38 -07:00


463140 Commits