llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2025-02-12 04:43:48 +00:00

Author	SHA1	Message	Date
Arthur Eubanks	ff4fcbb5f4	[test] Add test for null_pointer_is_valid and Inliner instsimplify interaction As requested in D151254 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153435	2023-06-21 14:00:53 -07:00
Vitaly Buka	96928abb4d	[NFC][sanitizer] Pass user region into OnMapSecondary	2023-06-21 13:50:42 -07:00
Vitaly Buka	38dfcf96df	[NFC][sanitizer] Add OnMapSecondary callback Now it is implemented as OnMap everywhere, but in follow-up patches we can optimize the Asan handler.	2023-06-21 13:33:41 -07:00
Florian Hahn	04a7c672ab	[PhaseOrdering] Add test showing mis-compile caused by 17fdaccccf. The test shows a mis-compile where @test gets incorrectly simplified to unreachable. The test case is reduced from a ThinLTO build of Clang, with only the relevant pass sequence included.	2023-06-21 21:15:14 +01:00
Vitaly Buka	42adbb1b2d	[NFC][sanitizer] Remove MapUnmapCallback from sanitizer_flat_map.h It's used by test only to test "test-only" code.	2023-06-21 13:14:30 -07:00
Shubham Sandeep Rastogi	e734a12b60	Emit DW_LLE_base_address + DW_LLE_offset_pair for DWARF v5 This patch tries to reduce the size of the debug_loclist section by replacing the DW_LLE_start_length opcodes currently emitted by dsymutil in favor of using DW_LLE_base_address + DW_LLE_offset_pair instead. The DW_LLE_start_length is one AddressSize followed by a ULEB per entry, whereas, the DW_LLE_base_address + DW_LLE_offset_pair will use one AddressSize for the base address, and then the DW_LLE_offset_pair is a pair of ULEBs. This will be more efficient where a loclist fragment has many entries. Differential Revision: https://reviews.llvm.org/D153080	2023-06-21 12:43:29 -07:00
Guozhi Wei	1bcb6a3da2	[MBP] Enable duplicating return block to remove jump to return Sometimes LLVM generates a branch to a return instruction, as in PR63227. This is because in function MachineBlockPlacement::canTailDuplicateUnplacedPreds we avoid duplicating a BB into another already placed BB to prevent destroying the computed layout. But if the successor BB is a return block, duplicating it will only reduce taken branches without hurting any other branches. Differential Revision: https://reviews.llvm.org/D153093	2023-06-21 18:54:31 +00:00
Vitaly Buka	c172210492	[NFC][asan] Move AsanStats update Deallocate is a more appropriate place to update free count.	2023-06-21 11:50:45 -07:00
LLVM GN Syncbot	d036bf0711	[gn build] Port 1ee4d880e876	2023-06-21 18:41:47 +00:00
Tim Besard	1ee4d880e8	NVPTX: Lower unreachable to exit to allow ptxas to accurately reconstruct the CFG. PTX does not have a notion of `unreachable`, which results in emitted basic blocks having an edge to the next block: ``` block1: call @does_not_return(); // unreachable block2: // ptxas will create a CFG edge from block1 to block2 ``` This may result in significant changes to the control flow graph, e.g., when LLVM moves unreachable blocks to the end of the function. That's a problem in the context of divergent control flow, as `ptxas` uses the CFG to determine divergent regions, while some instructions may not be executed divergently. For example, `bar.sync` is not allowed to be executed divergently on Pascal or earlier. If we start with the following: ``` entry: // start of divergent region @%p0 bra cont; @%p1 bra unlikely; ... bra.uni cont; unlikely: ... // unreachable cont: // end of divergent region bar.sync 0; bra.uni exit; exit: ret; ``` it is transformed by the branch-folder and block-placement passes to: ``` entry: // start of divergent region @%p0 bra cont; @%p1 bra unlikely; ... bra.uni cont; cont: bar.sync 0; bra.uni exit; unlikely: ... // unreachable exit: // end of divergent region ret; ``` After moving the `unlikely` block to the end of the function, it has an edge to the `exit` block, which widens the divergent region and makes the `bar.sync` instruction happen divergently. That causes wrong computations, as we've been running into for years with Julia code (which emits a lot of `trap` + `unreachable` code all over the place). To work around this, add an `exit` instruction before every `unreachable`, as `ptxas` understands that exit terminates the CFG. Note that `trap` is not equivalent, and only future versions of `ptxas` will model it like `exit`. Another alternative would be to emit a branch to the block itself, but emitting `exit` seems like a cleaner solution to represent `unreachable` to me. 
Also note that this may not be sufficient, as it's possible that the block with unreachable control flow is branched to from different divergent regions, e.g. after block merging, in which case it may still be the case that `ptxas` could reconstruct a CFG where divergent regions are merged (I haven't confirmed this, but also haven't encountered this pattern in the wild yet): ``` entry: // start of divergent region 1 @%p0 bra cont1; @%p1 bra unlikely; bra.uni cont1; cont1: // intended end of divergent region 1 bar.sync 0; // start of divergent region 2 @%p2 bra cont2; @%p3 bra unlikely; bra.uni cont2; cont2: // intended end of divergent region 2 bra.uni exit; unlikely: ... exit; exit: // possible end of merged divergent region? ``` I originally tried to avoid the above by cloning paths towards `unreachable` and splitting the outgoing edges, but that quickly became too complicated. I propose we go with the simple solution first, also because modern GPUs with more flexible hardware thread schedulers don't even suffer from this issue. Finally, although I expect this to fix most of https://bugs.llvm.org/show_bug.cgi?id=27738, I do still encounter miscompilations with Julia's unreachable-heavy code when targeting these older GPUs using an older `ptxas` version (specifically, from CUDA 11.4 or below). This is likely due to related bugs in `ptxas` which have been fixed since, as I have filed several reproducers with NVIDIA over the past couple of years. I'm not inclined to look into fixing those issues over here, and will instead be recommending our users to upgrade CUDA to 11.5+ when using these GPUs. Also see: - https://github.com/JuliaGPU/CUDAnative.jl/issues/4 - https://github.com/JuliaGPU/CUDA.jl/issues/1746 - https://discourse.llvm.org/t/llvm-reordering-blocks-breaks-ptxas-divergence-analysis/71126 Reviewed By: jdoerfert, tra Differential Revision: https://reviews.llvm.org/D152789	2023-06-21 11:40:31 -07:00
Med Ismail Bennani	0c5b632071	[lldb] Fix failure in TestStackCoreScriptedProcess on x86_64 This patch should address the failure of TestStackCoreScriptedProcess that is happening specifically on x86_64. It turns out that in 1370a1cb5b97, I changed the way we extract integers from a `StructuredData::Dictionary` and in order to get a stop info from the scripted process, we call a method that returns a `SBStructuredData` containing the stop reason data. `TestStackCoreScriptedProcess` was failing specifically on x86_64 because the stop info dictionary contains the signal number, that the `Scripted Thread` was trying to extract as a signed integer where it was actually parsed as an unsigned integer. That caused `GetValueForKeyAsInteger` to return the default value parameter, `LLDB_INVALID_SIGNAL_NUMBER`. This patch addresses the issue by extracting the signal number with the appropriate type and re-enables the test. Differential Revision: https://reviews.llvm.org/D152848 Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>	2023-06-21 10:57:13 -07:00
cynecx	63538a0879	[MC] Add .pushsection/.popsection support to COFFAsmParser The COFFAsmParser (to my surprise) didn't support the .pushsection and .popsection directives. These directives aren't directly useful, however for frontends that have inline asm support this is really useful. Rust in particular, has support for inline asm, which can be used together with these directives to "emulate" features like static generics. This patch adds support for the two mentioned directives. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D152085	2023-06-21 10:39:56 -07:00
Felipe de Azevedo Piovezan	1704c8d104	[lldb][MachO] Fix section type recognition for new DWARF 5 sections When LLDB needs to access a debug section, it generally calls SectionList::FindSectionByType with the corresponding type (we have one type for each DWARF section). However, the missing entries made some sections be classified as "eSectionTypeOther", which makes all calls to `FindSectionByType` fail. With this patch, a check-lldb build with `-DLLDB_TEST_USER_ARGS=--dwarf-version=5` reports a much lower number of failures: Unsupported : 327 Passed : 2423 Expectedly Failed: 16 Unresolved : 2 Failed : 52 This is down from ~400 failures previously. Differential Revision: https://reviews.llvm.org/D153433	2023-06-21 13:25:10 -04:00
Stella Laurenzo	54db162429	Revert "Define/guard MLIR_STANDALONE_BUILD LLVM_LIBRARY_OUTPUT_INTDIR var." This reverts commit f55fd19b6b565827af5fbf504952dcc35b8b7360. As noted on the original thread, other uses of LLVM_LIBRARY_OUTPUT_INTDIR are optional. Will make a separate patch that makes this use optional as well.	2023-06-21 10:20:35 -07:00
Alex Langford	b4827a3c0a	[lldb][NFCI] Remove ConstString from GDBRemoteCommunicationClient::ConfigureRemoteStructuredData ConstString's benefits are not being utilized here, StringRef is sufficient. Differential Revision: https://reviews.llvm.org/D153177	2023-06-21 10:17:24 -07:00
Tom Eccles	74adc3e0eb	[flang][hlfir] fix missing conversion in transpose simplification It seems just replacing the operation was not replacing all of the uses when the types of the expression before and after this pass differ (due to differing shape information). Now the shape information is always kept the same. This fixes https://github.com/llvm/llvm-project/issues/63399 Differential Revision: https://reviews.llvm.org/D153333	2023-06-21 16:54:58 +00:00
Lorenzo Chelini	9d796d05a1	[MLIR][Linalg] Rename `tile-to-foreach-thread.mlir` (NFC) `ForeachThreadOp` was renamed to `ForallOp`, update the filename to avoid confusion. See: https://reviews.llvm.org/D144242	2023-06-21 18:44:24 +02:00
Adam Paszke	9816cc916f	Fix a memory leak in the Python implementation of bytecode writer The bytecode writer config was heap-allocated, but was never freed, causing ASAN errors. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D153440	2023-06-21 09:40:51 -07:00
Joseph Huber	e0b487bfc0	[libc] Rename and install the RPC server interface This patch prepares the RPC interface to be installed. We place this in the existing `llvm-gpu-none` directory as it will also give us access to the generated `libc` headers for the opcodes. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D153040	2023-06-21 11:26:24 -05:00
Luke Lau	485d25007a	[RISCV] Custom lower fixed vector undef to scalable undef This avoids undefs from being expanded to a build vector of zeroes. As noted by @craig.topper in D153399 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153411	2023-06-21 17:14:57 +01:00
Joseph Huber	4272d09196	[libc][NFC] Cleanup the RPC server implementation prior to installing This does some simple cleanup prior to landing the patch to install these. Differential Revision: https://reviews.llvm.org/D153439	2023-06-21 11:14:20 -05:00
Petr Hosek	037952f6d4	[libcxx] Include <sys/time.h> in posix_compat.h posix_compat.h uses struct timeval which is defined in <sys/time.h> but it doesn't include it. On most POSIX platforms like Linux or macOS, that header is transitively included by other headers like <sys/stat.h>, but there are other platforms where this is not the case. Differential Revision: https://reviews.llvm.org/D153384	2023-06-21 16:10:01 +00:00
Craig Topper	aae155c50b	[RISCV] Use a build_vector instead of a chain of insert_vector_elts for vXi1 build_vector lowering. A build_vector is the canonical representation rather than multiple insert_vector_elts. Unfortunately, this regresses quite a few tests now primarily due to not having a vmv.s.x special case, but I hope we can improve this with future patches. Stress testing in our downstream found an infinite loop in DAG combine. This patch breaks the infinite loop. The insert_vector_element chain starts with a fixed vector undef. Fixed vector undef is currently expanded to a build_vector of 0s which gets lowered to a vmv.v.i. The insert chain overwrites all elements so SimplifyDemandedVectorElts turns the vmv.v.i back into undef and the cycle repeats. We probably should custom lower fixed vector undef to scalable vector undef. I think that would also fix the infinite loop, but I didn't test that. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D153399	2023-06-21 08:57:46 -07:00
Fangrui Song	7e334ac0ec	[llvm-objdump][test] Add 2 symbols to adjust-vma.test They will demonstrate some symbols that --adjust-vma= should not adjust. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D153401	2023-06-21 08:52:53 -07:00
Craig Topper	ddf3f1b3b2	[RISCV] Stop isInterleaveShuffle from producing illegal extract_subvectors. The definition for ISD::EXTRACT_SUBVECTOR says the index must be aligned to the known minimum elements of the extracted type. We mostly got away with this but it turns out there are places that depend on this. For example, this code in getNode for ISD::EXTRACT_SUBVECTOR ``` // EXTRACT_SUBVECTOR of CONCAT_VECTOR can be simplified if the pieces of // the concat have the same type as the extract. if (N1.getOpcode() == ISD::CONCAT_VECTORS && N1.getNumOperands() > 0 && VT == N1.getOperand(0).getValueType()) { unsigned Factor = VT.getVectorMinNumElements(); return N1.getOperand(N2C->getZExtValue() / Factor); } ``` This depends on N2C->getZExtValue() being evenly divisible by Factor. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D153380	2023-06-21 08:52:28 -07:00
Lang Hames	13f5125f9d	[tutorials] Add missing ExecutorSymbolDef header. Similar to c118d05f9ed, but applied to the base Kaleidoscope series.	2023-06-21 08:51:17 -07:00
Mehdi Amini	7b4ea67f31	Revert "[mlir][CRunnerUtils] Use explicit execution engine symbol registration." This reverts commit 9119325a5666e557a19f38a05525578b556c215b. A buildbot is broken, probably because of this change breaking the SHARED_LIBS=ON build more.	2023-06-21 17:50:18 +02:00
Alexander Kornienko	c96c85aba2	Revert "[LoopSink] Allow sinking to PHI-use" This reverts commit 54711a6a5872d5f97da4c0a1bd7e58d0546ca701. The commit is causing a clang crash: https://reviews.llvm.org/D152772#4437254	2023-06-21 17:37:11 +02:00
Lang Hames	c118d05f9e	[tutorials][BuildingAJIT] Add missing ExecutorSymbolDef header.	2023-06-21 08:35:37 -07:00
Jannik Silvanus	9741ac5b3b	[clang-format] vim integration: Mention python3 variant of bindings The instructions in the documentation only mentioned how to include bindings for clang-format into vim using python2. Add the instructions for python3 which were already present in the source comments. Differential Revision: https://reviews.llvm.org/D153338 Change-Id: I25fdbd36f0c7e745061908be8e26f68cb31c7dd5	2023-06-21 17:18:36 +02:00
luxufan	7fc0efd0dc	[InstCombine] Add !noundef to match behavior of violating assume The behaviors of violating an assume instruction and violating !nonnull metadata are different. The former is immediate undefined behavior, but the latter returns a poison value. This patch adds !noundef to trigger immediate undefined behavior if !nonnull is violated. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153400	2023-06-21 23:17:57 +08:00
Luke Lau	438cc10b8e	[IR] Add getAccessType to Instruction There are multiple places in the code where the type of memory being accessed from an instruction needs to be obtained, including an upcoming patch to improve GEP cost modeling. This deduplicates the logic between them. It's not strictly NFC as EarlyCSE/LoopStrengthReduce may catch more intrinsics now. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D150583	2023-06-21 16:17:25 +01:00
Amilendra Kodithuwakku	a685ddf1d1	Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support" This reverts commit c4fea3905617af89d1ad87319893e250f5b72dd6. I am reverting this for now until I figure out how to fix the build bot errors and warnings. Errors: llvm-project/lld/ELF/Arch/ARM.cpp:1300:29: error: expected primary-expression before ‘>’ token osec->writeHeaderTo<ELFT>(++sHdrs); Warnings: llvm-project/lld/ELF/Arch/ARM.cpp:1306:31: warning: left operand of comma operator has no effect [-Wunused-value]	2023-06-21 16:13:44 +01:00
Nikita Popov	565c7525b9	[X86] Add test for PR63430 (NFC)	2023-06-21 17:12:57 +02:00
Matt Arsenault	6e8911e4c6	RISCV: Update test	2023-06-21 11:08:57 -04:00
Qihan Cai	e219dd88d1	[RISCV] Add support for XCVmac extension in CV32E40P Implement XCVmac intrinsics for CV32E40P according to the specification. This is the first commit of a patch-set to upstream the 7 vendor specific extensions of CV32E40P. The patch-set aims at upstreaming the extensions on MC. The following will be on CodeGen, and the final patch-set will be on builtins if possible. The implemented version is on [0]. Contributors: @CharKeaney, Serkan Muhcu, @jeremybennett, @lewis-revill, @liaolucy, @simoncook, @xmj Spec: `62bec66b36/docs/source/instruction_set_extensions.rst` [0] https://github.com/openhwgroup/corev-llvm-project Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D152821	2023-06-21 23:08:49 +08:00
luxufan	7b80a322ab	[CVP] Don't process sext or ashr if value state includes undef Similar to D152773. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D152774	2023-06-21 22:50:56 +08:00
Joseph Huber	1f99526d9d	[libc][NFC] Move `__has_builtin` to `LIBC_HAS_BUILTIN` Summary: These should use the common `LIBC_HAS_BUILTIN` even if we will only compile this with `clang`.	2023-06-21 09:50:40 -05:00
Matt Arsenault	bb8649691d	X86: Fix asserts only test This test should really check the MIR result rather than rely on the debug output.	2023-06-21 10:40:48 -04:00
Matt Arsenault	80e2c26dfd	RegisterCoalescer: Fix name of pass I finally snapped and fixed this inconsistency.	2023-06-21 10:30:43 -04:00
Amilendra Kodithuwakku	c4fea39056	[LLD][ELF] Cortex-M Security Extensions (CMSE) Support This commit provides linker support for Cortex-M Security Extensions (CMSE). The specification for this feature can be found in ARM v8-M Security Extensions: Requirements on Development Tools. The linker synthesizes a security gateway veneer in a special section; `.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`, defined relative to the same text section and having the same address. The address of `<entry>` is retargeted to the starting address of the linker-synthesized security gateway veneer in section `.gnu.sgstubs`. In summary, the linker translates input: ``` .text entry: __acle_se_entry: [entry_code] ``` into: ``` .section .gnu.sgstubs entry: SG B.W __acle_se_entry .text __acle_se_entry: [entry_code] ``` If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker considers that `<entry>` already defines a secure gateway veneer so does not synthesize one. If `--out-implib=<out.lib>` is specified, the linker writes the list of secure gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway veneer `<entry>` at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with value `<addr>`. If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import library `<in.lib>` and preserves the entry function addresses in the resulting executable and new import library. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D139092	2023-06-21 14:47:34 +01:00
Louis Dionne	afc5cca0d4	[libc++] Get rid of _LIBCPP_DISABLE_NEW_DELETE_DEFINITIONS Whether we include operator new and delete into libc++ has always been a build time setting, and piggy-backing on a macro like _LIBCPP_DISABLE_NEW_DELETE_DEFINITIONS is inconsistent with how we handle similar cases for e.g. LIBCXX_ENABLE_RANDOM_DEVICE. Instead, simply avoid including new.cpp in the sources of the library when we do not wish to include these operators in the build. This also makes us much closer to being able to share the definitions between libc++ and libc++abi, since we could technically build those definitions into a standalone static library and decide whether we link it into libc++abi.dylib or libc++.dylib. Differential Revision: https://reviews.llvm.org/D153272	2023-06-21 09:01:24 -04:00
Jay Foad	0b8a2eaf62	[AMDGPU] Add some positive tests for merging S_LOAD instructions	2023-06-21 13:56:03 +01:00
Kai Nacke	6e04287183	[SystemZ] Fix regression in test macro-prefix-map-lambda.cpp The failing test comes from https://reviews.llvm.org/D152570. Root cause of the failure is that a string constant on SystemZ has an alignment of 2, not 1. The CSKY target has a similar problem. The solution is to replace the fixed number with a regex. Reviewed By: uweigand, tuliom, Zibi Differential Revision: https://reviews.llvm.org/D153352	2023-06-21 12:53:15 +00:00
Guillaume Chatelet	bd1cba9f4f	Revert D148717 "[libc] Improve memcmp latency and codegen" Once integrated in our codebase the patch triggered a bunch of failing tests. We do not yet understand where the bug is but we revert it to move forward with integration. This reverts commit 5e32765c15ab8df3d2635a2bb5078c5b1d5714d5.	2023-06-21 12:37:14 +00:00
Louis Dionne	9ff36c24a0	[libc++] Guard terminate_successful with TEST_HAS_NO_EXCEPTIONS This one is a bit twisted. Some platforms don't have support for exiting in a clean manner, so they don't provide std::exit(). As a result, defining `terminate_successful()` on those platforms won't work, and the PSTL tests that rely on `terminate_successful()` also won't work. However, we don't have a notion of "no clean termination" in libc++, so we can't properly guard this. Since embedded platforms that don't support clean termination usually also don't enable exceptions, we don't need to be able to run those `terminate_successful` PSTL tests, and guarding the definition of `terminate_successful` with TEST_HAS_NO_EXCEPTIONS works pretty well. This is kind of a hack for the lack of having a concept of "no clean termination" in the library and in the test suite. Differential Revision: https://reviews.llvm.org/D153302	2023-06-21 08:34:51 -04:00
Christian Sigg	699e64c0d9	Revert "[Bazel][mlir] Fix ODR violation introduced in 7ab749c." This reverts commit e83c8c36005f0068841e628612e9e5bce7e2ac9e. Depending only on the support header files is not sufficient.	2023-06-21 14:29:44 +02:00
Pravin Jagtap	8e1e871e2f	[AMDGPU] Preserve dom-tree analysis in atomic optimizer. AMDGPUAtomicOptimizer updates the dominator tree whenever it modifies the control flow. Therefore, preserve the analysis, similar to the legacy PM. Reviewed By: arsenm, yassingh, #amdgpu Differential Revision: https://reviews.llvm.org/D153349	2023-06-21 08:02:43 -04:00
Kiran Chandramohan	10e3ed9919	[Flang][Debug] NFC: Correct the REQUIRES line to use system-linux Reviewed By: kkwli0 Differential Revision: https://reviews.llvm.org/D153126	2023-06-21 12:11:03 +01:00
Jay Foad	c68c6c56fc	[AMDGPU] Minor refactoring in SILoadStoreOptimizer::offsetsCanBeCombined	2023-06-21 12:05:47 +01:00


465350 Commits