llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-11-23 13:50:11 +00:00

Author	SHA1	Message	Date
Jie Fu	904cf66ec1	[lldb] Fix build error in lldb-dap.cpp (NFC) llvm-project/lldb/tools/lldb-dap/lldb-dap.cpp:679:5: error: unknown type name 'kkkk' kkkk response["success"] = false; ^	2024-01-20 08:09:20 +08:00
Walter Erquinigo	8bef2f27a0	[lldb-dap] Add a CMake variable for defining a welcome message (#78811 ) lldb-dap instances managed by other extensions benefit from having a welcome message with, for example, a basic user guide or a troubleshooting message. This PR adds a cmake variable for defining such message in a simple way. This message appears upon initialization but before initCommands are executed, as they might cause a failure and prevent the message from being displayed.	2024-01-19 18:55:40 -05:00
Xiangxi Guo (Ryan)	c17aa14f4c	[mlir][index] Fold `cmp(x, x)` when `x` isn't a constant (#78812 ) Such cases show up in the middle of optimizations passes, e.g., after some rewrites and then CSE. The current folder can fold such cases when the inputs are constant; this patch improves it to fold even if the inputs are non-constant.	2024-01-19 15:54:33 -08:00
Petr Hosek	b86d02375e	[libc] Redo the install targets (#78795 ) Prior to this change, we wouldn't build headers that aren't referenced by other parts of the libc which would result in a build error during installation. To address this, we make the header target a dependency of the libc archive. Additionally, we also redo the install targets, moving the install targets closer to build targets and simplifying the hierarchy and generally matching what we do for other runtimes.	2024-01-19 15:45:22 -08:00
erman-gurses	b7360fbe8c	[mlir][amdgpu] Shared memory access optimization pass (#75627 ) It implements transformation to optimize accesses to shared memory. Reference: https://reviews.llvm.org/D127457 _This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by newColIdx = col % vecSize + perm[row](col / vecSize, row) where perm is a permutation function indexed by row and vecSize is the vector access size in elements (currently assumes 128bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory._	2024-01-19 15:44:45 -08:00
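The index rewrite quoted in the commit above, `newColIdx = col % vecSize + perm[row](col / vecSize, row)`, can be illustrated with a small sketch. The commit does not say which permutation is used, so the XOR-based `perm` below, the `row_width` parameter, and the function name `swizzle_col` are all assumptions made for illustration only. The sketch also assumes that `perm` returns a permuted vector index that is then scaled by the vector size.

```python
def swizzle_col(row: int, col: int, vec_size: int, row_width: int) -> int:
    """Return a permuted last-dimension index for element (row, col).

    Hypothetical permutation: XOR the vector index with the row number.
    This is an assumed choice, not necessarily what the pass implements.
    """
    num_vecs = row_width // vec_size
    # XOR only stays inside [0, num_vecs) when num_vecs is a power of two.
    assert num_vecs > 0 and num_vecs & (num_vecs - 1) == 0
    vec_idx = col // vec_size                  # which vector within the row
    permuted_vec = vec_idx ^ (row % num_vecs)  # assumed perm[row](vec_idx, row)
    # Keep the offset inside the vector, move the whole vector elsewhere.
    return col % vec_size + permuted_vec * vec_size


if __name__ == "__main__":
    # Rows of 32 elements accessed as vectors of 4 elements.
    for r in range(4):
        print(r, [swizzle_col(r, c, 4, 32) for c in range(0, 32, 4)])
```

Because each row applies a different permutation to its vectors, the same logical column in consecutive rows lands at different physical columns, which is the property that helps avoid bank conflicts.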
spupyrev	30aa9fb4c1	Revert "[InstrProf] Adding utility weights to BalancedPartitioning (#72717 )" This reverts commit `5954b9dca2` due to broken Windows build	2024-01-19 15:13:47 -08:00
Jonas Devlieghere	593395f0da	[dsymutil] Fix spurious warnings in MachODebugMapParser (#78794 ) When the MachODebugMapParser encounters an object file that cannot be found on disk, it currently leaves the parser in an incoherent state, resulting in spurious warnings that can in turn slow down dsymutil. This fixes #78411. rdar://117515153	2024-01-19 15:10:57 -08:00
Craig Topper	9396891271	[RISCV] Don't look for sext in RISCVCodeGenPrepare::visitAnd. We want to know the upper 33 bits of the And Input are zero. SExt only guarantees they are the same. We originally checked for SExt or ZExt when we were using isImpliedByDomCondition because a ZExt may have been changed to SExt before we visited the And. We are no longer using isImpliedByDomCondition so we can only look for zext with the nneg flag. While here, switch to PatternMatch to simplify the code. Fixes #78783	2024-01-19 14:44:47 -08:00
Craig Topper	66cea7143a	[RISCV] Add test case for #78783 . NFC	2024-01-19 14:44:47 -08:00
Sam Clegg	39e024d9e2	[lld][WebAssembly] Use the archive offset with --whole-archive (#78791 ) This essentially ports `0b1413a8` from the ELF linker.	2024-01-19 14:42:03 -08:00
Aiden Grossman	f9bc1ee3fc	[llvm-objdump] Add support for symbolizing PGOBBAddrMap Info (#76386 ) This patch adds in support for symbolizing PGO information contained within the SHT_LLVM_BB_ADDR_MAP section in llvm-objdump. The outputs are simply the raw values contained within the section.	2024-01-19 14:28:31 -08:00
Arthur Eubanks	86eaf6083b	[X86] Refine X86DAGToDAGISel::isSExtAbsoluteSymbolRef() (#76191 ) We just need to check if the global is large or not. In the kernel code model, globals are in the negative 2GB of the address space, so globals can be a sign extended 32-bit immediate. In other code models, small globals are in the low 2GB of the address space, so sign extending them is equivalent to zero extending them.	2024-01-19 14:11:18 -08:00
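The address-range argument in the commit above can be checked with a little integer arithmetic. This is only a sketch of the reasoning, not the LLVM code: the helper names are hypothetical, and 64-bit addresses are modelled as plain Python integers.

```python
MASK64 = (1 << 64) - 1


def sext32_to_64(imm: int) -> int:
    """Sign-extend the low 32 bits of imm to a 64-bit value."""
    imm &= 0xFFFFFFFF
    if imm & 0x80000000:
        return (imm - (1 << 32)) & MASK64
    return imm


def zext32_to_64(imm: int) -> int:
    """Zero-extend the low 32 bits of imm to a 64-bit value."""
    return imm & 0xFFFFFFFF


def fits_sext32(addr: int) -> bool:
    """True if addr can be encoded as a sign-extended 32-bit immediate."""
    return sext32_to_64(addr) == (addr & MASK64)


if __name__ == "__main__":
    low = 0x7FFF_0000                # inside the low 2GB
    kernel = 0xFFFF_FFFF_8000_1000   # inside the negative 2GB
    print(fits_sext32(low), sext32_to_64(low) == zext32_to_64(low))
    print(fits_sext32(kernel), sext32_to_64(kernel) == zext32_to_64(kernel))
```

For an address in the low 2GB both extensions give the same value, while an address in the negative 2GB is only recovered by sign extension.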
bd1976bris	cd05ade13a	Add a "don't override" mapping for -fvisibility-from-dllstorageclass (#74629 ) `-fvisibility-from-dllstorageclass` allows for overriding the visibility of globals from their DLL storage class. The visibility to apply can be customised for the different classes of globals via a set of dependent options that specify the mapping values: - `-fvisibility-dllexport=<value>` - `-fvisibility-nodllstorageclass=<value>` - `-fvisibility-externs-dllimport=<value>` - `-fvisibility-externs-nodllstorageclass=<value>` Currently, one of the existing LLVM visibilities, `hidden`, `protected`, `default`, can be used as a mapping value. This change adds a new mapping value: `keep`, which specifies that the visibility should not be overridden for that class of globals. The behaviour of `-fvisibility-from-dllstorageclass` is otherwise unchanged and existing uses of this set of options will be unaffected. The background to this change is that currently the PS4 and PS5 compilers effectively ignore visibility - dllimport/export is the supported method for export control in C/C++ source code. Now, we would like to support visibility attributes and options in our frontend, in addition to dllimport/export. To support this, we will override the visibility of globals with explicit dllimport/export annotations but use the `keep` setting for globals which do not have an explicit dllimport/export. There are also some minor improvements to the existing options: - Make the `LANGOPS` `BENIGN` as they don't involve the AST. - Correct/clarify the help text for the options.	2024-01-19 21:57:40 +00:00
Sam Clegg	2bfa5ca927	[lld][WebAssembly] Reset context object after each link (#78770 ) This mirrors how the ELF linker works. I wasn't able to find anywhere where this is currently tested. Followup to #78640, which triggered a regression.	2024-01-19 13:51:35 -08:00
Konstantin Varlamov	58780b811c	[libc++][hardening] In production hardening modes, trap rather than abort (#78561 ) In the hardening modes that can be used in production (`fast` and `extensive`), make a failed assertion invoke a trap instruction rather than calling verbose abort. In the debug mode, still keep calling verbose abort to provide a better user experience and to allow us to keep our existing testing infrastructure for verifying assertion messages. Since the debug mode by definition enables all assertions, we can be sure that we still check all the assertion messages in the library when running the test suite in the debug mode. The main motivation to use trapping in production is to achieve better code generation and reduce the binary size penalty. This way, the assertion handler can compile to a single instruction, whereas the existing mechanism with verbose abort results in generating a function call that in general cannot be optimized away (made worse by the fact that it's a variadic function, imposing an additional penalty). See the [RFC](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925) for more details. Note that this mechanism can now be completely [overridden at CMake configuration time](https://github.com/llvm/llvm-project/pull/77883). This patch also significantly refactors `check_assertion.h` and expands its test coverage. The main changes: - when overriding `verbose_abort`, don't do matching inside the function -- just print the error message to `stderr`. This removes the need to set a global matcher and allows to do matching in the parent process after the child finishes; - remove unused logic for matching source locations and for using wildcards; - make matchers simple functors; - introduce `DeathTestResult` that keeps data about the test run, primarily to make it easier to test. In addition to the refactoring, `check_assertion.h` can now recognize when a process exits due to a trap.	2024-01-19 13:48:13 -08:00
Danila Malyutin	0388ab3e29	[Statepoint][NFC] Use uint16_t and add an assert (#78717 ) Use a fixed width integer type and assert that DwarfRegNum fits in 16 bits. This is a follow up to review comments on #78600.	2024-01-20 00:44:00 +03:00
spupyrev	5954b9dca2	[InstrProf] Adding utility weights to BalancedPartitioning (#72717 ) Adding weights to utility nodes in BP so that we can give more importance to certain utilities. This is useful when we optimize several objectives jointly.	2024-01-19 13:36:59 -08:00
Eric Miotto	9175dd9cbc	[CMake] Detect properly new linker introduced in Xcode 15 (#77806 ) As explained in [1], this linker is functionally equivalent to the classic one (`ld64`) for build system purposes -- in particular to enable the use of order files to link `clang`. For this reason, in addition to fixing the detection rename `LLVM_LINKER_IS_LD64` to `LLVM_LINKER_IS_APPLE` to make the result of such detection more clear -- this should not cause any issue to downstream users, from a quick search in SourceGraph [2], only Swift uses the value of this variable (which I will take care of updating in due time). [1]: https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Linking [2]: https://sourcegraph.com/search?q=context:global+LLVM_LINKER_IS_LD64+lang:cmake+fork:no+-file:AddLLVM.cmake+-file:clang/tools/driver/CMakeLists.txt&patternType=standard&sm=1&groupBy=repo rdar://120740222	2024-01-19 16:32:32 -05:00
Pranav Kant	4482fd846a	Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636 )" This reverts commit `4d11f04b20`. This breaks some programs as mentioned in #78636	2024-01-19 21:02:20 +00:00
Sam Clegg	f5e58a0380	[lld][ELF] Simplify handleLibcall. NFC (#78659 ) I noticed this while working on #78658	2024-01-19 12:39:35 -08:00
Nick Desaulniers	2c0d20668a	[libc] remove extra -Werror (#78761 ) -Werror is now a global default as of commit `c52b467875` ("Reapply "[libc] build with -Werror (#73966)" (#74506)")	2024-01-19 12:28:06 -08:00
Aaron Ballman	89592061a4	Remove an unused API; NFC Not only is this unused, it's really confusing having getAPValueResult() and getResultAsAPValue() as sibling APIs	2024-01-19 15:15:24 -05:00
Craig Topper	9ae28fb9d3	[RISCV] Prevent RISCVMergeBaseOffsetOpt from calling getVRegDef on a physical register. (#78762 ) Fixes #78679.	2024-01-19 12:15:08 -08:00
Mital Ashok	924701311a	[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768 ) Closes #77638, #24186 Rebased from <https://reviews.llvm.org/D156032>, see there for more information. Implements wording change in [CWG2137](https://wg21.link/CWG2137) in the first commit. This also implements an approach to [CWG2311](https://wg21.link/CWG2311) in the second commit, because too much code that relies on `T{ T_prvalue}` being an elision would break. Because that issue is still open and the CWG issue doesn't provide wording to fix the issue, there may be different behaviours on other compilers.	2024-01-19 21:10:51 +01:00
Aiden Grossman	2b31a673de	[llvm-exegesis] Make duplicate snippet repetitor produce whole snippets (#77224 ) Currently, the duplicate snippet repetitor will truncate snippets that do not exactly divide the minimum number of instructions. This patch corrects that behavior by making the duplicate snippet repetitor duplicate the snippet in its entirety until the minimum number of instructions has been reached. This makes the behavior consistent with the loop snippet repetitor, which will execute at least `--num-repetitions` (soon to be renamed `--min-instructions`) instructions.	2024-01-19 11:34:16 -08:00
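The behavior change described in the commit above can be sketched as follows. This is a simplified model rather than llvm-exegesis code: a snippet is represented as a list of instruction strings, and both function names are hypothetical.

```python
import math
from typing import List


def duplicate_truncating(snippet: List[str], min_instructions: int) -> List[str]:
    """Model of the old behavior: stop at exactly min_instructions,
    cutting the last copy of the snippet short if needed."""
    if not snippet:
        raise ValueError("snippet must contain at least one instruction")
    copies = math.ceil(min_instructions / len(snippet))
    return (snippet * copies)[:min_instructions]


def duplicate_whole(snippet: List[str], min_instructions: int) -> List[str]:
    """Model of the new behavior: repeat the entire snippet until at
    least min_instructions instructions have been emitted."""
    if not snippet:
        raise ValueError("snippet must contain at least one instruction")
    copies = math.ceil(min_instructions / len(snippet))
    return snippet * copies


if __name__ == "__main__":
    snippet = ["add", "mul", "sub"]
    print(len(duplicate_truncating(snippet, 10)))  # last copy is cut short
    print(len(duplicate_whole(snippet, 10)))       # always whole copies
```

With a 3-instruction snippet and a minimum of 10 instructions, the truncating model emits 10 instructions and ends mid-snippet, while the whole-snippet model emits 12.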
Aiden Grossman	c067524852	[SHT_LLVM_BB_ADDR_MAP] Add assertion and clarify docstring (#77374 ) This patch adds an assertion to readBBAddrMapImpl to confirm that PGOAnalyses and BBAddrMaps are of the same size when PGO information is requested (part of the API contract). This patch also updates the docstring for readBBAddrMap to better clarify what is guaranteed.	2024-01-19 11:34:00 -08:00
Jeremy Kun	76ffa8f63a	[mlir][transform]: fix broken bazel build for TensorTransformOps (#78766 )	2024-01-19 20:33:36 +01:00
Min-Yih Hsu	5330daad41	[RISCV] Add support for Smepmp 1.0 (#78489 ) Smepmp is a supervisor extension that prevents privileged processes from accessing unprivileged program and data. Spec: https://github.com/riscv/riscv-tee/blob/main/Smepmp/Smepmp.pdf	2024-01-19 11:09:35 -08:00
Jeremy Kun	2521e9785d	[mlir][transform]: fix broken bazel build (#78757 ) Broken by `42b160356f`	2024-01-19 19:55:43 +01:00
Durgadoss R	43531e7196	[LLVM][NVPTX] Add cp.async.bulk.commit/wait intrinsics (#78698 ) This patch adds NVVM intrinsics and NVPTX codegen for the bulk variants of the async-copy commit/wait instructions. lit tests are added to verify the generated PTX. PTX Doc link: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-commit-group Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2024-01-19 10:42:33 -08:00
Quinn Dawkins	42b160356f	[mlir][transform] Add an op for replacing values with function calls (#78398 ) Adds `transform.func.cast_and_call` that takes a set of inputs and outputs and replaces the uses of those outputs with a call to a function at a specified insertion point. The idea with this operation is to allow users to author independent IR outside of a to-be-compiled module, and then match and replace a slice of the program with a call to the external function. Additionally adds a mechanism for populating a type converter with a set of conversion materialization functions that allow insertion of casts on the inputs/outputs to and from the types of the function signature.	2024-01-19 13:21:52 -05:00
Thurston Dang	0784b1eefa	Re-exec TSan with no ASLR if memory layout is incompatible on Linux (#78351 ) TSan's shadow mappings only support 30-bits of ASLR entropy on x86 Linux, and it is not practical to support the maximum of 32-bits (due to pointer compression and the overhead of shadow mappings). Instead, this patch changes TSan to re-exec without ASLR if it encounters an incompatible memory layout, as suggested by Dmitry in https://github.com/google/sanitizers/issues/1716. If ASLR is already disabled but the memory layout is still incompatible, it will abort. This patch involves a bit of refactoring, because the old code is: 1. InitializePlatformEarly() 2. InitializeAllocator() 3. InitializePlatform(): CheckAndProtect() but it may already segfault during InitializeAllocator() if the memory layout is incompatible, before we get a chance to check in CheckAndProtect(). This patch adds CheckAndProtect() during InitializePlatformEarly(), before the allocator is initialized. Naturally, it is necessary to ensure that CheckAndProtect() does not allow the heap regions to be occupied here, hence we generalize CheckAndProtect() to optionally check the heap regions. We keep the original behavior of CheckAndProtect() in InitializePlatform() as a last line of defense. We need to be careful not to prematurely abort if ASLR is disabled but TSan was going to re-exec for other reasons (e.g., unlimited stack size); we implement this by moving all the re-exec logic into ReExecIfNeeded().	2024-01-19 09:33:54 -08:00
Sam Clegg	5b0e45c8ce	[lld][WebAssembly] Fix use of undefined funcs under --warn-unresolved-symbols (#78643 ) When undefined functions exist in the final link we need to create stub functions (otherwise direct calls to those functions could not be generated). We were creating those stub when `--unresolved-symbols=ignore-all` was passed but overlooked the fact that `--warn-unresolved-symbols` essentially has the same effect (i.e. undefined function can exist in the final link). Fixes: #53987	2024-01-19 09:32:22 -08:00
Felipe de Azevedo Piovezan	b6677835fe	[AsmPrinter][DebugNames] Implement DW_IDX_parent entries (#77457 ) This implements the ideas discussed in [1]. To summarize, this commit changes AsmPrinter so that it outputs DW_IDX_parent information for debug_name entries. It will enable debuggers to speed up queries for fully qualified types (based on a DWARFDeclContext) significantly, as debuggers will no longer need to parse the entire CU in order to inspect the parent chain of a DIE. Instead, a debugger can simply take the parent DIE offset from the accelerator table and peek at its name in the debug_info/debug_str sections. The implementation uses two types of DW_FORM for the DW_IDX_parent attribute: 1. DW_FORM_ref4, which points to the accelerator table entry for the parent. 2. DW_FORM_flag_present, when the entry has a parent that is not in the table (that is, the parent doesn't have a name, or isn't allowed to be in the table as per the DWARF spec). This is space-efficient, since it takes 0 bytes. The implementation works by: 1. Changing how abbreviations are encoded (so that they encode which form, if any, was used to encode IDX_Parent) 2. Creating an MCLabel per accelerator table entry, so that they may be referred by IDX_parent references. When all patches related to this are merged, we are able to show that evaluating an expression such as: ``` lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \ clang++ -c -g test.cpp -o /dev/null ``` is far faster: from ~5000 ms to ~1500ms. Building llvm-project + clang with and without this patch, and looking at its impact on object file size: ``` ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5} END {printf "%\047d\n", s}' 11,507,327,592 ls -la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5} END {printf "%\047d\n", s}' 11,436,446,616 ``` That is, an increase of 0.62% in total object file size. Looking only at debug_names: ``` $stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3} END {printf "%\047d\n", s}' 440,772,348 $stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3} END {printf "%\047d\n", s}' 369,867,920 ``` That is an increase of 19%. DWARF Linkers need to be changed in order to support this. This commit already brings support to "base" linker, but it does not attempt to modify the parallel linker. Accelerator entries refer to the corresponding DIE offset, and this patch also requires the parent DIE offset -- it's not clear how the parallel linker can access this. It may be obvious to someone familiar with it, but it would be nice to get help from its authors. [1]: https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/	2024-01-19 09:19:09 -08:00
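The two percentages quoted in the commit above follow directly from the byte counts it reports. A short worked check, using only those four numbers (the helper name `percent_increase` is illustrative):

```python
def percent_increase(before: int, after: int) -> float:
    """Relative growth from before to after, in percent."""
    return (after - before) / before * 100.0


# Total object file size, without and with DW_IDX_parent (bytes).
total = percent_increase(11_436_446_616, 11_507_327_592)

# Size of the debug_names sections only (bytes).
names = percent_increase(369_867_920, 440_772_348)

if __name__ == "__main__":
    print(f"total object size: +{total:.2f}%")
    print(f"debug_names only:  +{names:.1f}%")
```

Both results round to the figures stated in the commit message: about 0.62% for the total object size and about 19% for debug_names alone.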
lntue	c80d68a676	[libc] Add float.h header. (#78737 )	2024-01-19 12:04:34 -05:00
Jordan Rupprecht	d0d0727104	[lldb][test] Apply @expectedFailureAll/@skipIf early for debug_info tests (#73067 ) The @expectedFailureAll and @skipIf decorators will mark the test case as xfail/skip if _all_ conditions passed in match, including debug_info. * If debug_info is not one of the matching conditions, we can immediately evaluate the check and decide if it should be decorated. * If debug_info is present as a match condition, we need to defer whether or not to decorate until when the `LLDBTestCaseFactory` metaclass expands the test case into its potential variants. This is still early enough that the standard `unittest` framework will recognize the test as xfail/skip by the time the test actually runs. TestDecorators exhibits the edge cases more thoroughly. With the exception of `@expectedFailureIf` (added by this commit), all those test cases pass prior to this commit. This is a followup to `212a60ec37`.	2024-01-19 10:50:05 -06:00
Joseph Huber	cebe4de66f	[libc] Fix test failing on GPU using deprecated 'add_unittest' Summary: We use `add_libc_test` now because it works for both hermetic and unit tests. If the test needs to be unit test only, pass `UNIT_TEST_ONLY` as an argument.	2024-01-19 10:38:41 -06:00
Marius Brehler	205e15c176	[mlir][docs] Fix broken link	2024-01-19 17:38:27 +01:00
Sander de Smalen	5f41cef58f	[AArch64] NFC: Simplify discombobulating 'requiresSMChange' interface (#78703 ) Having it return a `std::optional<bool>` is unnecessarily confusing. This patch changes it to a simple 'bool'. This patch also removes the 'BodyOverridesInterface' operand because there is only a single use for this which is easily rewritten.	2024-01-19 16:15:38 +00:00
Sander de Smalen	40a631f452	[Clang] Refactor diagnostics for SME builtins. (#78258 ) The arm_sme.td file was still using `IsSharedZA` and `IsPreservesZA`, which should be changed to match the new state attributes added in #76971. This patch adds `IsInZA`, `IsOutZA` and `IsInOutZA` as the state for the Clang builtins and fixes up the code in SemaChecking and SveEmitter to match. Note that the code is written in such a way that it can be easily extended with ZT0 state (to follow in a future patch).	2024-01-19 16:02:24 +00:00
Jay Foad	e89a7c41ba	[AMDGPU] Update comment on SIInstrInfo::isLegalFLATOffset for GFX12	2024-01-19 15:53:06 +00:00
Jay Foad	1abf2570b3	[AMDGPU] Make use of CPol::SWZ_* in SelectionDAG. NFC. For GlobalISel this was already done in AMDGPUInstructionSelector::selectBufferLoadLds.	2024-01-19 15:48:45 +00:00
Kareem Ergawy	5dbb30d950	[MLIR][OpenMP] Better error reporting for unsupported `nowait` (#78551 ) Provides some context for failing to generate LLVM IR for `target enter\|exit\|update` directives when `nowait` is provided. This is directly helpful for flang users since they would get this error message if they tried to use `nowait`. Before that we had a very generic message. This is a follow-up to https://github.com/llvm/llvm-project/pull/78269, please only review the latest commit (the one with the same commit message as the PR title).	2024-01-19 16:47:24 +01:00
Jay Foad	e21b0b083e	[AMDGPU] Remove gws feature from GFX12 (#78711 ) This was already done for LLVM. This patch just updates the Clang builtin handling to match.	2024-01-19 15:45:53 +00:00
Jay Foad	97747467f1	[AMDGPU] Update hazard recognition for new GFX12 wait counters (#78722 ) In most cases the hazards no longer apply, so just assert that we are not on GFX12.	2024-01-19 15:30:41 +00:00
Jay Foad	89226ecbb9	[AMDGPU] Do not widen scalar loads on GFX12 (#78724 ) GFX12 has subword scalar loads so there is no need to do this.	2024-01-19 15:30:07 +00:00
Kiran Chandramohan	aac1d9710b	[Flang][OpenMP] Consider renames when processing reduction intrinsics (#70822 ) Fixes #68654 Depends on https://github.com/llvm/llvm-project/pull/70790	2024-01-19 15:17:21 +00:00
Vinayak Dev	497a8604b3	[FileCheck]: Fix diagnostics for NOT prefixes (#78412 ) Fixes #70221 Fix a bug in FileCheck that corrects the error message when multiple prefixes are provided through --check-prefixes and one of them is a PREFIX-NOT. Earlier, only the first of the provided prefixes was displayed as the erroneous prefix, while the actual error might be on the prefix that occurred at the end of the prefix list in the input file. Now, the right NOT prefix is shown in the error message.	2024-01-19 15:08:24 +00:00
Nikita Popov	9350860824	[AsmParser] Add support for reading incomplete IR (part 1) (#78421 ) Add an `-allow-incomplete-ir` flag to the IR parser, which allows reading IR with missing declarations. This is intended to produce a best-effort interpretation of the IR, along the same lines of what we would manually do when taking, for example, a function from `-print-after-all` output and fixing it up to be valid IR. This patch only supports dropping references to undeclared metadata, either by dropping metadata attachments from instructions/functions, or by dropping calls to certain intrinsics (like debug intrinsics). I will implement support for inserting missing function/global declarations in a followup patch. We don't have real use lists for metadata, so the approach here is to iterate over the whole IR and identify metadata that needs to be dropped. This does not support all possible cases, but should handle anything that's relevant for the function-only IR use case.	2024-01-19 16:08:16 +01:00
LLVM GN Syncbot	535b197b8e	[gn build] Port `9ff4be640f`	2024-01-19 14:42:19 +00:00


486862 Commits