llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-11-23 22:00:10 +00:00

Author	SHA1	Message	Date
XinWang10	d124b02242	[X86][MC] Fix wrong encoding of promoted BMI instructions due to missing NoCD8 (#78386 ) Address review comments in #76709 Add `NoCD8` to class `ITy`, and rewrite the promoted instructions with `ITy` to avoid unexpected incorrect encoding about `NoCD8`.	2024-01-19 00:27:16 +08:00
Shengchen Kan	8bc7c0a058	[X86] Fix failures on EXPENSIVE_CHECKS builds Error message ``` * Bad machine code: Illegal virtual register for instruction * - function: test__blsi_u32 - basic block: %bb.0 (0x7a61208) - instruction: %5:gr32 = MOV32r0 implicit-def $eflags - operand 0: %5:gr32 Expected a GR32_NOREX2 register, but got a GR32 register ``` Reported by RKSimon in #77433 The failure is b/c compiler emits a MOV32r0 with operand GR32 when fast-isel is enabled. ``` // X86FastISel.cpp Register SrcReg = fastEmitInst_(X86::MOV32r0, &X86::GR32RegClass) ``` However, before this patch, compiler only allows GR32_NOREX operand b/c MOV32r0 is a pseudo instruction. In this patch, we relax the register class of the operand to GR32 b/c MOV32r0 is always expanded to XOR32rr, which can use EGPR. The bug was not introduced by #77433 but caught by it.	2024-01-19 00:19:55 +08:00
Nick Desaulniers	3044d75485	[libc][arm] add more math.h entrypoints (#77839 ) In particular, we have internal customers that would like to use nanf and scalbnf. The differences between various entrypoint files can be checked via: $ comm -3 <(grep libc\.src path/to/entrypoints.txt \| sort) \ <(grep libc\.src path/to/other/entrypoints.txt \| sort)	2024-01-18 08:18:13 -08:00
Nick Desaulniers	de03c46b8b	[libc] reverts for 32b arm (#78307 ) These were fixed properly by `f1f1875c18`. - Revert "[libc] temporarily set -Wno-shorten-64-to-32 (#77396)" - Revert "[libc] make off_t 32b for 32b arm (#77350)"	2024-01-18 08:17:19 -08:00
Zequan Wu	f4ede08c61	[lldb][Format] Fix missing inlined function names in frame formatting. (#78494 ) This fixes missing inlined function names when formatting frame and the `Block` in `SymbolContext` is a lexical block (e.g. `DW_TAG_lexical_block` in Dwarf).	2024-01-18 11:06:57 -05:00
Nikita Popov	f20488687e	[CVP] Add test with nested cycle (NFC) This is a regression test for a miscompile that would have been introduced by an upcoming patch.	2024-01-18 16:57:29 +01:00
Aart Bik	9cd4128998	[mlir][sparse] add a 3-d block and fiber test (#78529 )	2024-01-18 07:52:42 -08:00
Joseph Huber	12c90bd612	[LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (#78359 ) Summary: The linker wrapper's job is to sort various embedded inputs into a list of files that participate in a single link job. So far, this has been completely 1-to-1, that is, each input file participates in exactly one link job. However, support for AMD's target-id requires that one input file may participate in multiple link jobs. For example, if given a `gfx90a` static library and a `gfx90a:xnack+` object file input, we should link the `gfx90a` target into the `gfx90a:xnack+` job. These are considered separate CPUs that can be mutually linked more or less. This patch adds the necessary logic to make this happen. It primarily reworks the logic to copy relevant input files into a separate list. So, it moves construction of the final list of link jobs into the extraction phase. We also need to copy the files in the case that it is needed more than once, as the entire workflow expects ownership of said file.	2024-01-18 09:44:56 -06:00
Simon Pilgrim	110e1717b3	[X86] X86MCInstLower.cpp - fix spelling mistake	2024-01-18 15:44:27 +00:00
Krzysztof Drewniak	aac23b08e3	[mlir][ROCDL] Stop setting amdgpu-implicitarg-num-bytes (#78498 ) Clang stopped doing this late 2021 back in `33315ef321`, and no other frontend does this, so stop doing it.	2024-01-18 09:43:46 -06:00
Krzysztof Drewniak	05e85e4fc5	[mlir][Math] Add pass to legalize math functions to f32-or-higher (#78361 ) Since most of the operations in the `math` dialect don't have low-precision implementations, add the -math-legalize-to-f32 pass that goes through and brackets low-precision math functions (like `math.sin %0 : f16`) with `arith.extf` and `arith.truncf`. This preserves the original semantics of the math operation but allows lowering to proceed. Versions of this lowering are already implicitly present in some passes, like ConvertGPUToROCDL. However, because those are implicit rewrites, they hide the floating-point extension and truncation, preventing anyone from writing passes that operate on those implicit extf/truncf pairs. Exposing this legalization explicitly is needed to allow lowering 8-bit floats on AMD GPUs, as the implementation of extf and truncf on that platform requires the complex logic found in ArithToAMDGPU, which runs before the GPU to ROCDL lowering.	2024-01-18 09:37:43 -06:00
Alan Phipps	22867890e4	[clang][CoverageMapping] Refactor setting MC/DC True/False Condition IDs (#78202 ) Clean-up of the algorithm that assigns MC/DC True/False control-flow condition IDs when constructing an MC/DC decision region. This patch creates a common API for setting/getting the condition IDs, making the binary logical operator visitor functions much cleaner. This patch also fixes issue https://github.com/llvm/llvm-project/issues/77873 in which a record's control flow map can be malformed due to an incorrect calculation of the True/False condition IDs.	2024-01-18 09:34:52 -06:00
Philip Reames	0fc5f4b524	[DAG] Set nneg flag when forming zext in demanded bits (#72281 ) We do the same for the analogous transform in DAGCombine, but this case was missed in the recent patch which added support for zext nneg. Sorry for the lack of test coverage. Not sure how to exercise this piece of logic. It appears to have only minimal impact on LIT tests (only test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll), and even then, the changes without it appear uninteresting. Maybe we should remove this transform instead?	2024-01-18 07:34:08 -08:00
Sergio Afonso	2747193058	[Flang][MLIR][OpenMP] Remove the early outlining interface (#78450 ) After the removal of the OpenMP early outlining MLIR pass in #67319, the `EarlyOutliningInterface` stopped doing any useful work. It used to be necessary to tie the name of the function from which a target region was outlined to that new function, so it would be used when translating to LLVM IR in place of the outlined function's name. This is not necessary anymore, so this patch removes all references to this interface and uses of the `omp.outline_parent_name` discardable attribute in tests.	2024-01-18 15:33:43 +00:00
Haohai Wen	fb2c6bbf42	[BranchFolding] Use isSuccessor to confirm fall through (#77923 ) When merging blocks, if the previous block has no branch instruction and has one successor, the successor may be an SEH landing pad and the block will always raise an exception and never fall through to the next block. We cannot merge them in such a case. isSuccessor should be used to confirm that it can fall through to the next block.	2024-01-18 23:26:22 +08:00
Timm Baeder	819bd9e39b	[clang][Interp] IndirectMember initializers (#69900 ) We need to look at the chain of declarations to initialize the right field.	2024-01-18 16:25:05 +01:00
Alan Phipps	6d0b718e8c	[Profile][CoverageMapping] MC/DC Fix passing FileID for DecisionRegion Fixes oversight in commit `8ecbb0404d` in which FileID was not being set when creating a new MC/DC DecisionRegion.	2024-01-18 09:19:02 -06:00
madanial0	87ac65a994	[flang] Match the length size in comparison (NFC) (#78302 ) The template function call CheckDescriptorEqInt(exitStat.get(), 127) is deduced to have INT_T equal to std::int32_t instead of std::int64_t, but the length descriptor points to a 64-byte storage. The comparison does not work on big-endian systems. Co-authored-by: Mark Danial <mark.danial@ibm.com>	2024-01-18 10:18:21 -05:00
madanial0	fe4d502524	[flang] fix unsafe memory access using mlir::ValueRange (#78435 ) When running the `flang/test/HLFIR/simplify-hlfir-intrinsics.fir` test case on AIX we encounter issues building op as they are not found in the mlir context: ``` LLVM ERROR: Building op `arith.subi` but it isn't known in this MLIRContext: the dialect may not be loaded or this operation hasn't been added by the dialect. See also https://mlir.llvm.org/getting_started/Faq/#registered-loaded-dependent-whats-up-with-dialects-management LLVM ERROR: Building op `hlfir.yield_element` but it isn't known in this MLIRContext: the dialect may not be loaded or this operation hasn't been added by the dialect. See also https://mlir.llvm.org/getting_started/Faq/#registered-loaded-dependent-whats-up-with-dialects-management LLVM ERROR: Building op `hlfir.yield_element` but it isn't known in this MLIRContext: the dialect may not be loaded or this operation hasn't been added by the dialect. See also https://mlir.llvm.org/getting_started/Faq/#registered-loaded-dependent-whats-up-with-dialects-management ``` The issue is caused by the "Merge disjoint stack slots" pass and the error is not present if the source is built with `-mllvm --no-stack-coloring` Thanks to investigation by @stefanp-ibm we found that "the initializer_list {inputIndices[1], inputIndices[0]} has a lifetime that only exists for the range of the constructor for ValueRange. Once we get to stack coloring we merge the stack slot for that element with another stack slot and then it gets overwritten which corrupts transposedIndices" The changes below prevent the corruption of transposedIndices and pass the test case. Co-authored-by: Mark Danial <mark.danial@ibm.com>	2024-01-18 10:17:53 -05:00
Andrei Golubev	296a6842d1	[formatv][FmtAlign] Use fill count of type size_t instead of uint32_t (#78459 ) FmtAlign::fill() accepts a uint32_t variable while the usages operate on size_t values. On some platform / compiler combinations, this ends up being a narrowing conversion. Fix this by changing the function's signature. This was first seen on MSVC x86. Co-authored-by: Orest Chura <orest.chura@intel.com>	2024-01-18 10:16:11 -05:00
Simon Pilgrim	33287e35f2	[X86] Emit verbose (constant) comments before EVEX compression tag (#78585 ) This helps ensure the encoding details are next to the EVEX tag Noticed while preparing to add more constant commenting as part of #73783 and #71078	2024-01-18 15:13:42 +00:00
Jannik Silvanus	bd2430b421	[IR] Allow type change in ValueAsMetadata::handleRAUW (#76969 ) `ValueAsMetadata::handleRAUW` is a mechanism to replace all metadata referring to one value by a different value. Relax an assert that used to enforce the old and new value to have the same type. This seems to be a sanity plausibility assert only, as the implementation actually supports mismatching types. This is motivated by a downstream mechanism where we use poison ValueAsMetadata values to annotate pointee types of opaque pointer function arguments. When replacing one type with a different one to work around DXIL vs LLVM incompatibilities, we need to update type annotations, and handleRAUW is more efficient than creating new MD nodes.	2024-01-18 16:01:23 +01:00
stephenpeckham	a7f9e92d07	Fix typo (#78587 )	2024-01-18 08:57:10 -06:00
Luke Lau	9d6e189ee8	[RISCV] Use regexp to check negative extensions in test. NFC Every time an extension is added, this test will need to have the negative extension appended to multiple CHECK lines where we're overriding the arch. This is quite time consuming since it needs to be in the right order, so this replaces the explicit list of negative extensions with a regexp instead.	2024-01-18 21:47:06 +07:00
Dominik Adamski	8930c5a4be	[NFC][OpenMP] Fix typo in CHECK line (#78586 ) Typo in test: openmp/libomptarget/test/offloading/fortran/basic-target-parallel-do.f90	2024-01-18 15:40:15 +01:00
Quinn Dawkins	5caab8bbc0	[mlir][transform] Add transform.get_operand op (#78397 ) Similar to `transform.get_result`, except it returns a handle to the operand indicated by a positional specification, same as is defined for the linalg match ops. Additionally updates `get_result` to take the same positional specification. This makes the use case of wanting to get all of the results of an operation easier by no longer requiring the user to reconstruct the list of results one-by-one.	2024-01-18 09:33:14 -05:00
cor3ntin	e90e43fb9c	[Clang][NFC] Rename CXXMethodDecl::isPure -> isPureVirtual (#78463 ) To avoid any possible confusion with the notion of pure function and the gnu::pure attribute.	2024-01-18 15:30:58 +01:00
Dominik Adamski	d87a53a960	[NFC][OpenMP][Flang] Add test for OpenMP target parallel do (#77776 ) Added test which proves that end-to-end compilation of `omp target parallel do` construct is successful for Flang compiler.	2024-01-18 15:26:39 +01:00
Leandro Lupori	07abde2717	[flang][driver] Fix Driver/isysroot.f90 test (#78478 ) Check for DEFAULT_SYSROOT, because when it is set -isysroot has no effect.	2024-01-18 11:21:57 -03:00
Timm Baeder	30d458626d	[clang][Interp] Fix diagnosing non-const variables pre-C++11 (#76718 ) In CheckConstant(), consider that in C++98 const variables may not be read at all, and diagnose that accordingly.	2024-01-18 15:15:05 +01:00
Piotr Sobczak	57f6a3f7ea	[AMDGPU] Add global_load_tr for GFX12 (#77772 ) Support new amdgcn_global_load_tr instructions for load with transpose. * MC layer support for GLOBAL_LOAD_TR_B64/GLOBAL_LOAD_TR_B128 * Intrinsic int_amdgcn_global_load_tr * Clang builtins amdgcn_global_load_tr*	2024-01-18 15:14:42 +01:00
Vassil Vassilev	1566f1ffc6	[clang-repl] Add an interpreter-specific overload of operator new for C++ (#76218 ) This patch brings back the basic support for C by inserting the runtime required for value printing only when we are in C++ mode. Additionally, it defines a new overload of operator placement new because we can't really forward declare it in a library-agnostic way. Fixes the issue described in llvm/llvm-project#69072.	2024-01-18 16:06:04 +02:00
Guillaume Chatelet	e6a6a90fe7	[libc][NFC] Use the Sign type for DyadicFloat (#78577 )	2024-01-18 15:03:35 +01:00
Sergio Afonso	0c76865da9	[Flang][OpenMP][Lower] NFC: Combine two calls to ClauseProcessor::processTODO (#78451 ) Just a minimal readability improvement that we overlooked during refactoring.	2024-01-18 14:01:08 +00:00
Jay Foad	745b193260	[AMDGPU] Regenerate tests for #77892 after #77438	2024-01-18 13:50:59 +00:00
Krzysztof Parzyszek	e5a34f9226	[Flang][OpenMP] Push genEval closer to leaf lowering functions (#77760 ) This moves the lowering of the nested evaluations all the way to the bottom of the call stack. This PR does not attempt to change the leaf lowering functions beyond placing the call to `genEval` in there. Whether the nested evaluations should be lowered for any given op depends on the context in which that op is created, hence a `genNested` parameter was added. Contexts in which nested evaluations should not be lowered are during lowering of composite constructs, such as PARALLEL SECTIONS. This particular case is considered a block construct tied to the SECTIONS directive, and the lowering code will first create an empty parallel op, and then recursively lower the SECTIONS code. Similar situations occur when lowering most (if not all) compound/composite constructs. Recursive lowering [4/5]	2024-01-18 07:47:35 -06:00
Jay Foad	0a3a0ea591	[AMDGPU] Update uses of new VOP2 pseudos for GFX12 (#78155 ) New pseudos were added for instructions that were natively VOP3 on GFX11: V_ADD_F64_pseudo, V_MUL_F64_pseudo, V_MIN_NUM_F64, V_MAX_NUM_F64, V_LSHLREV_B64_pseudo --------- Co-authored-by: Mirko Brkusanin <Mirko.Brkusanin@amd.com>	2024-01-18 13:26:13 +00:00
Vlad Serebrennikov	f4fbbebb5e	[clang] Add test for CWG1807 (#77637 ) The test checks that objects in arrays are destructed in reverse order during stack unwinding. This patch is trying to establish a precedent for how codegen tests for the C++ defect report test suite should be written. Refer to the PR for exact reasoning.	2024-01-18 17:14:25 +04:00
Mariusz Sikora	3e6589f21c	[AMDGPU][GFX12] Add 16 bit atomic fadd instructions (#75917 ) - image_atomic_pk_add_f16 - image_atomic_pk_add_bf16 - ds_pk_add_bf16 - ds_pk_add_f16 - ds_pk_add_rtn_bf16 - ds_pk_add_rtn_f16 - flat_atomic_pk_add_f16 - flat_atomic_pk_add_bf16 - global_atomic_pk_add_f16 - global_atomic_pk_add_bf16 - buffer_atomic_pk_add_f16 - buffer_atomic_pk_add_bf16	2024-01-18 14:01:09 +01:00
Mariusz Sikora	28b7e498b6	AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions (#77892 ) Encoding is VOP3P. Tagged as deep/machine learning instructions. i32 type (v4fp8 or v4bf8 packed in i32) is used for src0 and src1. src0 and src1 have no src_modifiers. src2 is f32 and has src_modifiers: f32 fneg(neg_lo[2]) and f32 fabs(neg_hi[2]). --------- Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>	2024-01-18 14:00:27 +01:00
Timm Baeder	18d0a7e4c0	[clang][Interp] Implement ComplexToReal casts (#77294 ) Add a new emitComplexReal() helper function and use that for the new casts as well as the old __real implementation.	2024-01-18 13:55:04 +01:00
Guillaume Chatelet	11ec512f44	[libc][NFC] Introduce a Sign type for FPBits (#78500 ) Another patch is needed to cover `DyadicFloat` and `NormalFloat` constructors.	2024-01-18 13:40:49 +01:00
Yingwei Zheng	9acc404230	[InstCombine] Recognize more rotation patterns (#78107 ) InstCombine already handles the pattern `(shl ShVal, (X & (Width - 1))) \| (lshr ShVal, ((-X) & (Width - 1)))`. Under certain circumstances, `X & (Width - 1)` will be simplified to `X`. Therefore, this patch adds support for the pattern `(shl ShVal, X) \| (lshr ShVal, ((-X) & (Width - 1)))`. Alive2: https://alive2.llvm.org/ce/z/P7JQ2V	2024-01-18 20:29:53 +08:00
Congcong Cai	64e94438a4	[InstCombine] combine mul(abs(x),abs(y)) to abs(mul(x,y)) (#78395 ) Fixes: https://github.com/llvm/llvm-project/issues/78076 Alive2 Proof: https://alive2.llvm.org/ce/z/XEDy0f	2024-01-18 20:12:00 +08:00
paperchalice	a48c1bda74	Revert "[CodeGen] Support start/stop in CodeGenPassBuilder" (#78567 ) Reverts llvm/llvm-project#70912. This breaks some bazel tests.	2024-01-18 20:09:53 +08:00
Simon Pilgrim	d12dffacaa	[X86] Add X86::getConstantFromPool helper function to replace duplicate implementations. We had the same helper function in shuffle decode / vector constant code - move this to X86InstrInfo to avoid duplication.	2024-01-18 11:59:46 +00:00
Alexey Lapshin	cf799b3d3b	[DWARFLinker][NFC] Move common code into the base library: IndexedValuesMap. (#77437 ) This patch is extracted from #74725. Both dwarflinkers contain similar classes for indexed values. Move the code into the DWARFLinkerBase.	2024-01-18 14:29:46 +03:00
Jie Fu	779af9b713	[AMDGPU] Fix -Wunused-variable in SIInsertWaitcnts.cpp (NFC) llvm-project/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1539:10: error: unused variable 'SWaitInst' [-Werror,-Wunused-variable] auto SWaitInst = ^ 1 error generated.	2024-01-18 19:28:48 +08:00
Paul Osmialowski	d5b2e41e20	[OpenMP][omp_lib] Restore compatibility with more restrictive Fortran compilers (#77780 ) The most recent changes to `omp_lib.h.var` have re-introduced some compatibility issues that had to be fixed due to the similar changes in the past. Namely: 1. D120707 has removed the "use omp_lib_kinds" statement and replaced it with import 2. D114537 added line continuation to the long lines This patch introduces the same kind of changes in order to restore compatibility with some more restrictive Fortran compilers so their users could still benefit from the LLVM's OpenMP Fortran library.	2024-01-18 11:06:24 +00:00
Utkarsh Saxena	667e58a72e	[coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (#77066 ) ### Problem ```cpp co_task<int> coro() { int a = 1; auto lamb = [a]() -> co_task<int> { co_return a; // 'a' in the lambda object dies after the initial_suspend in the lambda coroutine. }(); co_return co_await lamb; } ``` [use-after-free](https://godbolt.org/z/GWPEovWWc) Lambda captures (even by value) are prone to use-after-free once the lambda object dies. In the above example, the lambda object appears only as a temporary in the call expression. It dies after the first suspension (`initial_suspend`) in the lambda. On resumption in `co_await lamb`, the lambda accesses `a` which is part of the already-dead lambda object. --- ### Solution This problem can be formulated by saying that the `this` parameter of the lambda call operator is a lifetimebound parameter. The lambda object argument should therefore live at least as long as the return object. That said, this requirement does not hold if the lambda does not have a capture list. In principle, the coroutine frame still has a reference to a dead lambda object, but it is easy to see that the object would not be used in the lambda-coroutine body due to no capture list. It is safe to use this pattern inside a `co_await` expression due to the lifetime extension of temporaries. Example: ```cpp co_task<int> coro() { int a = 1; int res = co_await [a]() -> co_task<int> { co_return a; }(); co_return res; } ``` --- ### Background This came up in the discussion with seastar folks on [RFC](https://discourse.llvm.org/t/rfc-lifetime-bound-check-for-parameters-of-coroutines/74253/19?u=usx95). This is a fairly common pattern in continuation-style-passing (CSP) async programming involving futures and continuations. Document ["Lambda coroutine fiasco"](https://github.com/scylladb/seastar/blob/master/doc/lambda-coroutine-fiasco.md) by Seastar captures the problem.
This pattern makes the migration from CSP-style async programming to coroutines very bugprone. Fixes https://github.com/llvm/llvm-project/issues/76995 --------- Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>	2024-01-18 11:56:55 +01:00
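
The rotation patterns named in commit 9acc404230 above can be checked numerically. The following is only a rough sketch, not LLVM code: it models a 32-bit value in plain Python, the function names (`rotl_masked`, `rotl_unmasked_shl`, `rotl_reference`) are hypothetical, and it assumes the "certain circumstances" in the commit message amount to the shift amount `X` being known to lie in `[0, Width)`.

```python
WIDTH = 32                    # assumed bit width for this illustration
MASK = (1 << WIDTH) - 1       # keeps results inside WIDTH bits


def rotl_reference(value, amount):
    """Textbook rotate-left, used only as the oracle for comparison."""
    amount %= WIDTH
    if amount == 0:
        return value & MASK
    return ((value << amount) | (value >> (WIDTH - amount))) & MASK


def rotl_masked(value, amount):
    """(shl ShVal, (X & (Width - 1))) | (lshr ShVal, ((-X) & (Width - 1)))"""
    left = (value << (amount & (WIDTH - 1))) & MASK
    right = value >> ((-amount) & (WIDTH - 1))
    return left | right


def rotl_unmasked_shl(value, amount):
    """(shl ShVal, X) | (lshr ShVal, ((-X) & (Width - 1)))

    Assumption: X is already known to be in [0, Width), so masking the
    left-shift amount would be a no-op.
    """
    if not 0 <= amount < WIDTH:
        raise ValueError("sketch only models shift amounts in [0, WIDTH)")
    left = (value << amount) & MASK
    right = value >> ((-amount) & (WIDTH - 1))
    return left | right


if __name__ == "__main__":
    samples = [0x00000001, 0x80000001, 0xDEADBEEF, 0xFFFFFFFF, 0]
    for v in samples:
        for x in range(WIDTH):
            assert rotl_masked(v, x) == rotl_reference(v, x)
            assert rotl_unmasked_shl(v, x) == rotl_reference(v, x)
    print("both patterns agree with the reference rotate")
```

This is a brute-force spot check over a few sample values, not a proof; the Alive2 link in the commit message is the authoritative verification.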


486830 Commits