llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-12-12 09:41:26 +00:00

Author	SHA1	Message	Date
Aaron Ballman	96dd50ee83	Fix LLVM Sphinx build	2023-10-05 08:12:58 -04:00
Guillot Tony	5d78b78c85	[C2X] N3007 Type inference for object definitions This patch implements the auto keyword from the N3007 standard specification. This allows deducing the type of the variable like in C++: ``` auto nb = 1; auto chr = 'A'; auto str = "String"; ``` The list of statements which allows the usage of auto: * Basic variables declarations (int, float, double, char, char*...) * Macros declaring a variable with the auto type The list of statements which will not work with the auto keyword: * auto arrays * sizeof(), alignas() * auto parameters, auto return type * auto as a struct/typedef member * uninitialized auto variables * auto in a union * auto as an enum type specifier * auto casts * auto in compound literals Differential Revision: https://reviews.llvm.org/D133289	2023-10-05 08:11:02 -04:00
Matthias Springer	58678d3bcf	[mlir][tensor][bufferize] `tensor.empty` bufferizes to allocation (#68201 ) `BufferizableOpInterface::bufferizesToAllocation` is queried when forming equivalence sets during bufferization. It is not really needed for ops like `tensor.empty` which do not have tensor operands, but it should be added for consistency. This change should have been part of #68080. No test is added because the return value of this function is irrelevant for ops without tensor operands. (However, this function acts as a form of documentation, describing the bufferization semantics of the op.)	2023-10-05 14:06:00 +02:00
Matthias Springer	5958043e2d	[mlir][bufferization] Add `dump_alias_sets` option to transform op (#68289 ) Add `dump_alias_sets` to `transform.bufferization.one_shot_bufferize`. This option is useful for debugging. Also improve the verifier to ensure that `test_analysis_only` is set when other debugging flags are enabled.	2023-10-05 14:05:45 +02:00
Yingwei Zheng	33a194b158	[InstCombine] Add pre-commit tests for #67915 . NFC.	2023-10-05 20:01:55 +08:00
Kohei Yamaguchi	777a6e6f10	[mlir][docs] Cleanup documentations [NFC] (#67945 ) - Fix missing links - Fix missing link format - Move transform::ApplyFuncToLLVMConversionPatternOp into Transform dialect - Remove duplicated MemRef's TOC - Remove duplicated Memref's dma_start/dma_wait docs	2023-10-05 13:33:41 +02:00
Ivan Kosarev	f04aa1f814	[AMDGPU][CodeGen] Fold immediates in src1 operands of V_MAD/MAC/FMA/FMAC. (#68002 )	2023-10-05 14:22:29 +03:00
Bogdan Graur	821dfc392a	Revert "[X86] Change target of __builtin_ia32_cmp[p\|s][s\|d] from avx into sse/sse2 (#67410 )" Does not respect `__attribute__((target("avx"))`. This reverts commit `ccd5b8db48`.	2023-10-05 10:33:44 +00:00
Simon Pilgrim	baecc9e997	[CostModel][X86] getShuffleCost - add fallback (to half vector) for bfloat vector shuffle costs Add initial half/bfloat broadcast shuffles test coverage (more to follow) Fixes #68117 - which was stuck in a loop between getting scalarized insert/extract costs for the shuffle and then trying to convert a bfloat insert into a shuffle again......	2023-10-05 11:12:40 +01:00
Jonas Hahnfeld	abb9eb2778	[Lex] Handle repl_input_end in Preprocessor::LexTokensUntilEOF() This fixes many unit tests when trying to enable IncrementalExtensions by default for testing purposes. Differential Revision: https://reviews.llvm.org/D158415	2023-10-05 12:09:14 +02:00
Mats Petersson	6180964a01	[flang]Pass to add vscale range attribute (#68103 ) Add vscale range attribute for the Scalable Vector Extension (SVE) if provided on the command-line (options in a previous commit). If no command-line option is provided, but the target-feature of SVE is specified and the architecture is AArch64, it defaults to 128-2048, in other words a vscale-min of 1 and a vscale-max of 16. A pass is used to add the attribute to all functions. The vectorizer will use this attribute to generate the SVE instruction to match the range specified. The attribute is harmless if there are no vectorizable operations in the function.	2023-10-05 11:06:00 +01:00
long.chen	5979e1dfb1	[mlir] Fix `empty-tensor-elimination` around self-copies (#68129 ) * Fixes #67977, a crash in `empty-tensor-elimination`. * Also improves `linalg.copy` canonicalization. * Also improves indentation in `mlir-linalg-ods-yaml-gen.cpp`.	2023-10-05 12:04:20 +02:00
Michael Buch	3a35ca01fc	[lldb][DWARFASTParserClang][NFCI] Extract DW_AT_data_member_location calculation logic (#68231 ) Currently this non-trivial calculation is repeated multiple times, making it hard to reason about when the `byte_offset`/`member_byte_offset` is being set or not. This patch simply moves all those instances of the same calculation into a helper function. We return an optional to remain an NFC patch. Default initializing the offset would make sense but requires further analysis and can be done in a follow-up patch.	2023-10-05 10:49:42 +01:00
cor3ntin	c72d3a0966	[Clang] Handle consteval expression in array bounds expressions (#66222 ) The bounds of a c++ array is a _constant-expression_. And in C++ it is also a constant expression. But we also support VLAs, ie arrays with non-constant bounds. We need to take care to handle the case of a consteval function (which are specified to be only immediately called in non-constant contexts) that appear in arrays bounds. This introduces `Sema::isAlwayConstantEvaluatedContext`, and a flag in ExpressionEvaluationContextRecord, such that immediate functions in array bounds are always immediately invoked. Sema had both `isConstantEvaluatedContext` and `isConstantEvaluated`, so I took the opportunity to cleanup that. The change in `TimeProfilerTest.cpp` is an unfortunate manifestation of the problem that #66203 seeks to address. Fixes #65520	2023-10-05 11:36:27 +02:00
Christian Sigg	c64a098ee4	[GVN] Fix after `46aac949bc` replaceUsersOf -> removeUsersOf	2023-10-05 11:31:35 +02:00
tdanyluk	a608830807	[mlir] Speed up FuncToLLVM using a SymbolTable (#68082 ) We have a project where this saves 23% of the compilation time. This means using hashmaps instead of searching in linked lists.	2023-10-05 11:24:52 +02:00
Rin	d3e4702c0f	[AArch64] [LoopVectorize] Use either fixed-width or scalable VF when tail-folding (#67543 ) Since the getMaximisedVFForTarget function is called twice, once for fixed-width and once for scalable, it adds no value to always return a fixed-width VF. Instead, when we are tail-folding, we can use either fixed-width or scalable vectors.	2023-10-05 10:24:30 +01:00
Nikita Popov	46aac949bc	[GVN] Remove users from ICF when RAUWing loads When performing store to load forwarding, replacing users of the load may turn an indirect call into one with a known callee, in which case it might become willreturn, invalidating cached ICF information. Avoid this by removing users. This is a bit more aggressive than strictly necessary (e.g. this shouldn't be necessary when doing load-load CSE), but better safe than sorry. Fixes https://github.com/llvm/llvm-project/issues/48805.	2023-10-05 11:21:33 +02:00
Christian Sigg	59e75b7df2	[mlir][bazel] Sort targets list.	2023-10-05 11:14:12 +02:00
Christian Sigg	2f1c78014f	[mlir][bazel] Fix after `d20fbc9007`	2023-10-05 11:12:55 +02:00
Guray Ozen	29b33e8397	[bazel] fix typo	2023-10-05 11:08:46 +02:00
Jonas Hahnfeld	3116d60494	[Lex] Introduce Preprocessor::LexTokensUntilEOF() This new method repeatedly calls Lex() until end of file is reached and optionally fills a std::vector of Tokens. Use it in Clang's unit tests to avoid quite some code duplication. Differential Revision: https://reviews.llvm.org/D158413	2023-10-05 11:04:07 +02:00
Job Noorman	7fa33773e3	[BOLT][RISCV] Handle long tail calls (#67098 ) Long tail calls use the following instruction sequence on RISC-V: ``` 1: auipc xi, %pcrel_hi(sym) jalr zero, %pcrel_lo(1b)(xi) ``` Since the second instruction in isolation looks like an indirect branch, this confused BOLT and most functions containing a long tail call got marked with "unknown control flow" and didn't get optimized as a consequence. This patch fixes this by detecting long tail call sequence in `analyzeIndirectBranch`. `FixRISCVCallsPass` also had to be updated to expand long tail calls to `PseudoTAIL` instead of `PseudoCALL`. Besides this, this patch also fixes a minor issue with compressed tail calls (`c.jr`) not being detected. Note that I had to change `BinaryFunction::postProcessIndirectBranches` slightly: the documentation of `MCPlusBuilder::analyzeIndirectBranch` mentions that the [`Begin`, `End`) range contains the instructions immediately preceding `Instruction`. However, in `postProcessIndirectBranches`, all the instructions in the BB were passed in the range. This made it difficult to find the preceding instruction so I made sure only the preceding instructions are passed.	2023-10-05 08:55:30 +00:00
Guray Ozen	d20fbc9007	[MLIR][NVGPU] Introduce `nvgpu.warpgroup.mma.store` Op for Hopper GPUs (#65441 ) This PR introduces a new Op called `warpgroup.mma.store` to the NVGPU dialect of MLIR. The purpose of this operation is to facilitate storing fragmented result(s) `nvgpu.warpgroup.accumulator` produced by `warpgroup.mma` to the given memref. An example of a fragmented matrix is given here : https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#wgmma-64n16-d The `warpgroup.mma.store` does the following: 1) Takes one or more `nvgpu.warpgroup.accumulator` type (fragmented results matrix) 2) Calculates indexes per thread in warp-group and stores the data into the given memref. Here's an example usage: ``` // A warpgroup performs GEMM, results in fragmented matrix %result1, %result2 = nvgpu.warpgroup.mma ... // Stores the fragmented result to memref nvgpu.warpgroup.mma.store [%result1, %result2], %matrixD : !nvgpu.warpgroup.accumulator< fragmented = vector<64x128xf32>>, !nvgpu.warpgroup.accumulator< fragmented = vector<64x128xf32>> to memref<128x128xf32,3> ```	2023-10-05 10:54:13 +02:00
Job Noorman	c7d6d62252	[BOLT][RISCV] Implement TLS le/ie relocations (#67112 ) Handle the following relocations related to TLS local-exec and initial-exec: - R_RISCV_TLS_GOT_HI20 - R_RISCV_TPREL_HI20 - R_RISCV_TPREL_ADD - R_RISCV_TPREL_LO12_I - R_RISCV_TPREL_LO12_S In addition, GNU ld has a quirk where after TLS le relaxation, two unofficial relocation types may be emitted: - R_RISCV_TPREL_I - R_RISCV_TPREL_S Since they are unofficial (defined in the reserved range of relocation types), LLVM does not define them. Hence, I've defined them locally in BOLT in a private namespace.	2023-10-05 08:53:51 +00:00
Martin Storsjö	7c5e4e5fa3	Reapply [compiler-rt] Check for and use -lunwind when linking with -nodefaultlibs (#66584 ) If libc++ is available and should be used as the ubsan C++ ABI library, the check for libc++ might fail if libc++ is a static library, as the -nodefaultlibs flag inhibits a potential compiler default -lunwind. Just like the -nodefaultlibs configuration tests for and manually adds a bunch of compiler default libraries, look for -lunwind too. This is a reland of #65912.	2023-10-05 11:41:11 +03:00
Jonas Hahnfeld	26bb22b0c8	Revert "InstCombine: Introduce SimplifyDemandedUseFPClass" It causes a test failure of clang/test/Headers/__clang_hip_math.hip: https://lab.llvm.org/buildbot/#/builders/109/builds/75022 This reverts commit `59c6e2e9c1`.	2023-10-05 10:26:10 +02:00
Owen Pan	8902f12e61	[clang-format][doc] Update the Linux kernel coding style URL	2023-10-05 01:18:49 -07:00
cor3ntin	49666ec038	[Clang] Fix constant evaluating a captured variable in a lambda (#68090 ) with an explicit parameter. We tried to read a pointer to a non-existent `This` APValue when constant-evaluating an explicit object lambda call operator (the `this` pointer is never set in explicit object member functions) Fixes #68070	2023-10-05 10:17:50 +02:00
Guray Ozen	b74cfc139a	[mlir][nvgpu] Improve nvgpu->nvvm transformation of `warpgroup.mma` Op (NFC) (#67325 ) This PR introduces substantial improvements to the readability and maintainability of the `nvgpu.warpgroup.mma` Op transformation from nvgpu->nvvm. This transformation plays a crucial role in GEMM and manages complex operations such as generating multiple wgmma ops and iterating their descriptors. The prior code lacked clarity, but this PR addresses that issue effectively. The PR does the following: Introduces a helper class: `WarpgroupGemm` class encapsulates the necessary functionality, making the code cleaner and more understandable. Detailed Documentation: Each function within the helper class is thoroughly documented to provide clear insights into its purpose and functionality.	2023-10-05 10:16:59 +02:00
Guray Ozen	7eb2b99f16	[mlir] Change the class name of the `GenerateWarpgroupDescriptor` (#68286 )	2023-10-05 10:15:40 +02:00
Nikita Popov	c263639134	[InstSimplify] Add missing const qualifier (NFC) The context instruction is a "const Instruction *", so that's what getWithInstruction() should accept.	2023-10-05 10:05:16 +02:00
Nikita Popov	ba149f6e09	[ValueTracking] Add SimplifyQuery ctor without TLI (NFC) While we pretty much always want to pass DT, AC and CxtI, most places don't care about TLI. Add an overload where this is not one of the first parameters.	2023-10-05 09:55:00 +02:00
Timm Bäder	57147bb253	[clang][Interp] Support LambdaThisCaptures Differential Revision: https://reviews.llvm.org/D154262	2023-10-05 09:46:15 +02:00
Nicolas Vasilache	cc2d9515d0	[mlir][Transform] NFC - Fix missing field in copy constructor	2023-10-05 07:40:35 +00:00
Timm Bäder	4d7f4a7c82	[clang][Interp] Only lazily visit constant globals Differential Revision: https://reviews.llvm.org/D158516	2023-10-05 09:37:37 +02:00
Yusra Syeda	5c4d35d8cf	[SystemZ][z/OS] Update lowerCall (#68259 ) This PR moves some calculation out of `LowerCall` and into `SystemZXPLINKFrameLowering::processFunctionBeforeFrameFinalized`. We need to make this change because LowerCall isn't invoked for functions that don't have function calls, and it is required for some tooling to work correctly. A function that does not make any calls is required to allocate 32 bytes for the parameter area required by the ABI. However, we allocate 64 bytes because this additional space is utilized by certain tools, like the debugger. Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>	2023-10-05 10:32:57 +03:00
Nikita Popov	941c75a530	[ValueTracking] Return ConstantRange instead of setting limits (NFC) Same as previously done for intrinsics.	2023-10-05 09:24:20 +02:00
Guray Ozen	6dc7717bca	[MLIR][NVGPU] Change name `wgmma.descriptor` to `warpgroup.descriptor` (NFC) (#67526 ) NVGPU dialect is gaining large support for warpgroup level operations, and their names always starts with `warpgroup....`. This PR changes name of Op and type from `wgmma.descriptor` to `warpgroup.descriptor` for sake of consistency.	2023-10-05 09:01:48 +02:00
Timm Baeder	5ef904b5da	[clang][ExprConst] Don't try to evaluate value-dependent DeclRefExprs (#67778 ) The Expression here might be value dependent, which makes us run into an assertion later on. Just bail out early. Fixes #67690	2023-10-05 08:42:34 +02:00
Qizhi Hu	eef35c287e	[clang-tidy]: Add TagDecl into LastTagDeclRanges in UseUsingCheck only when it is a definition (#67639 ) Fix issue 67529, [clang-tidy: modernize-use-using fails when type is implicitly forward declared](https://github.com/llvm/llvm-project/issues/67529) The problem is that using `Lexer` to get record declaration will lose the type information when its original type is pointer or reference. This patch fixes this problem by not adding the tag declaration when it's only a 'declaration' and not a 'definition'. Co-authored-by: huqizhi <836744285@qq.com>	2023-10-05 13:49:21 +08:00
Mircea Trofin	a4765c6a02	[mlgo] Fix state-tracking-coro.ll test Post #68263, the inline advisor printer tries to print SCC Nodes' names, but if we perform a full pipeline (like O1), there'll be some DCE-ing happening and the Node pointers kept in the advisor for this (printing) purpose are dangling. Using the more eager printer post each scc inline pass is sufficient.	2023-10-04 22:07:44 -07:00
Yaxun (Sam) Liu	c6ed5a6125	Revert "[HIP] Support compressing device binary (#67162 )" This reverts commit `a1e81d2ead`. Revert "Fix test hip-offload-compress-zlib.hip" This reverts commit `ba01ce6066`. Revert due to sanity fail at https://lab.llvm.org/buildbot/#/builders/5/builds/37188 https://lab.llvm.org/buildbot/#/builders/238/builds/5955 /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1012:25: runtime error: load of misaligned address 0xaaaae2d90e7c for type 'const uint64_t' (aka 'const unsigned long'), which requires 8 byte alignment 0xaaaae2d90e7c: note: pointer points here bc 00 00 00 94 dc 29 9a 89 fb ca 2b 78 9c 8b 8f 77 f6 71 f4 73 8f f7 77 73 f3 f1 77 74 89 77 0a ^ #0 0xaaaaba125f70 in clang::CompressedOffloadBundle::decompress(llvm::MemoryBuffer const&, bool) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1012:25 #1 0xaaaaba126150 in clang::OffloadBundler::ListBundleIDsInFile(llvm::StringRef, clang::OffloadBundlerConfig const&) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1089:7 Will reland after fixing it.	2023-10-05 00:29:42 -04:00
MaheshRavishankar	f28f09dcf0	[mlir][Vector] Add Broadcast -> CastOp reordering to SinkVectorBroadcasting patterns. (#68257 ) Also fix an issue with sink broadcast across elementwise where `arith.cmpf` is elementwise, but result type is different. The result type is not same as the operand type, creating illegal IR. Similar issue with `vector.fma` which only accepts vector operand types, while broadcasts can have scalar sources. Sinking broadcast across would result in an illegal `vector.fma` (with scalar operands).	2023-10-04 21:27:24 -07:00
Mircea Trofin	1b3fc40586	[mlgo][coro] Assign coro split-ed functions a `FunctionLevel` (#68263 )	2023-10-04 21:20:00 -07:00
Matt Arsenault	59c6e2e9c1	InstCombine: Introduce SimplifyDemandedUseFPClass This is the floating-point analog of SimplifyDemandedBits. If we know the edge cases are assumed impossible in uses, it's possible to prune upstream edge case handling. Start by only using this on returns in functions with nofpclass returns (where I'm surprised there are no other combines), but this can be extended to include any other nofpclass use or FPMathOperator with flags. Partially addresses issue #64870 https://reviews.llvm.org/D158648	2023-10-04 21:06:24 -07:00
Kazu Hirata	bbdbcd83e6	[Support] Rename llvm::support::endianness to llvm::endianness (#68174 ) As part of an effort to make our codebase ready for the migration from llvm::support::endianness to std::endian in C++20, this patch renames llvm::support::endianness to llvm::endianness. The intent of this patch is to make fully qualified names less painful. That is, with this patch, we can just say llvm::endianness::big rather than llvm::support::endianness::big. I'm not renaming llvm::support::endianness to llvm::endian because we have a lot of places with "using namespace support;" where it would be ambiguous whether "endian" refers to llvm::endian or llvm::support::endian. This patch defines several helpers for gradual migration: namespace llvm { namespace support { using endianness = llvm::endianness; constexpr llvm::endianness big = llvm::endianness::big; constexpr llvm::endianness little = llvm::endianness::little; constexpr llvm::endianness native = llvm::endianness::native; While we are at it, this patch changes the enum to "enum class". The "enum class" prevents implicit conversions from endianness to bool. I've fixed three such instances of implicit conversions: `95f4b2a708` `8de2ecc2e7` `a7517e12ca`	2023-10-04 20:34:02 -07:00
Kazu Hirata	f37028c2cc	[Support] Rename HashBuilderImpl to HashBuilder (NFC) (#68173 ) Commit `9370271ec5` made HashBuilder an alias for HashBuilderImpl: template <class HasherT, support::endianness Endianness> using HashBuilder = HashBuilderImpl<HasherT, Endianness>; This patch renames HashBuilderImpl to HashBuilder while removing the alias above.	2023-10-04 20:33:38 -07:00
Maksim Levental	6f44f87011	[mlir][python] Enable py312. (#68009 ) Python 3.12 has been released so why not support it.	2023-10-04 20:35:24 -05:00
Bill Wendling	9a954c6935	[Clang] Implement the 'counted_by' attribute The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member in the same structure holding the count of elements in the flexible array. This information can be used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; This establishes a relationship between 'array' and 'count', specifically that 'p->array' must have at least 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained through changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not being detected: struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } void use_foo(int index) { p->count += 42; p->array[index] = 0; /* The sanitizer cannot properly check this access */ } Reviewed By: nickdesaulniers, aaron.ballman Differential Revision: https://reviews.llvm.org/D148381	2023-10-04 18:26:15 -07:00
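
The relationship described in commit 9a954c6935 ('array' must always have at least 'count' elements available) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the commit: the `COUNTED_BY` macro, the `int_buf` type and its helper functions. The macro assumes the compiler exposes `__has_attribute`; where 'counted_by' is not supported it expands to nothing, so the code still compiles but gains no extra checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper macro: apply counted_by only where the compiler
   reports support for it; otherwise expand to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define COUNTED_BY(member) __attribute__((counted_by(member)))
#  endif
#endif
#ifndef COUNTED_BY
#  define COUNTED_BY(member)
#endif

struct int_buf {
    size_t count;                 /* number of elements allocated in data[] */
    int data[] COUNTED_BY(count); /* flexible array member */
};

/* Allocate room for exactly `count` elements and record that same number,
   so `count` never claims more elements than were allocated. */
static struct int_buf *int_buf_alloc(size_t count) {
    /* Reject sizes whose byte count would overflow size_t. */
    if (count > (SIZE_MAX - sizeof(struct int_buf)) / sizeof(int))
        return NULL;
    struct int_buf *p = malloc(sizeof(struct int_buf) + count * sizeof(int));
    if (p == NULL)
        return NULL;
    p->count = count; /* set the count before touching data[] */
    for (size_t i = 0; i < count; i++)
        p->data[i] = 0;
    return p;
}

/* Store a value; returns 0 on success, -1 if index is out of bounds. */
static int int_buf_set(struct int_buf *p, size_t index, int value) {
    if (index >= p->count)
        return -1;
    p->data[index] = value;
    return 0;
}

/* Read a value; the caller must pass an in-bounds index. */
static int int_buf_get(const struct int_buf *p, size_t index) {
    assert(index < p->count);
    return p->data[index];
}
```

A caller would obtain a buffer with `int_buf_alloc(n)`, access it only through `int_buf_set` and `int_buf_get`, and release it with `free`. Unlike the erroneous examples in the commit message, `count` is written once, with the allocated element count, and never increased afterwards.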


476866 Commits