llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-12-12 01:18:53 +00:00

Author	SHA1	Message	Date
Marek Kurdej	d2cb198f25	[libc++] Make future_error constructor standard-compliant This patch removes the non-compliant constructor of std::future_error and adds the standards-compliant constructor in C++17 instead. Note that we can't support the constructor as an extension in all standard modes because it uses delegating constructors, which require C++11. We could in theory support the constructor as an extension in C++11 and C++14 only, however I believe it is acceptable not to do that since I expect the breakage from this patch will be minimal. If it turns out that more code than we expect is broken by this, we can reconsider that decision. This was found during D99515. Differential Revision: https://reviews.llvm.org/D99567 Co-authored-by: Louis Dionne <ldionne.2@gmail.com>	2023-10-05 09:11:49 -04:00
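To illustrate the constructor the commit above refers to, here is a minimal sketch assuming a C++17 (or later) standard library that provides `std::future_error(std::future_errc)`. The helper names `makeBrokenPromiseError`, `isBrokenPromise` and `checkFutureError` are hypothetical and are not part of libc++ or of the patch.

```cpp
#include <cassert>
#include <future>
#include <system_error>

// Construct the exception directly from a std::future_errc enumerator.
// This uses the C++17 constructor; the helper name is illustrative only.
inline std::future_error makeBrokenPromiseError() {
    return std::future_error(std::future_errc::broken_promise);
}

// True when the error carries the broken_promise error code.
inline bool isBrokenPromise(const std::future_error& e) {
    return e.code() == std::make_error_code(std::future_errc::broken_promise);
}

// Small sanity check that can be called from main().
inline void checkFutureError() {
    assert(isBrokenPromise(makeBrokenPromiseError()));
}
```

Code that previously built a `std::future_error` from a `std::error_code` would, under this assumption, switch to passing the enumerator directly.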
Alexey Bataev	2c49311dea	[SLP][NFC]Add insertsubvector test with small source vector, NFC.	2023-10-05 06:03:58 -07:00
Matt Arsenault	bad5893c30	Attributor: Fix not propagating nofpclass arguments through transitive callers Fixes #64867	2023-10-05 06:03:40 -07:00
Matt Arsenault	75a3cc9c92	Attributor: Add a few nofpclass tests	2023-10-05 06:03:39 -07:00
Aaron Ballman	dc1000d5b2	Revert "[C2X] N3007 Type inference for object definitions" This reverts commit `5d78b78c85`. Reverting due to the failure found by: https://lab.llvm.org/buildbot/#/builders/245/builds/14999	2023-10-05 08:52:12 -04:00
Yingwei Zheng	c73d5544d9	[CVP] Add additional cttz tests. NFC.	2023-10-05 20:49:44 +08:00
cor3ntin	6989c4842f	[Documentation] Fix some invalid references in sphinx documentation (#68239 )	2023-10-05 14:40:59 +02:00
Nikita Popov	236228f43d	[BitcodeReader] Replace unsupported constexprs in metadata with undef Metadata (via ValueAsMetadata) can reference constant expressions that may no longer be supported. These references can both be in function-local metadata and module metadata, if the same expression is used in multiple functions. At least in theory, such references could also be in metadata proper, rather than just inside ValueAsMetadata references in calls. Instead of trying to expand these expressions (which we can't reliably do), pretend that the constant has been deleted, which means that ValueAsMetadata references will get replaced with undef metadata. Fixes https://github.com/llvm/llvm-project/issues/68281.	2023-10-05 14:38:25 +02:00
Matt Arsenault	2ca30eb8fd	AMDGPU/GlobalISel: Handle mubuf load/store for more types (#68268 ) Fixes MUBUF path for most vectors and pointers, which unblocks fixing the gfx6/7 run lines in assorted tests. Also fixes inconsistent behavior for -flat-for-global.	2023-10-05 05:36:16 -07:00
Matthias Springer	ea71d2d0fe	[mlir][tensor][bufferize] Reshapes: Fix memory side effects and memory space (#68195 ) * `tensor.collapse_shape` may bufferize to a memory read because the op may have to reallocate the source buffer. * `tensor.reshape` should not use `bufferization.clone` for reallocation. This op has requirements wrt. the order of buffer writes/reads. Use `memref.alloc` and `memref.copy` instead. Also fix a bug where the memory space of the source buffer was not propagated to the reallocated buffer.	2023-10-05 14:33:04 +02:00
qcolombet	932dc9d8c4	[mlir][MemRef] Add a pattern to simplify `extract_strided_metadata(ca… (#68291 ) …st)` `expand-strided-metadata` was missing a pattern to get rid of `memref.cast`. The pattern is straightforward: Produce a new `extract_strided_metadata` with the source of the cast and fold the static information (sizes, strides, offset) along the way.	2023-10-05 14:32:42 +02:00
Yingwei Zheng	253ee85f34	[CVP] Add pre-commit cttz/ctpop tests. NFC.	2023-10-05 20:32:00 +08:00
Giulio Eulisse	2fab15d8fd	Inline operator== and operator!= (#67958 ) Avoid triggering -Wnon-template-friend on newer GCC. --------- Co-authored-by: Richard Smith <richard@metafoo.co.uk>	2023-10-05 15:31:28 +03:00
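The general technique named in the commit above, defining comparison operators inline inside a class template, can be sketched as follows. This is not the code touched by the patch: `Box` is a hypothetical type used only to show the pattern. A friend that is merely declared inside a class template and defined elsewhere is the situation GCC's `-Wnon-template-friend` warns about. Defining the friend in the class body avoids that.

```cpp
#include <cassert>

// Hypothetical wrapper type, used only to demonstrate inline friend operators.
template <typename T>
class Box {
public:
    explicit Box(T v) : value_(v) {}

    // Defined inline in the class body, so each instantiation of Box<T>
    // gets its own non-template friend with a definition attached.
    friend bool operator==(const Box& a, const Box& b) {
        return a.value_ == b.value_;
    }
    friend bool operator!=(const Box& a, const Box& b) {
        return !(a == b);
    }

private:
    T value_;
};

// Small sanity check that can be called from main().
inline void checkBoxComparison() {
    assert(Box<int>(1) == Box<int>(1));
    assert(Box<int>(1) != Box<int>(2));
}
```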
Aaron Ballman	96dd50ee83	Fix LLVM Sphinx build	2023-10-05 08:12:58 -04:00
Guillot Tony	5d78b78c85	[C2X] N3007 Type inference for object definitions This patch implements the auto keyword from the N3007 standard specification. This allows deducing the type of the variable like in C++: ``` auto nb = 1; auto chr = 'A'; auto str = "String"; ``` The list of statements which allow the usage of auto: * Basic variable declarations (int, float, double, char, char...) * Macros declaring a variable with the auto type The list of statements which will not work with the auto keyword: * auto arrays * sizeof(), alignas() * auto parameters, auto return type * auto as a struct/typedef member * uninitialized auto variables * auto in a union * auto as an enum type specifier * auto casts * auto in compound literals Differential Revision: https://reviews.llvm.org/D133289	2023-10-05 08:11:02 -04:00
Matthias Springer	58678d3bcf	[mlir][tensor][bufferize] `tensor.empty` bufferizes to allocation (#68201 ) `BufferizableOpInterface::bufferizesToAllocation` is queried when forming equivalence sets during bufferization. It is not really needed for ops like `tensor.empty` which do not have tensor operands, but it should be added for consistency. This change should have been part of #68080. No test is added because the return value of this function is irrelevant for ops without tensor operands. (However, this function acts as a form of documentation, describing the bufferization semantics of the op.)	2023-10-05 14:06:00 +02:00
Matthias Springer	5958043e2d	[mlir][bufferization] Add `dump_alias_sets` option to transform op (#68289 ) Add `dump_alias_sets` to `transform.bufferization.one_shot_bufferize`. This option is useful for debugging. Also improve the verifier to ensure that `test_analysis_only` is set when other debugging flags are enabled.	2023-10-05 14:05:45 +02:00
Yingwei Zheng	33a194b158	[InstCombine] Add pre-commit tests for #67915 . NFC.	2023-10-05 20:01:55 +08:00
Kohei Yamaguchi	777a6e6f10	[mlir][docs] Cleanup documentations [NFC] (#67945 ) - Fix missing links - Fix missing link format - Move transform::ApplyFuncToLLVMConversionPatternOp into Transform dialect - Remove duplicated MemRef's TOC - Remove duplicated Memref's dma_start/dma_wait docs	2023-10-05 13:33:41 +02:00
Ivan Kosarev	f04aa1f814	[AMDGPU][CodeGen] Fold immediates in src1 operands of V_MAD/MAC/FMA/FMAC. (#68002 )	2023-10-05 14:22:29 +03:00
Bogdan Graur	821dfc392a	Revert "[X86] Change target of __builtin_ia32_cmp[p\|s][s\|d] from avx into sse/sse2 (#67410 )" Does not respect `__attribute__((target("avx"))`. This reverts commit `ccd5b8db48`.	2023-10-05 10:33:44 +00:00
Simon Pilgrim	baecc9e997	[CostModel][X86] getShuffleCost - add fallback (to half vector) for bfloat vector shuffle costs Add initial half/bfloat broadcast shuffles test coverage (more to follow) Fixes #68117 - which was stuck in a loop between getting scalarized insert/extract costs for the shuffle and then trying to convert a bfloat insert into a shuffle again......	2023-10-05 11:12:40 +01:00
Jonas Hahnfeld	abb9eb2778	[Lex] Handle repl_input_end in Preprocessor::LexTokensUntilEOF() This fixes many unit tests when trying to enable IncrementalExtensions by default for testing purposes. Differential Revision: https://reviews.llvm.org/D158415	2023-10-05 12:09:14 +02:00
Mats Petersson	6180964a01	[flang]Pass to add vscale range attribute (#68103 ) Add vscale range attribute for the Scalable Vector Extension (SVE) if provided on the command-line (options in a previous commit). If no command-line option is provided, the target-feature of SVE is specified, and the architecture is AArch64, it defaults to 128-2048, in other words a vscale-min of 1 and a vscale-max of 16. A pass is used to add the attribute to all functions. The vectorizer will use this attribute to generate the SVE instruction to match the range specified. The attribute is harmless if there are no vectorizable operations in the function.	2023-10-05 11:06:00 +01:00
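The arithmetic behind the default in the commit above can be checked with a short sketch. It assumes the 128-2048 range denotes the SVE vector length in bits and that vscale counts 128-bit granules. The names `kSveGranuleBits`, `vscaleForVectorBits` and `checkVscaleRange` are hypothetical and do not come from flang or LLVM.

```cpp
#include <cassert>

// Assumption: one unit of vscale corresponds to a 128-bit SVE granule.
constexpr unsigned kSveGranuleBits = 128;

// Convert a vector length in bits to the matching vscale value.
constexpr unsigned vscaleForVectorBits(unsigned vectorBits) {
    return vectorBits / kSveGranuleBits;
}

// The default range quoted in the commit message: 128 to 2048 bits.
static_assert(vscaleForVectorBits(128) == 1, "vscale-min should be 1");
static_assert(vscaleForVectorBits(2048) == 16, "vscale-max should be 16");

// Runtime version of the same check, callable from main().
inline void checkVscaleRange() {
    assert(vscaleForVectorBits(128) == 1);
    assert(vscaleForVectorBits(2048) == 16);
}
```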
long.chen	5979e1dfb1	[mlir] Fix `empty-tensor-elimination` around self-copies (#68129 ) * Fixes #67977, a crash in `empty-tensor-elimination`. * Also improves `linalg.copy` canonicalization. * Also improves indentation in `mlir-linalg-ods-yaml-gen.cpp`.	2023-10-05 12:04:20 +02:00
Michael Buch	3a35ca01fc	[lldb][DWARFASTParserClang][NFCI] Extract DW_AT_data_member_location calculation logic (#68231 ) Currently this non-trivial calculation is repeated multiple times, making it hard to reason about when the `byte_offset`/`member_byte_offset` is being set or not. This patch simply moves all those instances of the same calculation into a helper function. We return an optional to remain an NFC patch. Default initializing the offset would make sense but requires further analysis and can be done in a follow-up patch.	2023-10-05 10:49:42 +01:00
cor3ntin	c72d3a0966	[Clang] Handle consteval expression in array bounds expressions (#66222 ) The bounds of a c++ array is a _constant-expression_. And in C++ it is also a constant expression. But we also support VLAs, ie arrays with non-constant bounds. We need to take care to handle the case of a consteval function (which are specified to be only immediately called in non-constant contexts) that appears in array bounds. This introduces `Sema::isAlwayConstantEvaluatedContext`, and a flag in ExpressionEvaluationContextRecord, such that immediate functions in array bounds are always immediately invoked. Sema had both `isConstantEvaluatedContext` and `isConstantEvaluated`, so I took the opportunity to clean that up. The change in `TimeProfilerTest.cpp` is an unfortunate manifestation of the problem that #66203 seeks to address. Fixes #65520	2023-10-05 11:36:27 +02:00
Christian Sigg	c64a098ee4	[GVN] Fix after `46aac949bc` replaceUsersOf -> removeUsersOf	2023-10-05 11:31:35 +02:00
tdanyluk	a608830807	[mlir] Speed up FuncToLLVM using a SymbolTable (#68082 ) We have a project where this saves 23% of the compilation time. This means using hashmaps instead of searching in linked lists.	2023-10-05 11:24:52 +02:00
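The trade-off described in the commit above (hash-map lookup instead of searching a linked list) can be sketched in plain C++. This deliberately does not use MLIR's `SymbolTable` API. `Func`, `findLinear` and `NameIndex` are hypothetical stand-ins that only show why building an index once pays off when many lookups follow.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a named function-like entity.
struct Func {
    std::string name;
};

// Linear scan over a linked list: every lookup walks the list, O(n).
inline const Func* findLinear(const std::list<Func>& funcs,
                              const std::string& name) {
    for (const Func& f : funcs) {
        if (f.name == name) return &f;
    }
    return nullptr;
}

// Index built once up front: each later lookup is O(1) on average.
class NameIndex {
public:
    explicit NameIndex(const std::list<Func>& funcs) {
        for (const Func& f : funcs) index_.emplace(f.name, &f);
    }

    const Func* lookup(const std::string& name) const {
        auto it = index_.find(name);
        return it == index_.end() ? nullptr : it->second;
    }

private:
    // Pointers stay valid because std::list never relocates its elements.
    std::unordered_map<std::string, const Func*> index_;
};
```

The index has to be rebuilt or updated if the underlying list changes, which is the usual cost of this kind of cache.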
Rin	d3e4702c0f	[AArch64] [LoopVectorize] Use either fixed-width or scalable VF when tail-folding (#67543 ) Since the getMaximisedVFForTarget function is called twice, once for fixed-width and once for scalable, it adds no value to always return a fixed-width VF. Instead, when we are tail-folding, we can use either fixed-width or scalable vectors.	2023-10-05 10:24:30 +01:00
Nikita Popov	46aac949bc	[GVN] Remove users from ICF when RAUWing loads When performing store to load forwarding, replacing users of the load may turn an indirect call into one with a known callee, in which case it might become willreturn, invalidating cached ICF information. Avoid this by removing users. This is a bit more aggressive than strictly necessary (e.g. this shouldn't be necessary when doing load-load CSE), but better safe than sorry. Fixes https://github.com/llvm/llvm-project/issues/48805.	2023-10-05 11:21:33 +02:00
Christian Sigg	59e75b7df2	[mlir][bazel] Sort targets list.	2023-10-05 11:14:12 +02:00
Christian Sigg	2f1c78014f	[mlir][bazel] Fix after `d20fbc9007`	2023-10-05 11:12:55 +02:00
Guray Ozen	29b33e8397	[bazel] fix typo	2023-10-05 11:08:46 +02:00
Jonas Hahnfeld	3116d60494	[Lex] Introduce Preprocessor::LexTokensUntilEOF() This new method repeatedly calls Lex() until end of file is reached and optionally fills a std::vector of Tokens. Use it in Clang's unit tests to avoid quite some code duplication. Differential Revision: https://reviews.llvm.org/D158413	2023-10-05 11:04:07 +02:00
Job Noorman	7fa33773e3	[BOLT][RISCV] Handle long tail calls (#67098 ) Long tail calls use the following instruction sequence on RISC-V: ``` 1: auipc xi, %pcrel_hi(sym) jalr zero, %pcrel_lo(1b)(xi) ``` Since the second instruction in isolation looks like an indirect branch, this confused BOLT and most functions containing a long tail call got marked with "unknown control flow" and didn't get optimized as a consequence. This patch fixes this by detecting the long tail call sequence in `analyzeIndirectBranch`. `FixRISCVCallsPass` also had to be updated to expand long tail calls to `PseudoTAIL` instead of `PseudoCALL`. Besides this, this patch also fixes a minor issue with compressed tail calls (`c.jr`) not being detected. Note that I had to change `BinaryFunction::postProcessIndirectBranches` slightly: the documentation of `MCPlusBuilder::analyzeIndirectBranch` mentions that the [`Begin`, `End`) range contains the instructions immediately preceding `Instruction`. However, in `postProcessIndirectBranches`, all the instructions in the BB were passed in the range. This made it difficult to find the preceding instruction so I made sure only the preceding instructions are passed.	2023-10-05 08:55:30 +00:00
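The `auipc`/`jalr` pair in the commit above splits a PC-relative offset into an upper 20-bit part and a sign-extended lower 12-bit part. The sketch below shows that split under the usual RISC-V convention that `%pcrel_hi` is rounded so the low part fits in [-2048, 2047]. It is not BOLT code, and `PcrelParts`, `floorDiv4096`, `splitPcrel` and `checkSplitPcrel` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// The two immediates carried by the auipc/jalr pair (illustrative type).
struct PcrelParts {
    std::int64_t hi20;  // value placed in auipc, later shifted left by 12
    std::int64_t lo12;  // signed 12-bit immediate placed in jalr
};

// Floor division by 4096 that avoids shifting negative values.
inline std::int64_t floorDiv4096(std::int64_t v) {
    return v >= 0 ? v / 4096 : -((-v + 4095) / 4096);
}

// Split a PC-relative offset so that hi20 * 4096 + lo12 == offset.
inline PcrelParts splitPcrel(std::int64_t offset) {
    // Adding 0x800 rounds to the nearest multiple of 4096, which keeps
    // the remainder inside the signed 12-bit range.
    std::int64_t hi = floorDiv4096(offset + 0x800);
    std::int64_t lo = offset - hi * 4096;
    return PcrelParts{hi, lo};
}

// Small sanity check that can be called from main().
inline void checkSplitPcrel() {
    PcrelParts p = splitPcrel(0x1800);
    assert(p.hi20 == 2 && p.lo12 == -2048);
    assert(p.hi20 * 4096 + p.lo12 == 0x1800);
}
```

An analysis that recognises the pair has to recombine both halves, which is why the `jalr` cannot be interpreted in isolation.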
Guray Ozen	d20fbc9007	[MLIR][NVGPU] Introduce `nvgpu.warpgroup.mma.store` Op for Hopper GPUs (#65441 ) This PR introduces a new Op called `warpgroup.mma.store` to the NVGPU dialect of MLIR. The purpose of this operation is to facilitate storing fragmented result(s) `nvgpu.warpgroup.accumulator` produced by `warpgroup.mma` to the given memref. An example of a fragmented matrix is given here: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#wgmma-64n16-d The `warpgroup.mma.store` does the following: 1) Takes one or more `nvgpu.warpgroup.accumulator` type (fragmented results matrix) 2) Calculates indexes per thread in warp-group and stores the data into the given memref. Here's an example usage: ``` // A warpgroup performs GEMM, results in fragmented matrix %result1, %result2 = nvgpu.warpgroup.mma ... // Stores the fragmented result to memref nvgpu.warpgroup.mma.store [%result1, %result2], %matrixD : !nvgpu.warpgroup.accumulator< fragmented = vector<64x128xf32>>, !nvgpu.warpgroup.accumulator< fragmented = vector<64x128xf32>> to memref<128x128xf32,3> ```	2023-10-05 10:54:13 +02:00
Job Noorman	c7d6d62252	[BOLT][RISCV] Implement TLS le/ie relocations (#67112 ) Handle the following relocations related to TLS local-exec and initial-exec: - R_RISCV_TLS_GOT_HI20 - R_RISCV_TPREL_HI20 - R_RISCV_TPREL_ADD - R_RISCV_TPREL_LO12_I - R_RISCV_TPREL_LO12_S In addition, GNU ld has a quirk where after TLS le relaxation, two unofficial relocation types may be emitted: - R_RISCV_TPREL_I - R_RISCV_TPREL_S Since they are unofficial (defined in the reserved range of relocation types), LLVM does not define them. Hence, I've defined them locally in BOLT in a private namespace.	2023-10-05 08:53:51 +00:00
Martin Storsjö	7c5e4e5fa3	Reapply [compiler-rt] Check for and use -lunwind when linking with -nodefaultlibs (#66584 ) If libc++ is available and should be used as the ubsan C++ ABI library, the check for libc++ might fail if libc++ is a static library, as the -nodefaultlibs flag inhibits a potential compiler default -lunwind. Just like the -nodefaultlibs configuration tests for and manually adds a bunch of compiler default libraries, look for -lunwind too. This is a reland of #65912.	2023-10-05 11:41:11 +03:00
Jonas Hahnfeld	26bb22b0c8	Revert "InstCombine: Introduce SimplifyDemandedUseFPClass" It causes a test failure of clang/test/Headers/__clang_hip_math.hip: https://lab.llvm.org/buildbot/#/builders/109/builds/75022 This reverts commit `59c6e2e9c1`.	2023-10-05 10:26:10 +02:00
Owen Pan	8902f12e61	[clang-format][doc] Update the Linux kernel coding style URL	2023-10-05 01:18:49 -07:00
cor3ntin	49666ec038	[Clang] Fix constant evaluating a captured variable in a lambda (#68090 ) with an explicit parameter. We tried to read a pointer to a non-existent `This` APValue when constant-evaluating an explicit object lambda call operator (the `this` pointer is never set in explicit object member functions) Fixes #68070	2023-10-05 10:17:50 +02:00
Guray Ozen	b74cfc139a	[mlir][nvgpu] Improve nvgpu->nvvm transformation of `warpgroup.mma` Op (NFC) (#67325 ) This PR introduces substantial improvements to the readability and maintainability of the `nvgpu.warpgroup.mma` Op transformation from nvgpu->nvvm. This transformation plays a crucial role in GEMM and manages complex operations such as generating multiple wgmma ops and iterating their descriptors. The prior code lacked clarity, but this PR addresses that issue effectively. This PR does the following: Introduces a helper class: `WarpgroupGemm` class encapsulates the necessary functionality, making the code cleaner and more understandable. Detailed Documentation: Each function within the helper class is thoroughly documented to provide clear insights into its purpose and functionality.	2023-10-05 10:16:59 +02:00
Guray Ozen	7eb2b99f16	[mlir] Change the class name of the `GenerateWarpgroupDescriptor` (#68286 )	2023-10-05 10:15:40 +02:00
Nikita Popov	c263639134	[InstSimplify] Add missing const qualifier (NFC) The context instruction is a "const Instruction *", so that's what getWithInstruction() should accept.	2023-10-05 10:05:16 +02:00
Nikita Popov	ba149f6e09	[ValueTracking] Add SimplifyQuery ctor without TLI (NFC) While we pretty much always want to pass DT, AC and CxtI, most places don't care about TLI. Add an overload where this is not one of the first parameters.	2023-10-05 09:55:00 +02:00
Timm Bäder	57147bb253	[clang][Interp] Support LambdaThisCaptures Differential Revision: https://reviews.llvm.org/D154262	2023-10-05 09:46:15 +02:00
Nicolas Vasilache	cc2d9515d0	[mlir][Transform] NFC - Fix missing field in copy constructor	2023-10-05 07:40:35 +00:00
Timm Bäder	4d7f4a7c82	[clang][Interp] Only lazily visit constant globals Differential Revision: https://reviews.llvm.org/D158516	2023-10-05 09:37:37 +02:00
Yusra Syeda	5c4d35d8cf	[SystemZ][z/OS] Update lowerCall (#68259 ) This PR moves some calculation out of `LowerCall` and into `SystemZXPLINKFrameLowering::processFunctionBeforeFrameFinalized`. We need to make this change because LowerCall isn't invoked for functions that don't have function calls, and it is required for some tooling to work correctly. A function that does not make any calls is required to allocate 32 bytes for the parameter area required by the ABI. However, we allocate 64 bytes because this additional space is utilized by certain tools, like the debugger. Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>	2023-10-05 10:32:57 +03:00

476879 Commits