Commit Graph

476879 Commits

Author SHA1 Message Date
Marek Kurdej
d2cb198f25 [libc++] Make future_error constructor standard-compliant
This patch removes the non-compliant constructor of std::future_error
and adds the standards-compliant constructor in C++17 instead.

Note that we can't support the constructor as an extension in all
standard modes because it uses delegating constructors, which require
C++11. We could in theory support the constructor as an extension in
C++11 and C++14 only, however I believe it is acceptable not to do that
since I expect the breakage from this patch will be minimal.

If it turns out that more code than we expect is broken by this, we can
reconsider that decision.

This was found during D99515.

Differential Revision: https://reviews.llvm.org/D99567
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2023-10-05 09:11:49 -04:00
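As a rough illustration of the standard-compliant constructor described in the entry above, the sketch below builds a `std::future_error` directly from a `std::future_errc` value, which is the C++17 form. The helper name `brokenPromiseCode` is hypothetical and only exists for this example; the sketch assumes a C++17 standard library.

```cpp
#include <future>
#include <system_error>

// Hypothetical helper for illustration: constructs a future_error from an
// error enumerator (the C++17 constructor) and returns its error code.
inline std::error_code brokenPromiseCode() {
  std::future_error err(std::future_errc::broken_promise);
  return err.code();
}
```

Code that relied on the old non-standard constructor would need to switch to this form.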
Alexey Bataev
2c49311dea [SLP][NFC]Add insertsubvector test with small source vector, NFC. 2023-10-05 06:03:58 -07:00
Matt Arsenault
bad5893c30 Attributor: Fix not propagating nofpclass arguments through transitive callers
Fixes #64867
2023-10-05 06:03:40 -07:00
Matt Arsenault
75a3cc9c92 Attributor: Add a few nofpclass tests 2023-10-05 06:03:39 -07:00
Aaron Ballman
dc1000d5b2 Revert "[C2X] N3007 Type inference for object definitions"
This reverts commit 5d78b78c85.

Reverting due to the failure found by:
https://lab.llvm.org/buildbot/#/builders/245/builds/14999
2023-10-05 08:52:12 -04:00
Yingwei Zheng
c73d5544d9
[CVP] Add additional cttz tests. NFC. 2023-10-05 20:49:44 +08:00
cor3ntin
6989c4842f
[Documentation] Fix some invalid references in sphinx documentation (#68239) 2023-10-05 14:40:59 +02:00
Nikita Popov
236228f43d [BitcodeReader] Replace unsupported constexprs in metadata with undef
Metadata (via ValueAsMetadata) can reference constant expressions
that may no longer be supported. These references can both be in
function-local metadata and module metadata, if the same expression
is used in multiple functions. At least in theory, such references
could also be in metadata proper, rather than just inside
ValueAsMetadata references in calls.

Instead of trying to expand these expressions (which we can't
reliably do), pretend that the constant has been deleted, which
means that ValueAsMetadata references will get replaced with
undef metadata.

Fixes https://github.com/llvm/llvm-project/issues/68281.
2023-10-05 14:38:25 +02:00
Matt Arsenault
2ca30eb8fd
AMDGPU/GlobalISel: Handle mubuf load/store for more types (#68268)
Fixes MUBUF path for most vectors and pointers, which unblocks fixing
the gfx6/7 run lines in assorted tests. Also fixes inconsistent behavior
for -flat-for-global.
2023-10-05 05:36:16 -07:00
Matthias Springer
ea71d2d0fe
[mlir][tensor][bufferize] Reshapes: Fix memory side effects and memory space (#68195)
* `tensor.collapse_shape` may bufferize to a memory read because the op
may have to reallocate the source buffer.
* `tensor.reshape` should not use `bufferization.clone` for
reallocation. This op has requirements wrt. the order of buffer
writes/reads. Use `memref.alloc` and `memref.copy` instead. Also fix a
bug where the memory space of the source buffer was not propagated to
the reallocated buffer.
2023-10-05 14:33:04 +02:00
qcolombet
932dc9d8c4
[mlir][MemRef] Add a pattern to simplify `extract_strided_metadata(cast)` (#68291)

`expand-strided-metadata` was missing a pattern to get rid of
`memref.cast`.
The pattern is straightforward:
Produce a new `extract_strided_metadata` with the source of the cast and
fold the static information (sizes, strides, offset) along the way.
2023-10-05 14:32:42 +02:00
Yingwei Zheng
253ee85f34
[CVP] Add pre-commit cttz/ctpop tests. NFC. 2023-10-05 20:32:00 +08:00
Giulio Eulisse
2fab15d8fd
Inline operator== and operator!= (#67958)
Avoid triggering -Wnon-template-friend on newer GCC.

---------

Co-authored-by: Richard Smith <richard@metafoo.co.uk>
2023-10-05 15:31:28 +03:00
Aaron Ballman
96dd50ee83 Fix LLVM Sphinx build 2023-10-05 08:12:58 -04:00
Guillot Tony
5d78b78c85 [C2X] N3007 Type inference for object definitions
This patch implements the auto keyword from the N3007 standard
specification.
This allows deducing the type of the variable like in C++:
```
auto nb = 1;
auto chr = 'A';
auto str = "String";
```
The list of statements which allow the usage of auto:

    * Basic variables declarations (int, float, double, char, char*...)
    * Macros declaring a variable with the auto type

The list of statements which will not work with the auto keyword:

    * auto arrays
    * sizeof(), alignas()
    * auto parameters, auto return type
    * auto as a struct/typedef member
    * uninitialized auto variables
    * auto in a union
    * auto as an enum type specifier
    * auto casts
    * auto in compound literals

Differential Revision: https://reviews.llvm.org/D133289
2023-10-05 08:11:02 -04:00
Matthias Springer
58678d3bcf
[mlir][tensor][bufferize] tensor.empty bufferizes to allocation (#68201)
`BufferizableOpInterface::bufferizesToAllocation` is queried when
forming equivalence sets during bufferization. It is not really needed
for ops like `tensor.empty` which do not have tensor operands, but it
should be added for consistency.

This change should have been part of #68080. No test is added because
the return value of this function is irrelevant for ops without tensor
operands. (However, this function acts as a form of documentation,
describing the bufferization semantics of the op.)
2023-10-05 14:06:00 +02:00
Matthias Springer
5958043e2d
[mlir][bufferization] Add dump_alias_sets option to transform op (#68289)
Add `dump_alias_sets` to `transform.bufferization.one_shot_bufferize`.
This option is useful for debugging. Also improve the verifier to ensure
that `test_analysis_only` is set when other debugging flags are enabled.
2023-10-05 14:05:45 +02:00
Yingwei Zheng
33a194b158
[InstCombine] Add pre-commit tests for #67915. NFC. 2023-10-05 20:01:55 +08:00
Kohei Yamaguchi
777a6e6f10
[mlir][docs] Cleanup documentations [NFC] (#67945)
- Fix missing links
- Fix missing link format
- Move transform::ApplyFuncToLLVMConversionPatternOp into Transform
dialect
- Remove duplicated MemRef's TOC
- Remove duplicated Memref's dma_start/dma_wait docs
2023-10-05 13:33:41 +02:00
Ivan Kosarev
f04aa1f814
[AMDGPU][CodeGen] Fold immediates in src1 operands of V_MAD/MAC/FMA/FMAC. (#68002) 2023-10-05 14:22:29 +03:00
Bogdan Graur
821dfc392a Revert "[X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (#67410)"
Does not respect `__attribute__((target("avx")))`.

This reverts commit ccd5b8db48.
2023-10-05 10:33:44 +00:00
Simon Pilgrim
baecc9e997 [CostModel][X86] getShuffleCost - add fallback (to half vector) for bfloat vector shuffle costs
Add initial half/bfloat broadcast shuffles test coverage (more to follow)

Fixes #68117 - which was stuck in a loop between getting scalarized insert/extract costs for the shuffle and then trying to convert a bfloat insert into a shuffle again......
2023-10-05 11:12:40 +01:00
Jonas Hahnfeld
abb9eb2778 [Lex] Handle repl_input_end in Preprocessor::LexTokensUntilEOF()
This fixes many unit tests when trying to enable IncrementalExtensions
by default for testing purposes.

Differential Revision: https://reviews.llvm.org/D158415
2023-10-05 12:09:14 +02:00
Mats Petersson
6180964a01
[flang]Pass to add vscale range attribute (#68103)
Add vscale range attribute for the Scalable Vector Extension (SVE) if
provided on the command-line (options in a previous commit)

If no command-line option is provided, but the target-feature of SVE is
specified and the architecture is AArch64, it defaults to 128-2048, in
other words a vscale-min of 1 and a vscale-max of 16.

A pass is used to add the attribute to all functions. The vectorizer will
use this attribute to generate the SVE instruction to match the range
specified. The attribute is harmless if there are no vectorizable
operations in the function.
2023-10-05 11:06:00 +01:00
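The arithmetic behind the default in the entry above (128-2048 bits mapping to a vscale range of 1 to 16) can be sketched as follows. This assumes the usual SVE convention that vscale counts 128-bit granules; the names `kSveGranuleBits` and `vscaleRange` are made up for this illustration and are not taken from flang.

```cpp
#include <utility>

// Assumption: one unit of vscale corresponds to a 128-bit SVE granule.
constexpr unsigned kSveGranuleBits = 128;

// Hypothetical helper: converts a vector-length range in bits into a
// (vscale-min, vscale-max) pair.
constexpr std::pair<unsigned, unsigned> vscaleRange(unsigned minBits,
                                                    unsigned maxBits) {
  return {minBits / kSveGranuleBits, maxBits / kSveGranuleBits};
}
```

Under that assumption, `vscaleRange(128, 2048)` yields the pair (1, 16) mentioned in the commit message.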
long.chen
5979e1dfb1
[mlir] Fix empty-tensor-elimination around self-copies (#68129)
* Fixes #67977, a crash in `empty-tensor-elimination`.
* Also improves `linalg.copy` canonicalization.
* Also improves indentation in `mlir-linalg-ods-yaml-gen.cpp`.
2023-10-05 12:04:20 +02:00
Michael Buch
3a35ca01fc
[lldb][DWARFASTParserClang][NFCI] Extract DW_AT_data_member_location calculation logic (#68231)
Currently this non-trivial calculation is repeated multiple times,
making it hard to reason about when the
`byte_offset`/`member_byte_offset` is being set or not.

This patch simply moves all those instances of the same calculation into
a helper function.

We return an optional to remain an NFC patch. Default initializing the
offset would make sense but requires further analysis and can be done in
a follow-up patch.
2023-10-05 10:49:42 +01:00
cor3ntin
c72d3a0966
[Clang] Handle consteval expression in array bounds expressions (#66222)
The bounds of a c++ array is a _constant-expression_. And in C++ it is
also a constant expression.

But we also support VLAs, i.e. arrays with non-constant bounds.

We need to take care to handle the case of consteval functions (which
are specified to be only immediately called in non-constant contexts)
that appear in array bounds.

This introduces `Sema::isAlwayConstantEvaluatedContext`, and a flag in
ExpressionEvaluationContextRecord, such that immediate functions in
array bounds are always immediately invoked.

Sema had both `isConstantEvaluatedContext` and
`isConstantEvaluated`, so I took the opportunity to clean that up.

The change in `TimeProfilerTest.cpp` is an unfortunate manifestation of
the problem that #66203 seeks to address.

Fixes #65520
2023-10-05 11:36:27 +02:00
Christian Sigg
c64a098ee4
[GVN] Fix after 46aac949bc
replaceUsersOf -> removeUsersOf
2023-10-05 11:31:35 +02:00
tdanyluk
a608830807
[mlir] Speed up FuncToLLVM using a SymbolTable (#68082)
We have a project where this saves 23% of the compilation time.

This means using hashmaps instead of searching in linked lists.
2023-10-05 11:24:52 +02:00
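The speed-up described in the entry above comes from replacing repeated linear searches with a hash lookup. The sketch below shows that general idea in plain C++ and does not use the MLIR `SymbolTable` API; `Func`, `findLinear`, `buildIndex` and `findIndexed` are hypothetical names used only here.

```cpp
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a function-like symbol.
struct Func {
  std::string name;
};

// Linear scan over a linked list: O(n) per lookup.
inline const Func *findLinear(const std::list<Func> &funcs,
                              const std::string &name) {
  for (const Func &f : funcs)
    if (f.name == name)
      return &f;
  return nullptr;
}

using SymbolIndex = std::unordered_map<std::string, const Func *>;

// Build the index once; pointers stay valid while the list is unchanged.
inline SymbolIndex buildIndex(const std::list<Func> &funcs) {
  SymbolIndex index;
  for (const Func &f : funcs)
    index.emplace(f.name, &f);
  return index;
}

// Hash lookup: O(1) on average per lookup.
inline const Func *findIndexed(const SymbolIndex &index,
                               const std::string &name) {
  auto it = index.find(name);
  return it == index.end() ? nullptr : it->second;
}
```

Building the index costs one pass over the list, so it pays off when many lookups follow.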
Rin
d3e4702c0f
[AArch64] [LoopVectorize] Use either fixed-width or scalable VF when tail-folding (#67543)
Since the getMaximisedVFForTarget function is called twice, once for fixed-width and once for scalable, it adds no value to always return a fixed-width VF. Instead, when we are tail-folding, we can use either fixed-width or scalable vectors.
2023-10-05 10:24:30 +01:00
Nikita Popov
46aac949bc [GVN] Remove users from ICF when RAUWing loads
When performing store to load forwarding, replacing users of the
load may turn an indirect call into one with a known callee, in
which case it might become willreturn, invalidating cached ICF
information. Avoid this by removing users.

This is a bit more aggressive than strictly necessary (e.g. this
shouldn't be necessary when doing load-load CSE), but better safe
than sorry.

Fixes https://github.com/llvm/llvm-project/issues/48805.
2023-10-05 11:21:33 +02:00
Christian Sigg
59e75b7df2
[mlir][bazel] Sort targets list. 2023-10-05 11:14:12 +02:00
Christian Sigg
2f1c78014f
[mlir][bazel] Fix after d20fbc9007 2023-10-05 11:12:55 +02:00
Guray Ozen
29b33e8397 [bazel] fix typo 2023-10-05 11:08:46 +02:00
Jonas Hahnfeld
3116d60494 [Lex] Introduce Preprocessor::LexTokensUntilEOF()
This new method repeatedly calls Lex() until end of file is reached
and optionally fills a std::vector of Tokens. Use it in Clang's unit
tests to avoid quite some code duplication.

Differential Revision: https://reviews.llvm.org/D158413
2023-10-05 11:04:07 +02:00
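The loop described in the entry above (call `Lex()` until end of file, optionally collecting tokens) might look roughly like the toy below. This is not Clang's `Preprocessor`: `ToyToken`, `ToyLexer` and the whitespace-splitting `lex()` are assumptions made up for this sketch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token type: whitespace-separated text plus an EOF marker.
struct ToyToken {
  std::string text;
  bool isEOF = false;
};

class ToyLexer {
  std::istringstream in;

public:
  explicit ToyLexer(const std::string &source) : in(source) {}

  // Reads the next token; sets isEOF when the input is exhausted.
  void lex(ToyToken &tok) {
    tok.text.clear();
    tok.isEOF = !(in >> tok.text);
  }

  // Lexes until EOF, storing tokens only if the caller passed a vector.
  void lexTokensUntilEOF(std::vector<ToyToken> *tokens = nullptr) {
    ToyToken tok;
    for (lex(tok); !tok.isEOF; lex(tok))
      if (tokens)
        tokens->push_back(tok);
  }
};
```

Passing a null pointer simply drains the input, which is the case where only the side effects of lexing matter.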
Job Noorman
7fa33773e3
[BOLT][RISCV] Handle long tail calls (#67098)
Long tail calls use the following instruction sequence on RISC-V:

```
1: auipc xi, %pcrel_hi(sym)
jalr zero, %pcrel_lo(1b)(xi)
```

Since the second instruction in isolation looks like an indirect branch,
this confused BOLT and most functions containing a long tail call got
marked with "unknown control flow" and didn't get optimized as a
consequence.

This patch fixes this by detecting the long tail call sequence in
`analyzeIndirectBranch`. `FixRISCVCallsPass` also had to be updated to
expand long tail calls to `PseudoTAIL` instead of `PseudoCALL`.

Besides this, this patch also fixes a minor issue with compressed tail
calls (`c.jr`) not being detected.

Note that I had to change `BinaryFunction::postProcessIndirectBranches`
slightly: the documentation of `MCPlusBuilder::analyzeIndirectBranch`
mentions that the [`Begin`, `End`) range contains the instructions
immediately preceding `Instruction`. However, in
`postProcessIndirectBranches`, *all* the instructions in the BB were
passed in the range. This made it difficult to find the preceding
instruction so I made sure *only* the preceding instructions are passed.
2023-10-05 08:55:30 +00:00
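The `auipc`/`jalr` pairing from the entry above can be recognised from the raw 32-bit encodings, as in the sketch below. BOLT itself works on `MCInst` objects inside `analyzeIndirectBranch`, so this is only an illustration of the pattern under that simplifying assumption; all function names here are hypothetical.

```cpp
#include <cstdint>

// RISC-V base opcodes (bits [6:0] of a 32-bit instruction).
constexpr std::uint32_t kOpcodeAuipc = 0x17;
constexpr std::uint32_t kOpcodeJalr = 0x67;

// Field extractors for the standard 32-bit instruction layout.
constexpr std::uint32_t opcode(std::uint32_t insn) { return insn & 0x7f; }
constexpr std::uint32_t rd(std::uint32_t insn) { return (insn >> 7) & 0x1f; }
constexpr std::uint32_t funct3(std::uint32_t insn) { return (insn >> 12) & 0x7; }
constexpr std::uint32_t rs1(std::uint32_t insn) { return (insn >> 15) & 0x1f; }

// True when `first` is `auipc xi, ...` (xi != x0) and `second` is
// `jalr zero, ...(xi)`: a jump through xi that discards the return address.
constexpr bool isLongTailCall(std::uint32_t first, std::uint32_t second) {
  return opcode(first) == kOpcodeAuipc && rd(first) != 0 &&
         opcode(second) == kOpcodeJalr && funct3(second) == 0 &&
         rd(second) == 0 && rs1(second) == rd(first);
}
```

A `jalr` that writes `x1` instead of `zero` is an ordinary long call, which is why the check on the destination register matters.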
Guray Ozen
d20fbc9007
[MLIR][NVGPU] Introduce nvgpu.warpgroup.mma.store Op for Hopper GPUs (#65441)
This PR introduces a new Op called `warpgroup.mma.store` to the NVGPU
dialect of MLIR. The purpose of this operation is to facilitate storing
the fragmented result(s) `nvgpu.warpgroup.accumulator` produced by
`warpgroup.mma` to the given memref.

An example of a fragmented matrix is given here:

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#wgmma-64n16-d

The `warpgroup.mma.store` Op does the following:
1) Takes one or more `nvgpu.warpgroup.accumulator` type (fragmented
results matrix)
2) Calculates indexes per thread in the warp-group and stores the data into
the given memref.

Here's an example usage:
```
// A warpgroup performs GEMM, results in fragmented matrix
%result1, %result2 = nvgpu.warpgroup.mma ...

// Stores the fragmented result to memref
nvgpu.warpgroup.mma.store [%result1, %result2], %matrixD : 
    !nvgpu.warpgroup.accumulator< fragmented = vector<64x128xf32>>,
    !nvgpu.warpgroup.accumulator< fragmented = vector<64x128xf32>> 
    to memref<128x128xf32,3>
```
2023-10-05 10:54:13 +02:00
Job Noorman
c7d6d62252
[BOLT][RISCV] Implement TLS le/ie relocations (#67112)
Handle the following relocations related to TLS local-exec and
initial-exec:
- R_RISCV_TLS_GOT_HI20
- R_RISCV_TPREL_HI20
- R_RISCV_TPREL_ADD
- R_RISCV_TPREL_LO12_I
- R_RISCV_TPREL_LO12_S

In addition, GNU ld has a quirk where after TLS le relaxation, two
unofficial relocation types may be emitted:
- R_RISCV_TPREL_I
- R_RISCV_TPREL_S

Since they are unofficial (defined in the reserved range of relocation
types), LLVM does not define them. Hence, I've defined them locally in
BOLT in a private namespace.
2023-10-05 08:53:51 +00:00
Martin Storsjö
7c5e4e5fa3
Reapply [compiler-rt] Check for and use -lunwind when linking with -nodefaultlibs (#66584)
If libc++ is available and should be used as the ubsan C++ ABI library,
the check for libc++ might fail if libc++ is a static library, as the
-nodefaultlibs flag inhibits a potential compiler default -lunwind.

Just like the -nodefaultlibs configuration tests for and manually adds a
bunch of compiler default libraries, look for -lunwind too.

This is a reland of #65912.
2023-10-05 11:41:11 +03:00
Jonas Hahnfeld
26bb22b0c8 Revert "InstCombine: Introduce SimplifyDemandedUseFPClass"
It causes a test failure of clang/test/Headers/__clang_hip_math.hip:
https://lab.llvm.org/buildbot/#/builders/109/builds/75022

This reverts commit 59c6e2e9c1.
2023-10-05 10:26:10 +02:00
Owen Pan
8902f12e61 [clang-format][doc] Update the Linux kernel coding style URL 2023-10-05 01:18:49 -07:00
cor3ntin
49666ec038
[Clang] Fix constant evaluating a captured variable in a lambda (#68090)
with an explicit parameter.

We tried to read a pointer to a non-existent `This` APValue when
constant-evaluating an explicit object lambda call operator (the `this`
pointer is never set in explicit object member functions)

Fixes #68070
2023-10-05 10:17:50 +02:00
Guray Ozen
b74cfc139a
[mlir][nvgpu] Improve nvgpu->nvvm transformation of warpgroup.mma Op (NFC) (#67325)
This PR introduces substantial improvements to the readability and
maintainability of the `nvgpu.warpgroup.mma` Op transformation from
nvgpu->nvvm. This transformation plays a crucial role in GEMM and
manages complex operations such as generating multiple wgmma ops and
iterating their descriptors. The prior code lacked clarity, but this PR
addresses that issue effectively.

**This PR does the following:**
**Introduces a helper class:** `WarpgroupGemm` class encapsulates the
necessary functionality, making the code cleaner and more
understandable.

**Detailed Documentation:** Each function within the helper class is
thoroughly documented to provide clear insights into its purpose and
functionality.
2023-10-05 10:16:59 +02:00
Guray Ozen
7eb2b99f16
[mlir] Change the class name of the GenerateWarpgroupDescriptor (#68286) 2023-10-05 10:15:40 +02:00
Nikita Popov
c263639134 [InstSimplify] Add missing const qualifier (NFC)
The context instruction is a "const Instruction *", so that's
what getWithInstruction() should accept.
2023-10-05 10:05:16 +02:00
Nikita Popov
ba149f6e09 [ValueTracking] Add SimplifyQuery ctor without TLI (NFC)
While we pretty much always want to pass DT, AC and CxtI, most
places don't care about TLI. Add an overload where this is not
one of the first parameters.
2023-10-05 09:55:00 +02:00
Timm Bäder
57147bb253 [clang][Interp] Support LambdaThisCaptures
Differential Revision: https://reviews.llvm.org/D154262
2023-10-05 09:46:15 +02:00
Nicolas Vasilache
cc2d9515d0 [mlir][Transform] NFC - Fix missing field in copy constructor 2023-10-05 07:40:35 +00:00
Timm Bäder
4d7f4a7c82 [clang][Interp] Only lazily visit constant globals
Differential Revision: https://reviews.llvm.org/D158516
2023-10-05 09:37:37 +02:00
Yusra Syeda
5c4d35d8cf
[SystemZ][z/OS] Update lowerCall (#68259)
This PR moves some calculation out of `LowerCall` and into
`SystemZXPLINKFrameLowering::processFunctionBeforeFrameFinalized`.
We need to make this change because LowerCall isn't invoked for
functions that don't have function calls, and it is required for some
tooling to work correctly. A function that does not make any calls is
required to allocate 32 bytes for the parameter area required by the
ABI. However, we allocate 64 bytes because this additional space is
utilized by certain tools, like the debugger.

Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
2023-10-05 10:32:57 +03:00