473101 Commits

Author SHA1 Message Date
Pravin Jagtap
edb9fab390 [AMDGPU] Support FMin/FMax in AMDGPUAtomicOptimizer.
Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D157388
2023-08-30 12:11:11 -04:00
Pravin Jagtap
6ef6c954c6 [AMDGPU] Reorder atomic optimizer to avoid CAS loop.
Expand-Atomic pass emits the CAS loop for FP operations
which limits the optimizations offered by atomic optimizer.

Moving atomic optimizer before expand-atomics allows
better codegen.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D157265
2023-08-30 12:05:21 -04:00
Pravin Jagtap
f09360d20d [AMDGPU] Support FAdd/FSub global atomics in AMDGPUAtomicOptimizer.
Reduction and Scan are implemented using the `Iterative`
and `DPP` strategies for the `float` type.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D156301
2023-08-30 11:57:48 -04:00
Ellis Hoag
2bfb41426b [GlobPattern] Fix build error
This fixes a build error introduced by https://reviews.llvm.org/D153587
2023-08-30 08:55:44 -07:00
Matt Arsenault
ddb3f12c42 InstSimplify: Start cleaning up simplifyFCmpInst
Also picks up a few improvements (Some of the fcmp.ll
test names imply they aren't quite testing what was intended.
Checking the sign bit can't be performed with a compare to a 0).

Much of the logic in here is the same as the class detection
logic of fcmpToClassTest. We could unify more with a weaker
version of fcmpToClassTest which returns implied classes rather
than exact class-like compares. Also could unify more with detection
of possible classes in non-splat vectors.

One problem here is that some folds that used to always
work now require a context instruction. This is
because fcmpToClassTest requires the parent function.
Either fcmpToClassTest could tolerate a missing context
function, or we could require passing in one to simplifyFCmpInst.
Without this it's possible to hit the !isNan assert (which feels like
an unnecessary assert). In any case, these cases don't appear in
any tests.

https://reviews.llvm.org/D151887
2023-08-30 11:53:05 -04:00
Matt Arsenault
6012fed6f5 AMDGPU: Fix sqrt fast math flags spreading to fdiv fast math flags
This was working around the lack of operator| on FastMathFlags. We
have that now which revealed the bug.
2023-08-30 11:53:05 -04:00
Matthew Voss
2263dfe368 [test] Correct PS5 triple in clang :: Driver/unified-lto.c 2023-08-30 08:45:16 -07:00
Mark de Wever
8930d04d55 [libc++][format] Fixes out of bounds access.
Fixes https://llvm.org/PR65011

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D158940
2023-08-30 17:40:58 +02:00
Mikhail Goncharov
c2647ed9b9 fix unused variables in condition warning
for 92023b1509
2023-08-30 17:39:54 +02:00
Florian Hahn
e544d9cc36 [VPlan] Remove unused VPBuilder::insert member (NFC). 2023-08-30 16:35:55 +01:00
Ellis Hoag
8daace8b2d [GlobPattern] Support brace expansions
Extend `GlobPattern` to support brace expansions, e.g., `foo.{c,cpp}` as discussed in https://reviews.llvm.org/D152762#4425203.

The high level change was to turn `Tokens` into a list that gets larger when we see a new brace expansion term. Then in `GlobPattern::match()` we must check against each token group.

This is a breaking change since `{` will no longer match a literal without escaping. However, `\{` will match the literal `{` before and after this change. Also, from a brief survey of LLVM, it seems that `GlobPattern` is mostly used for symbol and path matching, which likely won't need `{` in their patterns.

See https://github.com/devongovett/glob-match#syntax for a nice glob reference.
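
The brace-expansion idea described above can be sketched outside of LLVM. This is a minimal Python illustration, not the `GlobPattern` implementation: the helper names `expand_braces` and `glob_match` are hypothetical, nested braces are assumed to be unsupported, and Python's `fnmatch` stands in for the per-token-group matching (it does not interpret `\` as an escape, so `glob_match` is only meaningful for unescaped patterns).

```python
from fnmatch import fnmatchcase
from itertools import product


def expand_braces(pattern):
    """Expand every unescaped {a,b,...} group into a list of plain patterns."""
    parts = []      # alternating literal runs and lists of alternatives
    literal = ""
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Keep an escaped character (e.g. \{) verbatim; it is not a group.
            literal += pattern[i:i + 2]
            i += 2
        elif ch == "{":
            end = pattern.find("}", i + 1)
            if end == -1:
                raise ValueError("unmatched '{' in pattern")
            parts.append([literal])
            parts.append(pattern[i + 1:end].split(","))
            literal = ""
            i = end + 1
        else:
            literal += ch
            i += 1
    parts.append([literal])
    # One output pattern per combination of alternatives.
    return ["".join(combo) for combo in product(*parts)]


def glob_match(pattern, text):
    """A text matches if it matches any of the expanded patterns."""
    return any(fnmatchcase(text, p) for p in expand_braces(pattern))
```

With this sketch, `expand_braces("foo.{c,cpp}")` yields the two patterns `foo.c` and `foo.cpp`, and a candidate string is checked against each in turn, which mirrors the "check against each token group" step in the description.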

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D153587
2023-08-30 08:30:39 -07:00
Mikhail R. Gadelha
abacab6e0c [libc][riscv] Added support for rv32 in setjmp and longjmp
This patch adds two new macros to setjmp (STORE, STORE_FP) and two new
macros to longjmp (LOAD, LOAD_FP) that take a register and a buffer, then
select the correct asm instruction for rv32 and rv64.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D158640
2023-08-30 12:21:46 -03:00
Amara Emerson
c95ed6e492 [GlobalISel] Try to commute G_CONSTANT_FOLD_BARRIER LHS operands to RHS.
Differential Revision: https://reviews.llvm.org/D159097
2023-08-30 08:07:22 -07:00
Florian Hahn
cd9563ae17 [VPlan] Remove unused VPInstruction::clone member (NFC). 2023-08-30 15:53:39 +01:00
Cyndy Ishida
1a0d6992ae [llvm][ReadTAPI] Add & fix rpath comparison checks
* Check and emit out differences in rpath inputs
* Prevent rpaths from being overwritten
* Capture file path for tbd-v5
2023-08-30 07:42:52 -07:00
Matt Arsenault
bfe6bc05cd AMDGPU: Cleanup check for integral exponents in pow folds
Also improves undef handling

https://reviews.llvm.org/D159006
2023-08-30 10:37:24 -04:00
Mikhail R. Gadelha
b0272d8ec3 [libc] Fix set_thread_ptr call in rv32 start up code
This patch changes the instruction in set_thread_ptr from ld to mv,
as rv32 doesn't have the ld instruction, and mv is supported by both
rv32 and rv64.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D159110
2023-08-30 11:30:56 -03:00
Marius Brehler
36c9afc7a8 [mlir][emitc][nfc] List supported comparisons
Reviewed By: simon-camp

Differential Revision: https://reviews.llvm.org/D159195
2023-08-30 14:29:02 +00:00
Jie Fu
910b9372d1 [flang] Function 'attributeTypeIsCompatible' should be debug only (NFC)
/data/home/jiefu/llvm-project/flang/lib/Optimizer/CodeGen/CodeGen.cpp:2905:20: error: unused function 'attributeTypeIsCompatible' [-Werror,-Wunused-function]
static inline bool attributeTypeIsCompatible(mlir::MLIRContext *ctx,
                   ^
1 error generated.
2023-08-30 22:28:35 +08:00
Mikhail R. Gadelha
9e34454519 [libc] Fix test case that expects time_t to be a 32-bit type
This patch changes a test case that tests for overflow when time_t is
32 bits long. However, it was checking size_t instead of time_t.

This is on par with other test cases that correctly check the size of
time_t (asctime_test.cpp, gmtime_r_test.cpp and gmtime_test.cpp).

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D159113
2023-08-30 11:24:19 -03:00
Luke Lau
3a4ad45a2c [DAGCombiner] Combine trunc (splat_vector x) -> splat_vector (trunc x)
From the discussion in https://reviews.llvm.org/D158853, moving the truncate
into the splat helps more splatted scalar operands get selected on RISC-V, and
also avoids the need for splat_vector_parts on RV32.
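
The equivalence behind this combine can be checked with a small model. The following Python sketch is only an illustration of why `trunc (splat_vector x)` and `splat_vector (trunc x)` produce the same lanes; the helpers `splat`, `trunc` and `trunc_vec` are hypothetical stand-ins, not SelectionDAG APIs.

```python
def splat(x, num_lanes):
    """Model splat_vector: every lane holds the same scalar."""
    return [x] * num_lanes


def trunc(x, bits):
    """Model an integer truncate: keep only the low `bits` bits."""
    return x & ((1 << bits) - 1)


def trunc_vec(vec, bits):
    """Model a vector truncate: truncate each lane independently."""
    return [trunc(lane, bits) for lane in vec]


def combine_is_sound(x, num_lanes, bits):
    """trunc(splat(x)) must equal splat(trunc(x)) lane for lane."""
    return trunc_vec(splat(x, num_lanes), bits) == splat(trunc(x, bits), num_lanes)
```

Because truncation acts on each lane independently and all lanes are identical, truncating the scalar once before splatting gives the same result, for example `combine_is_sound(0x12345678, 4, 16)`.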

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D159147
2023-08-30 15:22:57 +01:00
Matt Arsenault
aa539b128f AMDGPU: Add baseline tests for libcall recognition of pow/powr/pown 2023-08-30 10:10:03 -04:00
Leandro Lupori
c8517f1752 [flang] Add support for dense complex constants
Add support for representing complex array constants with MLIR
dense attribute. This improves compile time and greatly reduces
memory usage of programs with large complex array constants.

Fixes https://github.com/llvm/llvm-project/issues/63610

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D155951
2023-08-30 10:51:02 -03:00
Benjamin Maxwell
296d5cb60c [mlir][BuiltinTypes] Return VectorType from VectorType::Builder conversion operator
0-D vectors are now supported, so the special case of returning just
the element type can now be removed.

A few callers that relied on the old behaviour have been updated.

Reviewed By: awarzynski, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D159122
2023-08-30 13:47:06 +00:00
Andrzej Warzynski
715cde0a25 [flang][nfc] Refine how output is defined in a test
Updates optimization-remark.f90. Makes sure that every RUN line:
* discards the actual output of the compilation (we only care about the
  optimisation remarks),
* re-uses the same definition of the output (better code re-use),
* doesn't generate object files - no need to use `-c` if `-emit-llvm` is
  sufficient.

Differential Revision: https://reviews.llvm.org/D158951
2023-08-30 13:37:28 +00:00
Matthias Springer
8dd8c4adba [mlir][Transforms] Inliner: Extra checks for unstructured control flow
Do not inline IR with multiple blocks into ops that may not support unstructured control flow.

This fixes #64978.

Differential Revision: https://reviews.llvm.org/D159072
2023-08-30 15:28:29 +02:00
Matthias Springer
0ac21e654f [mlir][IR] SingleBlockImplicitTerminator: Declare "inherited" trait in ODS instead of C++
`SingleBlockImplicitTerminator` is now a combination of two traits: `SingleBlock` and `SingleBlockImplicitTerminatorImpl` (the original `SingleBlockImplicitTerminator`).

This change makes it possible to check if the `SingleBlock` op trait is implemented. Until now, `Operation::hasTrait<OpTrait::SingleBlock>()` returned `false` for ops that implement `SingleBlockImplicitTerminator`.

Differential Revision: https://reviews.llvm.org/D159078
2023-08-30 15:26:47 +02:00
Louis Dionne
eb27be95a4 [clang] Add a Windows build in the Clang pre-commit CI
This patch adds a CI job for Clang on Windows that is separate from
the monolithic job that gets added automatically via the Phabricator
integration with Buildkite. This way, we will retain the Windows testing
for Clang when we move to GitHub Pull Requests.

Differential Revision: https://reviews.llvm.org/D158995
2023-08-30 09:15:55 -04:00
Sander de Smalen
0a32a999ae [AArch64][SME] NFC: Rename hasNewZAInterface to hasNewZABody.
__arm_new_za is a declaration attribute, not a type attribute,
and is therefore not part of the interface of a function.
2023-08-30 13:14:42 +00:00
Tobias Gysi
5f230ed762 [mlir][llvm] Translate alias scopes lazily
Change the LLVM dialect to LLVM IR translation to convert the alias
scope attributes lazily to LLVM IR metadata. Previously, the alias
scopes were translated upfront by walking the alias scopes of
operations that implement the AliasAnalysisOpInterface. As a result,
the translation of a module that contains only a noalias scope
intrinsic failed, since its alias scope attribute was not
translated due to the intrinsic not implementing
AliasAnalysisOpInterface.

Reviewed By: zero9178

Differential Revision: https://reviews.llvm.org/D159187
2023-08-30 12:59:48 +00:00
Simon Pilgrim
376050db9f [DAG] Move some unary constant folds from getNode() to FoldConstantArithmetic()
We need to clean up some type handling before the remainder (int<->fp and bitcasts) can be moved over.
2023-08-30 13:59:28 +01:00
Yuxuan Shui
9a4b3fdb82 [lldb][windows] _wsopen_s does not accept bits other than _S_IREAD | _S_IWRITE
When sending a file from a Linux host to a Windows remote, the Linux host tries to copy the source file's permission bits, which include the `_S_I?GRP` and `_S_I?OTH` bits. Those bits are rejected by `_wsopen_s`, causing it to return EINVAL.

This patch masks out the rejected bits.
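
The masking step can be sketched as follows. This is a Python illustration of the idea, not the lldb patch itself: the constants reproduce the MSVC CRT values of `_S_IREAD` and `_S_IWRITE` (stated here as an assumption), and `mask_mode_for_wsopen` is a hypothetical helper name.

```python
# Assumed to match the MSVC CRT definitions of _S_IREAD and _S_IWRITE.
S_IREAD = 0x0100   # owner read, same bit as POSIX S_IRUSR (0o400)
S_IWRITE = 0x0080  # owner write, same bit as POSIX S_IWUSR (0o200)


def mask_mode_for_wsopen(mode):
    """Drop every permission bit other than owner read/write.

    Group and other bits copied from a POSIX host would otherwise be
    passed through and rejected.
    """
    return mode & (S_IREAD | S_IWRITE)
```

For a typical POSIX mode such as `0o644`, only the owner read and write bits survive the mask.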

GitHub issue: #64313

Reviewed By: jasonmolenda, DavidSpickett

Differential Revision: https://reviews.llvm.org/D156817
2023-08-30 15:53:31 +03:00
Tue Ly
76bb278ebb [libc][math] Implement double precision exp10 function correctly rounded for all rounding modes.
Implement double precision exp10 function correctly rounded for all
rounding modes.  Using the same algorithm as double precision exp
(https://reviews.llvm.org/D158551) and exp2 (https://reviews.llvm.org/D158812)
functions.

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D159143
2023-08-30 08:43:50 -04:00
Mikhail Goncharov
74f4daef04 fix unused variable warnings in conditionals
for 92023b1509
2023-08-30 14:36:42 +02:00
Ramkumar Ramachandra
04b1276ad3 LoopVectorize/iv-select-cmp: add tests for truncated IV
The current tests in iv-select-cmp.ll are not representative of clang
output of common real-world C programs, which are often written with i32
induction vars, as opposed to i64 induction vars. Hence, add five tests
corresponding to the following programs:

  int test(int *a, int n) {
    int rdx = 331;
    for (int i = 0; i < n; i++) {
      if (a[i] > 3)
        rdx = i;
    }
    return rdx;
  }

  int test(int *a) {
    int rdx = 331;
    for (int i = 0; i < 20000; i++) {
      if (a[i] > 3)
        rdx = i;
    }
    return rdx;
  }

  int test(int *a, long n) {
    int rdx = 331;
    for (int i = 0; i < n; i++) {
      if (a[i] > 3)
        rdx = i;
    }
    return rdx;
  }

  int test(int *a, unsigned n) {
    int rdx = 331;
    for (int i = 0; i < n; i++) {
      if (a[i] > 3)
        rdx = i;
    }
    return rdx;
  }

  int test(int *a) {
    int rdx = 331;
    for (long i = INT_MIN - 1; i < UINT_MAX; i++) {
      if (a[i] > 3)
        rdx = i;
    }
    return rdx;
  }

The first two can theoretically be vectorized without a runtime-check,
while the third and fourth cannot. The fifth cannot be vectorized, even
with a runtime-check.

This issue was found while reviewing D150851.

Differential Revision: https://reviews.llvm.org/D156124
2023-08-30 13:09:37 +01:00
Egor Zhdan
9939556625 [APINotes] Initial support for C++ namespaces
This upstreams a part of the C++ namespaces support in Clang API Notes.

The complete patch was recently merged downstream in the Apple fork: https://github.com/apple/llvm-project/pull/7230.

This patch only adds the parts of the namespace support that can be cleanly applied on top of the API Notes infrastructure that was upstreamed previously.

Differential Revision: https://reviews.llvm.org/D159092
2023-08-30 12:54:42 +01:00
Joseph Huber
ccb1d183c3 [OpenMP][Docs] Remove old entry saying static libraries are unsupported
Summary:
Static libraries have been supported since LLVM 15.0, this entry is
misleading and should be removed.
2023-08-30 06:48:57 -05:00
Serge Pavlov
1792852f86 [symbolizer] Change reaction on invalid input
If llvm-symbolizer finds a malformed command, it echoes it to the
standard output. New versions of binutils (starting from 2.39) allow
specifying an address by a symbol. The implementation of this feature in
llvm-symbolizer makes the current reaction to invalid input
inappropriate. Almost any invalid command may be treated as a symbol
name, so the right reaction should be "symbol not found" in such a case.

The exceptions are commands that are recognized but have incorrect
syntax, like "FILE:FILE:". The utility must produce a descriptive
diagnostic for such input and route it to stderr.

This change implements the new reaction on invalid input and is a
prerequisite for implementation of symbol lookup in llvm-symbolizer.

Differential Revision: https://reviews.llvm.org/D157210
2023-08-30 17:54:37 +07:00
OverMighty
38c92c1ee2 [AArch64] Add patterns for FMADD, FMSUB
FMADD, FMSUB instructions perform better or the same compared to indexed
FMLA, FMLS.

For example, the Arm Cortex-A55 Software Optimization Guide lists "FP
multiply accumulate" FMADD, FMSUB instructions with a throughput of 2
IPC, whereas it lists "ASIMD FP multiply accumulate, by element" FMLA,
FMLS with a throughput of 1 IPC.

The Arm Cortex-A77 Software Optimization Guide, however, does not
separately list "by element" variants of the "ASIMD FP multiply
accumulate" instructions, which are listed with the same throughput of 2
IPC as "FP multiply accumulate" instructions.

Reviewed By: samtebbs, dzhidzhoev

Differential Revision: https://reviews.llvm.org/D158008
2023-08-30 12:39:04 +02:00
Georgi Mirazchiyski
0563725600 [NFC][AMDGPU] Guard the custom fixups kind array from invalid access
Protect from accidental passing of an invalid MCFixupKind value which
can cause an out-of-bounds access in the array.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D158725
2023-08-30 11:12:50 +01:00
Luke Lau
976244bb84 [RISCV] Canonicalize vrot{l,r} to vrev8 when lowering shuffle as rotate
A rotate of 8 bits of an e16 vector in either direction is equivalent to a
byteswap, i.e. vrev8. There is a generic combine on ISD::ROT{L,R} to
canonicalize these rotations to byteswaps, but on fixed vectors they are
legalized before they have the chance to be combined. This patch teaches the
rotate vector_shuffle lowering to emit these rotations as byteswaps to match
the scalable vector behaviour.
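
The claim that an 8-bit rotate of a 16-bit element equals a byteswap in either direction is easy to verify exhaustively. The Python sketch below models the scalar operations only; `rotl16`, `rotr16` and `bswap16` are hypothetical helper names, not LLVM or RISC-V identifiers.

```python
def rotl16(x, k):
    """Rotate a 16-bit value left by k bits."""
    k %= 16
    return ((x << k) | (x >> (16 - k))) & 0xFFFF


def rotr16(x, k):
    """Rotate a 16-bit value right by k bits."""
    return rotl16(x, 16 - (k % 16))


def bswap16(x):
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | (x >> 8)


def rotate_by_8_is_bswap():
    """Check every 16-bit value: rotl 8 == rotr 8 == byteswap."""
    return all(rotl16(x, 8) == rotr16(x, 8) == bswap16(x) for x in range(1 << 16))
```

Since the element has exactly two bytes, rotating by one byte in either direction exchanges them, which is why both rotate directions can be emitted as the same byteswap.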

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D158195
2023-08-30 11:01:49 +01:00
Luke Lau
a61c4a0ef6 [RISCV][SelectionDAG] Lower shuffles as bitrotates with vror.vi when possible
Given a shuffle mask like <3, 0, 1, 2, 7, 4, 5, 6> for v8i8, we can
reinterpret it as a shuffle of v2i32 where the two i32s are bit rotated, and
lower it as a vror.vi (if legal with zvbb enabled).
We also need to make sure that the larger element type is a valid SEW, hence
the tests for zve32x.
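
The reinterpretation of this shuffle mask as a per-lane rotate can be checked with a small model. The Python sketch below assumes little-endian byte order within each 32-bit lane and uses hypothetical helper names; under that assumption the mask corresponds to a rotate left by 8 bits (equivalently, a rotate right by 24). The immediate the lowering actually chooses is not stated here.

```python
MASK = [3, 0, 1, 2, 7, 4, 5, 6]  # the v8i8 shuffle mask from the description


def shuffle(src, mask):
    """Model vector_shuffle with one source: result[i] = src[mask[i]]."""
    return [src[i] for i in mask]


def as_i32_lanes(byte_vec):
    """Reinterpret a byte vector as little-endian 32-bit lanes (assumption)."""
    return [
        int.from_bytes(bytes(byte_vec[i:i + 4]), "little")
        for i in range(0, len(byte_vec), 4)
    ]


def rotl32(x, k):
    """Rotate a 32-bit value left by k bits."""
    k %= 32
    return ((x << k) | (x >> (32 - k))) & 0xFFFFFFFF


def shuffle_is_rotate(src):
    """The byte shuffle equals rotating each i32 lane left by 8 bits."""
    rotated = [rotl32(lane, 8) for lane in as_i32_lanes(src)]
    return as_i32_lanes(shuffle(src, MASK)) == rotated
```

For the input bytes `[0x10, 0x11, 0x12, 0x13, 0x20, 0x21, 0x22, 0x23]`, the first lane `0x13121110` becomes `0x12111013` under both the shuffle and the rotate.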

X86 already did this, so I've extracted the logic for it and put it inside
ShuffleVectorSDNode so it could be reused by RISC-V. I originally tried to add
this as a generic combine in DAGCombiner.cpp, but it ended up causing worse
codegen on X86 and PPC.

Reviewed By: reames, pengfei

Differential Revision: https://reviews.llvm.org/D157417
2023-08-30 11:01:47 +01:00
Florian Hahn
4a5bcbd560 [ConstraintElim] Store conditional facts as (Predicate, Op0, Op1).
This allows adding facts even if no corresponding ICmp instruction
exists in the IR.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D158837
2023-08-30 10:54:28 +01:00
dingfei
5b3f41c55d [analyzer][NFC] Workaround miscompilation on recent MSVC
The SVal argument 'Cond' passed in is corrupted in release mode with
exception handling enabled (resulting in an UndefinedSVal). Changing the
lambda capture inside the callee can work around this.

Known problematic VS Versions:
- VS 2022 17.4.4
- VS 2022 17.5.4
- VS 2022 17.7.2

Verified working VS Version:
- VS 2019 16.11.25

Fixes https://github.com/llvm/llvm-project/issues/62130

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D159163
2023-08-30 17:14:38 +08:00
Dinar Temirbulatov
73e3866acb [AArch64][SME] Promote mask for masked load to a similar type size with load value.
The legalizer could keep the original mask type of a masked load combined with
sign/zero extend, but we have to extend the mask to a type similar to our
combined load; otherwise instruction selection cannot lower the load.

Differential Revision: https://reviews.llvm.org/D158386
2023-08-30 08:54:46 +00:00
Daniil Kovalev
ea130a81d4 [Support/BLAKE3] Fix error when building llvm for big endian AArch64 host
BLAKE3 implementation does not support using arm neon on big-endian hosts: see
blake3_neon.c. Setting `BLAKE3_USE_NEON` to 1 by default for all AArch64
hosts broke builds for big endian hosts. This patch fixes the behavior
by introducing an additional check against `__ARM_BIG_ENDIAN` before
setting `BLAKE3_USE_NEON`.

Differential Revision: https://reviews.llvm.org/D159156
2023-08-30 11:47:12 +03:00
Sergei Barannikov
a7eaaba699 [Parser] Parse string literal arguments of 'availability', 'external_source_symbol' and 'uuid' attributes as unevaluated
This is complementary to D156237.
These attributes have custom parsing logic.

Reviewed By: cor3ntin

Differential Revision: https://reviews.llvm.org/D159024
2023-08-30 11:46:54 +03:00
Juan Manuel MARTINEZ CAAMAÑO
9b35254018 [NFC][Clang] Remove unused function CodeGenModule::addDefaultFunctionDefinitionAttributes
This patch deletes the unused `addDefaultFunctionDefinitionAttributes(llvm::Function);` function,
while it still keeps `void addDefaultFunctionDefinitionAttributes(llvm::AttrBuilder &attrs);` which is being used.

Differential Revision: https://reviews.llvm.org/D158990
2023-08-30 10:32:51 +02:00
Qiu Chaofan
21bea1a208 [PowerPC] Support initial-exec TLS relocation on AIX
Add TLS_IE relocation type to XCOFF writer, and emit code sequence for
initial-exec TLS variables.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D156292
2023-08-30 16:22:16 +08:00
Jim Lin
c1dda0f793 [AST] Remove unneeded return false from UseExcessPrecision. NFC.
Remove unneeded `return false` from UseExcessPrecision and move `break` inside.
2023-08-30 16:05:55 +08:00