Commit Graph

412134 Commits

Author SHA1 Message Date
Fangrui Song
c03fdd3403 [ELF] Fix the branch range computation when reusing a thunk
Notation: dst is `t->getThunkTargetSym()->getVA()`

On AArch64, when `src-0x8000000-r_addend <= dst < src-0x8000000`, the condition
`target->inBranchRange(rel.type, src, rel.sym->getVA(rel.addend))` may
incorrectly consider a thunk reusable.
`rel.addend = -getPCBias(rel.type)` resets the addend to 0 for AArch64/PPC
and the zero addend is used by `rel.sym->getVA(rel.addend)` to check
out-of-range relocations.

See the test for a case this computation is wrong:
`error: a.o:(.text_high+0x4): relocation R_AARCH64_JUMP26 out of range: -134217732 is not in [-134217728, 134217727]`
I have seen a real world case with r_addend=19960.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D117734
2022-01-24 09:03:21 -08:00
Simon Pilgrim
6997f4d07f [X86] combineSetCCMOVMSK - fold allof(cmpeq(x,y)) -> ptest(sub(x,y)) (PR53379)
As suggested on PR53379, for all-of icmp-eq patterns, we can use ptest(sub(x,y)) on SSE41+ targets

This is a generalization of the existing allof(cmpeq(x,0)) -> ptest(x) pattern

We can probably extend this further, in particularly to handle 256-bit cases on pre-AVX2 targets, but this part of the generalization is pretty trivial

Fixes Issue #53379
2022-01-24 16:44:37 +00:00
Sander de Smalen
699e22a083 [ISEL] Move trivial step_vector folds to FoldConstantArithmetic.
Given that step_vector is practically a constant, doing this early
helps with DAGCombine folds that happen before type legalization.

There is currently no way to test this happens earlier, although existing
tests for step_vector folds continue protect the folds happening at all.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D117863
2022-01-24 16:37:21 +00:00
Casey Carter
cfe17986c9 [libcxx][test] {move,reverse}_iterator cannot be instantiated for a type with no operator*
Since their nested reference types are defined in terms of `iter_reference_t<T>`, which examines `decltype(*declval<T>())`.

Differential Revision: https://reviews.llvm.org/D117371
2022-01-24 08:34:39 -08:00
gysit
e494278cee [mlir][linalg] Add transpose support to hoist padding.
Add a transpose option to hoist padding to transpose the padded tensor before storing it into the packed tensor. The early transpose improves the memory access patterns of the actual compute kernel. The patch introduces a transpose right after the hoisted pad tensor and a second transpose inside the compute loop. The second transpose can either be fused into the compute operation or will canonicalize away when lowering to vector instructions.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D117893
2022-01-24 16:33:05 +00:00
Craig Topper
a43ed49f5b [DAGCombiner][RISCV] Canonicalize (bswap(bitreverse(x))->bitreverse(bswap(x)).
If the bitreverse gets expanded, it will introduce a new bswap. By
putting a bswap before the bitreverse, we can ensure it gets cancelled
out when this happens.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D118012
2022-01-24 08:31:53 -08:00
Matthias Springer
b00ee46b5e [mlir][bufferize][NFC] Implement BufferizableOpInterface on bufferization ops directly
No longer go through an external model. Also put BufferizableOpInterface into the same build target as the BufferizationDialect. This allows for some code reuse between BufferizationOps canonicalizers and BufferizableOpInterface implementations.

Differential Revision: https://reviews.llvm.org/D117987
2022-01-25 01:23:26 +09:00
Craig Topper
b8c7cdcc81 [SelectionDAG][RISCV] Teach getNode to fold bswap(bswap(x))->x.
This can show up during when bitreverse is expanded to bswap and
swap of bits within a byte. If the input is already a bswap, we
should cancel them out before we further transform them in a way
that makes it harder to see the redundancy.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D118007
2022-01-24 08:17:46 -08:00
Craig Topper
cd2a9ff397 [RISCV] Select int_riscv_vsll with shift of 1 to vadd.vv.
Add might be faster than shift. We can't do this earlier without
using a Freeze instruction.

This is the intrinsic version of D106689.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118013
2022-01-24 08:04:53 -08:00
Hans Wennborg
c1335166b2 Don't run test/ClangScanDeps/modules-symlink.c on Windows
'ln -s' isn't Windows friendly.
2022-01-24 16:52:01 +01:00
Lorenzo Chelini
217570b03b [MLIR][OpenMP] Suppress -Wreturn-type warnings (NFC) 2022-01-24 16:50:56 +01:00
Matthias Springer
c30d2893a4 [mlir][bufferize] Change insertion point for ToTensorOps
Both insertion points are valid. This is to make BufferizableOpInteface-based bufferization compatible with existing partial bufferization test cases. (So less changes are necessary to unit tests.)

Differential Revision: https://reviews.llvm.org/D117986
2022-01-25 00:43:04 +09:00
Kristóf Umann
3ad35ba4de [Templight] Don't display empty strings for names of unnamed template parameters
Patch originally by oktal3000: https://github.com/mikael-s-persson/templight/pull/40

When a template parameter is unnamed, the name of -templight-dump might return
an empty string. This is fine, they are unnamed after all, but it might be more
user friendly to at least describe what entity is unnamed.

Differential Revision: https://reviews.llvm.org/D115521
2022-01-24 16:37:11 +01:00
Valentin Clement
4d53f88d1a
[flang] Add MemoryAllocation pass to the pipeline
Add the MemoryAllocation pass into the pipeline. Add
the possibilty to pass the options directly within the tool (tco).

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D117886
2022-01-24 16:32:39 +01:00
Matthias Springer
fc08d1c294 [mlir][tensor][bufferize] Support tensor.rank in BufferizableOpInterfaceImpl
This is the only op that is not supported via BufferizableOpInterfaceImpl bufferization. Once this op is supported we can switch `tensor-bufferize` over to the new unified bufferization.

Differential Revision: https://reviews.llvm.org/D117985
2022-01-25 00:31:20 +09:00
Sean Fertile
d193f7be78 [libc++][AIX] Do not assert chmod return value is non-zero.
A number of the filesystem tests create a directory that contains a bad
symlink. On AIX recursively setting permissions on said directory will
return a non-zero value because of the bad symlink, however the
following rm -r still completes successfully. Avoid the assertion on
AIX, and rely on the return value of the remove command to detect
problems.

Differential Revision: https://reviews.llvm.org/D112086
2022-01-24 10:30:05 -05:00
Denys Shabalin
2d9ed1aba2 [mlir] Fix broken __repr__ implementation in Linalg OpDSL
Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D118027
2022-01-24 15:58:35 +01:00
David Spickett
473aa8e10c [llvm][docs] Fix code-block in the testing guide
Without a langauge name it's an error (with some verisons of Sphinx
it seems) or the block is simply missing in the output.
2022-01-24 14:56:31 +00:00
Matthias Springer
49e3700069 [mlir][tensor] Move BufferizableOpInterface impl to tensor dialect
This is in preparation of unifying the existing bufferization with One-Shot bufferization.

A subsequent commit will replace `tensor-bufferize`'s implementation with the BufferizableOpInterface-based implementation and move over missing test cases.

Differential Revision: https://reviews.llvm.org/D117984
2022-01-24 23:54:49 +09:00
Matt Arsenault
18aabae8e2 AMDGPU: Fix assertion on fixed stack objects with VGPR->AGPR spills
These have negative / out of bounds frame index values and would
assert when trying to set the BitVector. Fixed stack objects can't be
colored away so ignore them.
2022-01-24 09:45:41 -05:00
Bjorn Pettersson
354b2c36ee Pre-commit test cases for (sra (load)) -> (sextload) folds. NFC
Add test case to show missing folds for (sra (load)) -> (sextload).

Differential Revision: https://reviews.llvm.org/D116929
2022-01-24 15:30:55 +01:00
Matt Arsenault
99e8e17313 Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This reverts commit a97e20a3a8.
2022-01-24 09:26:52 -05:00
Matt Arsenault
7a5b0a2934 Reapply "IR: Make getRetAlign check callee function attributes"
Reapply 3d2d208f6a, reverted in
a97e20a3a8
2022-01-24 09:26:51 -05:00
ksyx
5e5efd8a91 [clang-format] Fix SeparateDefinitionBlocks issues
- Fixes https://github.com/llvm/llvm-project/issues/53227 that wrongly
  indents multiline comments
- Fixes wrong detection of single-line opening braces when used along
  with those only opening scopes, causing crashes due to duplicated
  replacements on the same token:
    void foo()
    {
      {
        int x;
      }
    }
- Fixes wrong recognition of first line of definition when the line
  starts with block comment, causing crashes due to duplicated
  replacements on the same token for this leads toward skipping the line
  starting with inline block comment:
    /*
      Some descriptions about function
    */
    /*inline*/ void bar() {
    }
- Fixes wrong recognition of enum when used as a type name rather than
  starting definition block, causing crashes due to duplicated
  replacements on the same token since both actions for enum and for
  definition blocks were taken place:
    void foobar(const enum EnumType e) {
    }
- Change to use function keyword for JavaScript instead of comparing
  strings
- Resolves formatting conflict with options EmptyLineAfterAccessModifier
  and EmptyLineBeforeAccessModifier (prompts with --dry-run (-n) or
  --output-replacement-xml but no observable change)
- Recognize long (len>=5) uppercased name taking a single line as return
  type and fix the problem of adding newline below it, with adding new
  token type FunctionLikeOrFreestandingMacro and marking tokens in
  UnwrappedLineParser:
    void
    afunc(int x) {
      return;
    }
    TYPENAME
    func(int x, int y) {
      // ...
    }
- Remove redundant and repeated initialization
- Do no change to newlines before EOF

Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D117520
2022-01-24 14:23:20 +00:00
Paul Walker
34aedbe90d [AArch64] Regenerate CHECK lines for llvm/test/CodeGen/AArch64/sve2-int-mul.ll 2022-01-24 14:11:19 +00:00
Simon Pilgrim
0553f5e61a [X86] Add cmp-equality bool reductions
PR53379 test coverage
2022-01-24 14:05:10 +00:00
Simon Pilgrim
4436d4cd7c [X86] Rename cmp-with-zero bool reductions
Explicitly name them icmp0_* - I'm intending to add PR53379 test coverage shortly
2022-01-24 14:05:10 +00:00
Simon Pilgrim
f7079bf9ee [X86] Fix v8i8 -> v8i16 typo in bool reductions
We were supposed to be testing <8 x i16> reductions
2022-01-24 14:05:09 +00:00
serge-sans-paille
25e8f5f827 Add missing STLExtras.h include from lldb/unittests/TestingSupport/MockTildeExpressionResolver.cpp 2022-01-24 15:03:11 +01:00
Fraser Cormack
d42678b453 [RISCV] Add side-effect-free vsetvli intrinsics
This patch introduces new intrinsics that enable the use of vsetvli in
contexts where only the returned vector length is of interest. The
pre-existing intrinsics are marked with side-effects, which prevents
even trivial optimizations on/across them.

These intrinsics are intended to be used in situations where the vector
length is fed in turn to RVV intrinsics or to vector-predication
intrinsics during loop vectorization, for example. Those codegen paths
ensure that instructions are generated with their own implicit vsetvli,
so the vector length and vtype can be relied upon to be correct.

No corresponding C builtins are planned at this stage, though that is a
possibility for the future if the need arises.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117910
2022-01-24 13:52:08 +00:00
Sjoerd Meijer
ada6d78a78 [LoopFlatten] Address FIXME about getTripCountFromExitCount. NFC.
Together with the previous commit which mainly documents better LoopFlatten's
overall strategy, this addresses a concern added as a FIXME comment in D110587;
the code refactoring (NFC) introduces functions (also for the SCEV usage) to
make this clearer.
2022-01-24 13:46:19 +00:00
Sjoerd Meijer
f6ac8088b0 [LoopFlatten] Added comments about usage of various Loop APIs. NFC. 2022-01-24 13:46:19 +00:00
serge-sans-paille
a0d5e938fe Add missing include llvm/ADT/STLExtras 2022-01-24 14:41:24 +01:00
Evgeny Shulgin
589a939072 Add isConstinit matcher
Support C++20 constinit variables for AST Matchers.
2022-01-24 08:35:42 -05:00
Nathan Sidwell
6184e565ad [demangler][NFC] Refactor some parsing
There's some unnecessary code duplication in the parser.  This
refactors that and deploys boolean variables to avoid the duplication.
These also happen to help adding module demangling (with an updated
mangling scheme).

1a) The grammar requires some lookahead concerning <template-args>. We
may discover an <unscoped-name> is actually <unscoped-template-name>
<template-args>.  (When <unscoped-name> was a substitution, there must
be a following <template-args>.)  Refactor parseName to only have one
code path looking for the 'I' indicating <template-args>.

1b) While there I altered the control flow to hold the result in a
variable, rather than tail call.  Made it easier to debug (and of
course an optimizer will DTRT here anyway).

2a) An <unscoped-name> can have an St or StL prefix.  No need for
completely separate code paths handling the following unqualified-name
though.

2b) Also no need to look for both 'St' and 'StL' separately.  Look for
'St' and then conditionally swallow an 'L'.

3) We get a similar issue as #1a when parsing a typeName.  Here I just
change the control flow slightly to bring the 'break' out to the end
of the 'S' block and embed the early return inside an if.  That's more
in keeping with the code style.

4) Although NFC, there's a new testcase as that's not covered by the
existing demangler tests and is significant in the #1a case above.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D117879
2022-01-24 05:28:38 -08:00
Nathan Sidwell
897d1bb659 [demangler] write-protect non-canonical source
To try and avoid undesired changes to the non-canonical demangler
sources, change the cp-to-llvm script to (a) write-protect the target
files and (b) prepend 'do not edit' comments that are significant to
emacs[*], and hopefully humans.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D118008
2022-01-24 05:28:38 -08:00
Nathan Sidwell
38ffea9b4c [demangler] Resync demangler sources
Recent commits changed llvm/include/llvm/Demangle without also
changing libcxxabi/src/Demangle, which is the canonical source
location.  This resyncs those commits to the libcxxabi directory.

Commits:
* 1f9e18b656
* f53d359816
* 065044c443

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D117990.diff
2022-01-24 05:28:38 -08:00
Valentin Clement
853e79d8d8
[flang] Update tco tool pipline and add translation to LLVM IR
tco is a tool to test the FIR to LLVM IR pipeline of the Flang compiler.

This patch update tco pipelines and adds the translation to LLVM IR.

A simple test is added to make sure the tool is working with a simple
FIR program.
More tests will be upstream in follow up patch from the fir-dev branch.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan, awarzynski, schweitz, mehdi_amini

Differential Revision: https://reviews.llvm.org/D117781

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
2022-01-24 14:16:27 +01:00
serge-sans-paille
5f290c090a Move STLFunctionalExtras out of STLExtras
Only using that change in StringRef already decreases the number of
preoprocessed lines from 7837621 to 7776151 for LLVMSupport

Perhaps more interestingly, it shows that many files were relying on the
inclusion of StringRef.h to have the declaration from STLExtras.h. This
patch tries hard to patch relevant part of llvm-project impacted by this
hidden dependency removal.

Potential impact:
- "llvm/ADT/StringRef.h" no longer includes <memory>,
  "llvm/ADT/Optional.h" nor "llvm/ADT/STLExtras.h"

Related Discourse thread:
https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
2022-01-24 14:13:21 +01:00
Florian Hahn
b2a8eff45c
[LV] Make some tests more robust by adding missing users. 2022-01-24 13:04:09 +00:00
Simon Pilgrim
0e70dd858e [X86] Add PR46249 test case showing poorly widened select predicate mask 2022-01-24 12:59:30 +00:00
Sebastian Neubauer
f1e36474b9 [AMDGPU][NFC] Fix debug prints
Print the instructions instead of pointers.
2022-01-24 13:55:00 +01:00
Evgeniy Brevnov
b4b6d6374e [NFC] New test case for BasicAA and memcy/memmove with deopt
New test checks results of BasicAA for llvm.memcpy.*/llvm.memmove.* intrinsics in presence of deopt bundle. By specification expected result for unrelated global memory should be Ref. Currently this is not the case and will be fixed in upcoming patches.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D118031
2022-01-24 19:53:29 +07:00
Groverkss
b754d09fde [MLIR][Presburger] Refactor duplicate division merging to Utils
This patch moves merging of duplicate divisions to presburger utility
functions. This is required to support division merging in structures other
than IntegerPolyhedron.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D118001
2022-01-24 18:12:13 +05:30
SForeKeeper
70f83f3084 [RISCV] add support for zbkx subextension in MC layer.
This patch adds support for zbkx extension from K extension(v1.0.0) in MC layer.
Instructions with same functionality and same encoding is defined in the bitmanip extension.
It defines {Xperm8, Xperm4} as instruction aliases for xperm.* in Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format.

[[ https://reviews.llvm.org/D94999 | D94999 ]] this is the patch that introduces xperm.* instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117889
2022-01-24 20:38:46 +08:00
Florian Hahn
b7f69b8d46
[LV] Name values and blocks in same induction tests (NFC).
This reduces the churn in the test in future updates due to numbering
changes.
2022-01-24 12:28:43 +00:00
LLVM GN Syncbot
54f1d95066 [gn build] Port 3696c70e67 2022-01-24 12:06:49 +00:00
Kerry McLaughlin
8082ab2fc3 [LoopVectorize] Support epilogue vectorisation of loops with reductions
isCandidateForEpilogueVectorization will currently return false for loops
which contain reductions. This patch removes this restriction and makes
the following changes to support epilogue vectorisation with reductions:

- `fixReduction`: If fixReduction is being called during vectorisation of the
    epilogue, the phi node it creates will need to additionally carry incoming
     values from the middle block of the main loop.

- `createEpilogueVectorizedLoopSkeleton`: The incoming values of the phi
    created by fixReduction are updated after the vec.epilog.iter.check block
    is added. The phi is also moved to the preheader of the epilogue.

- `processLoop`: The start value of any VPReductionPHIRecipes are updated before
    vectorising the epilogue loop. The getResumeInstr function added to the ILV
    will return the resume instruction associated with the recurrence descriptor.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D116928
2022-01-24 12:03:31 +00:00
Simon Pilgrim
e4074432d5 [X86] Remove avx512f integer and/or/xor/min/max reduction intrinsics and use generic equivalents
None of these have any reordering issues, and they still emit the same reduction intrinsics without any change in the existing test coverage:

llvm-project\clang\test\CodeGen\X86\avx512-reduceIntrin.c
llvm-project\clang\test\CodeGen\X86\avx512-reduceMinMaxIntrin.c

Differential Revision: https://reviews.llvm.org/D117881
2022-01-24 11:57:53 +00:00
Adrian Vogelsgesang
3696c70e67 [clang-tidy] Add readability-container-contains check
This commit introduces a new check `readability-container-contains` which finds
usages of `container.count()` and `container.find() != container.end()` and
instead recommends the `container.contains()` method introduced in C++20.

For containers which permit multiple entries per key (`multimap`, `multiset`,
...), `contains` is more efficient than `count` because `count` has to do
unnecessary additional work.

While this this performance difference does not exist for containers with only
a single entry per key (`map`, `unordered_map`, ...), `contains` still conveys
the intent better.

Reviewed By: xazax.hun, whisperity

Differential Revision: http://reviews.llvm.org/D112646
2022-01-24 12:57:18 +01:00