423449 Commits

Author SHA1 Message Date
Simon Pilgrim
ec6024d081 [X86] Replace avx512f integer mul reduction builtins with generic builtin
D117829 added the generic "__builtin_reduce_mul" which we can use to replace the x86 specific integer mul reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required.

Differential Revision: https://reviews.llvm.org/D125222
2022-05-09 14:10:28 +01:00
Nikita Popov
33f02de5df [ScalarEvolution] Add tests for umin_seq with non-zero operand (NFC) 2022-05-09 15:03:12 +02:00
Rosie Sumpter
1a2665902f [AArch64][SVE] Improve codegen when extracting first lane of active lane mask
When extracting the first lane of a predicate created using the
llvm.get.active.lane.mask intrinsic, it should give the same codegen as
when the predicate is created using the llvm.aarch64.sve.whilelo
intrinsic, since get.active.lane.mask is lowered to whilelo. This patch
ensures the codegen is the same by recognizing
llvm.get.active.lane.mask as a flag-setting operation in this case.

Differential Revision: https://reviews.llvm.org/D125215
2022-05-09 13:56:04 +01:00
Sam McCall
a316a9815a [clangd] Rewrite TweakTesting helpers to avoid reparsing the same code. NFC
Previously the EXPECT_AVAILABLE macros would rebuild the code at each marked
point, by expanding the cases textually.
There were often lots, and it's nice to have lots!

This reduces total unittest time by ~10% on my machine.
I did have to sacrifice a little apply() coverage in AddUsingTests (was calling
expandCases directly, which was otherwise unused), but we have
EXPECT_AVAILABLE tests covering that, so I don't think there's real risk here.

Differential Revision: https://reviews.llvm.org/D125109
2022-05-09 14:53:00 +02:00
Florian Hahn
41e142fdc7 Recommit "[SimpleLoopUnswitch] Collect either logical ANDs/ORs but not both."
This reverts commit 7211d5ce07830ebfa2cfc30818cd7155375f7e47.

This version fixes a crash that caused buildbot failures with the first
version.
2022-05-09 13:49:12 +01:00
Florian Hahn
4c569ceeaa [SimpleLoopUnswitch] Add test case for crash with db7a87ed4fa7. 2022-05-09 13:48:56 +01:00
Sam McCall
bb53eb1ef4 [clangd] Skip extra round-trip in parsing args in debug builds. NFC
This is a clever cross-cutting sanity test for clang's arg parsing I suppose.
But clangd creates thousands of invocations, ~all with identical trivial
arguments, and problems with these would be caught by clang's tests.
This overhead accounts for 10% of total unittest time!

Differential Revision: https://reviews.llvm.org/D125169
2022-05-09 14:45:35 +02:00
Sam McCall
bf9921adb9 [clangd] Disable predefined macros in tests. NFC
These aren't needed. With them the generated predefines buffer is 13KB.
For every TestTU, we must:
 - generate the buffer (3 times: parsing preamble, scanning preamble, main file)
 - parse the buffer (again 3 times)
 - serialize all the macros it defines in the PCH
 - compress the buffer itself to write it into the PCH
 - decompress it from the PCH

Avoiding this reduces unit test time by ~25%.

Differential Revision: https://reviews.llvm.org/D125172
2022-05-09 14:44:51 +02:00
David Sherwood
45f2e92d97 [NFC][LoopVectorize] Add SVE test for tail-folding combined with interleaving
Differential Revision: https://reviews.llvm.org/D125001
2022-05-09 13:08:25 +01:00
Nathan Sidwell
e48cd7088b [demangler] Buffer peeking needs buffer
The output buffer has a 'back' member, which returns NUL when you try
it with an empty buffer.  But there are no use cases that need that
additional functionality.  This makes the 'back' member behave more
like STL containers' back members.  (It still returns a value, not a
reference.)

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D123201
2022-05-09 04:17:22 -07:00
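A minimal sketch of the `back` change described in e48cd7088b above. `SketchBuffer` and `endsWith` are hypothetical names used only for illustration, not the demangler's actual class or helpers; the point is that `back()` now has a non-empty precondition, as the STL containers do, and callers check for emptiness themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the demangler's output buffer; not the real class.
class SketchBuffer {
  std::string Data;

public:
  SketchBuffer &operator+=(char C) {
    Data.push_back(C);
    return *this;
  }

  bool empty() const { return Data.empty(); }
  std::size_t size() const { return Data.size(); }

  // Like the STL containers, back() requires a non-empty buffer.
  // It still returns the character by value, not by reference.
  char back() const {
    assert(!Data.empty() && "back() called on an empty buffer");
    return Data.back();
  }
};

// A caller that previously relied on a NUL result checks emptiness first.
inline bool endsWith(const SketchBuffer &B, char C) {
  return !B.empty() && B.back() == C;
}
```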
Simon Pilgrim
8a92c45e07 [Clang] Add integer mul reduction builtin
Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.mul intrinsic call.

For other reductions, we've tried to share builtins for float/integer vectors, but the fmul reduction intrinsic also takes a starting value argument and can do either unordered or serialized reductions, but not reduction trees as specified for the builtins. However we address fmul support, this shouldn't affect the integer case.

Differential Revision: https://reviews.llvm.org/D117829
2022-05-09 12:12:53 +01:00
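A short usage sketch for the builtin added in 8a92c45e07 above, assuming a 4 x i32 vector declared with the GCC/Clang `vector_size` extension. `reduceMul` and `v4si` are illustrative names; the fallback loop is only there so the snippet also builds with compilers that lack `__builtin_reduce_mul`.

```cpp
#include <cassert>

// GCC/Clang vector extension: four 32-bit ints in one 128-bit vector.
typedef int v4si __attribute__((vector_size(16)));

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Multiply all four lanes together.
inline int reduceMul(v4si V) {
#if __has_builtin(__builtin_reduce_mul)
  // With the builtin available, this lowers to llvm.vector.reduce.mul.
  return __builtin_reduce_mul(V);
#else
  // Portable fallback for compilers without the builtin.
  int Product = 1;
  for (int I = 0; I < 4; ++I)
    Product *= V[I];
  return Product;
#endif
}
```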
Nathan James
12cb540529 [clang-tidy][NFC] Replace many instances of std::string where a StringRef would suffice.
There are many instances in clang-tidy checks where owning strings are used when we already have a stable string from the options, so using a StringRef makes much more sense.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D124341
2022-05-09 12:01:46 +01:00
Sigurður Ásgeirsson
fc440f27cd Filter non-external static members from SBType::GetFieldAtIndex.
See [[ https://github.com/llvm/llvm-project/issues/55040 | issue 55040 ]] where static members of classes declared in the anonymous namespace are incorrectly returned as member fields from lldb::SBType::GetFieldAtIndex(). It appears that attrs.member_byte_offset contains a sentinel value for members that don't have a DW_AT_data_member_location.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D124409
2022-05-09 12:34:13 +02:00
Alban Bridonneau
fef81131d9 [SVE] Optimize new cases for lowerConvertToSVBool
Converts to SVBool are already considered as a nop, if they
are converting an operand from a ptrue or a cmp, because
they zero the extra predicate lanes by construction.

This patch adds 2 similar cases:
- Wide cmps, which were not directly recognized by the test
for other forms of cmp
- Splats of 1, which will be generated as ptrue, and as such
will also zero the extra predicate lanes.

Reviewed By: paulwalker-arm, peterwaller-arm

Differential Revision: https://reviews.llvm.org/D124908
2022-05-09 10:17:57 +00:00
Benjamin Kramer
a48adc5658 [mlir][math] Promote (b)f16 to f32 when lowering to libm calls
libm doesn't have overloads for the small types, so promote them to a
bigger type and use the f32 function.

Differential Revision: https://reviews.llvm.org/D125093
2022-05-09 11:59:55 +02:00
Pavel Labath
ae7fe65cf6 [lldb/DWARF] Fix linking direction in CopyUniqueClassMethodTypes
IIUC, the purpose of CopyUniqueClassMethodTypes is to link together
class definitions in two compile units so that we only have a single
definition of a class. It does this by adding entries to the die_to_type
and die_to_decl_ctx maps.

However, the direction of the linking seems to be reversed. It is taking
entries from the class that has not yet been parsed, and copying them to
the class which has been parsed already -- i.e., it is a very
complicated no-op.

Changing the linking order allows us to revert the changes in D13224
(while keeping the associated test case passing), and is sufficient to
fix PR54761, which was caused by an undesired interaction with that
patch.

Differential Revision: https://reviews.llvm.org/D124370
2022-05-09 11:47:55 +02:00
Martin Storsjö
61f9ec5e61 [libcxx] [test] Fix the nasty_macros test on Windows on ARM/ARM64
Unfortunately, this isn't a configuration that we can practically
add to the CI at the moment, but I do run the tests
sporadically offline in this configuration.

Differential Revision: https://reviews.llvm.org/D124993
2022-05-09 12:46:41 +03:00
Marek Kurdej
85ec8a9ac1 [clang-format] Correctly handle SpaceBeforeParens for builtins.
That's a partial fix for https://github.com/llvm/llvm-project/issues/55292.

Before, known builtins behaved differently from other identifiers:
```
void f () { return F (__builtin_LINE() + __builtin_FOO ()); }
```
After:
```
void f () { return F (__builtin_LINE () + __builtin_FOO ()); }
```

Reviewed By: owenpan

Differential Revision: https://reviews.llvm.org/D125085
2022-05-09 11:42:41 +02:00
Philipp Tomsich
91b24b0180 [AArch64] Ampere1 does not support MTE
The initial support for the Ampere1 mistakenly signalled support for
the MTE feature.  However, the core does not include the optional MTE
functionality.

Update the target parser to not include MTE for Ampere1.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D125191
2022-05-09 11:29:42 +02:00
Rahul Anand R
7dcd0ea683 [AArch64] Generate AND in place of CSEL for predicated CTTZ
This patch implements a target-specific optimization that replaces
the cmp and csel from cttz with an and mask.

Differential Revision: https://reviews.llvm.org/D123782
2022-05-09 10:28:20 +01:00
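The idea behind 7dcd0ea683 above can be checked in plain C++. This sketch assumes the guarded source pattern is `x == 0 ? 0 : cttz(x)` on 32-bit values and that the unguarded count returns the bit width (32) for a zero input; `cttz32`, `guardedCttz` and `maskedCttz` are illustrative names, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros, returning the bit width (32) for a zero input.
inline unsigned cttz32(std::uint32_t X) {
  unsigned N = 0;
  while (N < 32 && ((X >> N) & 1u) == 0)
    ++N;
  return N;
}

// Guarded form: a compare plus a conditional select.
inline unsigned guardedCttz(std::uint32_t X) {
  return X == 0 ? 0u : cttz32(X);
}

// Masked form: 32 & 31 == 0 covers the zero input, and every other
// result (0..31) is left unchanged by the mask.
inline unsigned maskedCttz(std::uint32_t X) {
  return cttz32(X) & 31u;
}
```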
Pavel Labath
fa593b079b Revert "[lldb] parallelize calling of Module::PreloadSymbols()"
This reverts commit b7d807dbcff0d9df466e0312b4fef57178d207be -- it
breaks TestMultipleDebuggers.py.
2022-05-09 11:11:01 +02:00
Florian Hahn
61bb2e4ea8 [ConstraintElimination] Add initial ssub.with.overflow tests. 2022-05-09 10:02:59 +01:00
Marek Kurdej
50cd52d935 [clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing parenthesis is followed by a newline.
Fixes https://github.com/llvm/llvm-project/issues/54522.

This fixes a regression introduced in 5e5efd8a91.

Before the culprit commit, macros in WhitespaceSensitiveMacros were correctly formatted even if their closing parenthesis wasn't followed by a semicolon (or, to be precise, when it was followed by a newline).
That commit changed the macro token type from TT_UntouchableMacroFunc to TT_FunctionLikeOrFreestandingMacro.

Correct formatting (with `WhitespaceSensitiveMacros = ['FOO']`):
```
FOO(1+2)
FOO(1+2);
```

Regressed formatting:
```
FOO(1 + 2)
FOO(1+2);
```

Reviewed By: HazardyKnusperkeks, owenpan, ksyx

Differential Revision: https://reviews.llvm.org/D123676
2022-05-09 10:59:33 +02:00
David Green
02f8519502 [DAG] Prevent infinite loop combining bitcast shuffle
This prevents an infinite loop from D123801, where code trying to reduce
the total number of bitcasts, but also handling constants, could create
the opposite transform. Prevent the transform in these cases to let the
bitcast of a constant transform naturally.

Fixes #55345
2022-05-09 09:36:22 +01:00
Ben Shi
d2c4ac979b [AVR] Add PrintMethod for operand memspi
Reviewed By: Patryk27

Differential Revision: https://reviews.llvm.org/D124913
2022-05-09 08:31:49 +00:00
Abinav Puthan Purayil
7f6489d0e3 [AMDGPU] Regenerate checks in a mir test 2022-05-09 13:28:09 +05:30
Jean Perier
ed0341788a [flang] retain binding label of entry subprograms
When processing an entry-stmt in name resolution, attrs_ was
reset before SetBindNameOn was called, causing the symbol to lose
the binding label information.

Differential Revision: https://reviews.llvm.org/D125097
2022-05-09 09:50:17 +02:00
Hongtao Yu
a4190037fa [CSSPGO][Preinliner] Use linear threshold to drive inline decision.
The per-callsite size threshold used today to drive the preinline decision is based on hotness/coldness cutoffs. In the default setup, callsites with a sample count above the hotness cutoff (99%) use a size threshold of 1500, and any callsite below the 99.99% coldness cutoff uses a zero threshold. This has a couple of issues:

1. While both cutoffs and size thresholds are configurable, different applications may need different setups, making a universal setup impractical.

2. The callsites between the hotness cutoff and the coldness cutoff are not considered as inline candidates, which could be a missed opportunity.

3. Hot callsites always use the same threshold. In reality we may want a bigger threshold for hotter callsites.

In this change we are introducing a linear threshold regardless of hot/cold cutoffs. Given a sample space, a threshold is computed for a callsite based on the position of that callsite sample in the whole space. With that we no longer need to define what's hot or cold. Callsites with different hotness will get a different threshold. This should overcome the above three issues.

I have seen good results with a universal default setup for two of our internal services.

For one service, 0.2% to 0.5% perf improvement over a baseline with a previous default setup, on-par code size.
For the second service, 0.5% to 0.8% perf improvement over a baseline with a previous default setup, 0.2% code size increase; on-par performance and code size with a baseline that is with a carefully tuned cutoff to cover enough hot functions.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D125023
2022-05-08 22:07:58 -07:00
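One possible reading of the linear threshold in a4190037fa above, as a rough sketch only. The commit message does not give the formula, so everything here is an assumption: `linearThreshold`, its parameters, and the choice of rank among sorted callsite samples as the "position in the sample space" are hypothetical and not taken from the Preinliner.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the Preinliner's code: map a callsite's sample
// count to a size threshold that grows linearly with the callsite's rank
// among all observed callsite samples (SortedSamples must be ascending).
inline unsigned linearThreshold(const std::vector<std::uint64_t> &SortedSamples,
                                std::uint64_t CallsiteSamples,
                                unsigned MaxThreshold) {
  if (SortedSamples.empty())
    return 0;
  // Rank = number of samples that are <= CallsiteSamples.
  auto It = std::upper_bound(SortedSamples.begin(), SortedSamples.end(),
                             CallsiteSamples);
  std::uint64_t Rank = static_cast<std::uint64_t>(It - SortedSamples.begin());
  // Linear interpolation between 0 and MaxThreshold, with no hot/cold cutoff.
  return static_cast<unsigned>(Rank * MaxThreshold / SortedSamples.size());
}
```

Under this assumed scheme a hotter callsite never gets a smaller threshold than a colder one, and callsites in the middle of the distribution still receive a non-zero budget.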
Christopher Bate
9879807393 [mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types
Adds missing logic in the lowering from NvGPU to NVVM to support fp32
(in an accumulator operand) and tf32 (in multiplicand operand) types.
Fixes logic in one of the helper functions for converting the result
of a mma.sync operation with multiple 8x256bit output tiles, which is
the case for f32 outputs.

Differential Revision: https://reviews.llvm.org/D124533
2022-05-08 21:49:42 -06:00
Peixin-Qiao
c207e36025 [flang] Enforce a program not including more than one main program
As Fortran 2018 5.2.2 states, a program shall consist of exactly one
main program. Add this semantic check.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D125186
2022-05-09 10:48:06 +08:00
Xiaodong Liu
36d4f42c36 [lld] Fix typo for processAux; NFC
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125163
2022-05-09 10:21:47 +08:00
Alexander Yermolovich
3abb68a626 [BOLT][DWARF] Fix assert for split dwarf.
Fixing a small bug where it would assert if CU does not modify .debug_addr section.

Differential Revision: https://reviews.llvm.org/D125181
2022-05-08 19:18:17 -07:00
Simon Pilgrim
9a12138b5f [SLP][X86] Add test coverage for PR50392 / Issue #49736 2022-05-08 19:40:04 +01:00
Tue Ly
6d92f4022d [libc][Obvious] Fix cmake usage of list PREPEND (unavailable pre-3.15). 2022-05-08 13:58:05 -04:00
Tue Ly
13f358376a [libc] Add LINK_LIBRARIES option to add_fp_unittest and add_libc_unittest.
This is needed to prepare for adding FLAGS option.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D125055
2022-05-08 17:33:45 +00:00
Simon Pilgrim
7e3aa70668 [X86] Add test coverage for PR26515 / Issue #26889 2022-05-08 18:19:04 +01:00
Groverkss
4d1fd705f0 [docs] Add Office Hours for Tobias Grosser 2022-05-08 21:14:31 +05:30
Simon Pilgrim
6824cf1ab7 [X86] Set some more plausible latencies for horizontal add/subs on znver1
These are all microcoded/multi-pipe nightmares on Ryzen, but we shouldn't just be using the WriteMicrocoded class, which is for REALLY bad microcoded nightmares. Instead, use the same approximate latencies as znver2 (Agner and uops.info both suggest similar values), and make sure we use the FPU defs for both.

Fixes #53242
2022-05-08 15:48:42 +01:00
Simon Pilgrim
800d36cf32 [DAG] Only perform the fold (A-B)+(C-D) --> (A+C)-(B+D) when both inner subs have one use
Fixes #51381
2022-05-08 13:51:58 +01:00
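The fold in 800d36cf32 above relies on an identity that holds in wrapping integer arithmetic; the sketch below checks it on unsigned 32-bit values. `addOfSubs` and `subOfAdds` are illustrative names. The one-use reasoning in the comments is an inference from the commit title rather than something the message states.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: two subtractions feeding an add (three operations).
inline std::uint32_t addOfSubs(std::uint32_t A, std::uint32_t B,
                               std::uint32_t C, std::uint32_t D) {
  return (A - B) + (C - D);
}

// Folded shape: two adds feeding one subtraction (still three operations).
// If (A - B) or (C - D) had another user, that subtraction would have to be
// kept as well, so the fold would increase the operation count.
inline std::uint32_t subOfAdds(std::uint32_t A, std::uint32_t B,
                               std::uint32_t C, std::uint32_t D) {
  return (A + C) - (B + D);
}
```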
Simon Pilgrim
5a6792a146 [X86] combine-add.ll - add test case for PR52039 / Issue #51381
Also split AVX1/AVX2 test coverage
2022-05-08 13:45:23 +01:00
Luo, Yuanke
d5d498f9ba [X86][AMX] Simplify AMX test case.
Extract test for zero tile configure into a small test case.
2022-05-08 19:12:54 +08:00
Simon Pilgrim
751005a2ca [SLP][X86] Add test coverage for PR42652 / Issue #41997 2022-05-08 12:09:14 +01:00
Simon Pilgrim
7d94597048 [SLP][X86] Add test coverage for PR41892 / Issue #41237 2022-05-08 11:40:53 +01:00
Simon Pilgrim
2233a61500 [SLP][X86] Add test coverage for PR49934 / Issue #49278
D124284 should help us vectorize the sub-128-bit vector cases
2022-05-08 11:33:01 +01:00
Simon Pilgrim
96d2d2508e [SLP][X86] Add test coverage for PR47491 / Issue #46835
D124284 should help us vectorize the sub-128-bit vector cases
2022-05-08 11:24:46 +01:00
Simon Pilgrim
993d9462e1 [InstCombine] Add test coverage for PR43261 / Issue #42606 2022-05-08 11:10:49 +01:00
Simon Pilgrim
72eb630207 [Headers][X86] Enable basic Wdocumentation testing on X86 headers
First part of Issue #35297 - we want to enable Wdocumentation-pedantic as well, but need '\n' support first which Issue #55319 is addressing
2022-05-08 10:53:28 +01:00
Simon Pilgrim
6b3a111a28 [Headers][X86] Replace \operation with \code{.operation}
\operation ... \endoperation are not valid doxygen commands and cause issues when -Wdocumentation is enabled (Issue #35297)

This patch proposes to replace them with \code{.operation} ... \endcode blocks so that the pseudo-code is correctly retained in any documentation and downstream can use the ".operation" type for its own formatting.

Differential Revision: https://reviews.llvm.org/D125170
2022-05-08 10:46:26 +01:00
David Green
6f9e1ea0ef [VectorCombine] Attempt to fold select shuffles from reductions
Given a commutative reduction leading from a shuffle, the order of the
lanes on the shuffle is not important for the result. This means we can
reorder the shuffle to something simpler, which we attempt by shuffling
the first vector lanes first. This was D123494.

The new shuffle may not be profitable though, and if it is not we can
try the folding of select shuffles from D123911. This, with some
adjustment as the output lane ordering is now unimportant, can allow the
final shuffle to simplify given the inputs to the patterns from D123911.
Whereas each transformation on its own is not profitable, the
combination is.

We can only support a single shuffle when called from reductions, but we
are able to sort the ReconstructMask, potentially allowing it to
simplify to an identity or concat mask.

Differential Revision: https://reviews.llvm.org/D125086
2022-05-08 10:32:41 +01:00
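The premise of 6f9e1ea0ef above, that lane order does not matter to a commutative reduction, can be illustrated with a small scalar model. `shuffle` and `reduceAdd` are illustrative helpers on 4-lane arrays, not VectorCombine code, and the sketch assumes the mask is a permutation, so every lane is used exactly once.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Apply a shuffle mask: lane I of the result is lane Mask[I] of the input.
inline std::array<int, 4> shuffle(const std::array<int, 4> &V,
                                  const std::array<std::size_t, 4> &Mask) {
  std::array<int, 4> R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = V[Mask[I]];
  return R;
}

// Commutative reduction: the sum of all lanes.
inline int reduceAdd(const std::array<int, 4> &V) {
  int Sum = 0;
  for (int X : V)
    Sum += X;
  return Sum;
}
```

Because the reduction gives the same result for any permutation mask, a complicated mask can be swapped for a simpler one, such as the identity, without changing the reduced value.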
Simon Pilgrim
f2b1648812 [X86] Fix some signedness errors in x86 headers
Another step toward enabling full -Wsystem-headers testing across all x86 headers

Fix a number of cases where the arg / return value signedness doesn't match the C/C++ intrinsic.

So far I've just added explicit casts as necessary, but we might want to address some of the mismatches directly

Differential Revision: https://reviews.llvm.org/D125164
2022-05-08 09:42:58 +01:00