399585 Commits

Author SHA1 Message Date
Teresa Johnson
7acd1807dd [Sanitizer] Modify test to avoid bot timeouts
Change the mutex type to one that initializes on construction and
hopefully avoid what appear to be deadlock failures in the new test
on a couple bots, e.g.:

https://green.lab.llvm.org/green/job/clang-stage1-RA/24140/testReport/SanitizerCommon-Unit/__Sanitizer-x86_64-Test/SanitizerCommon_ReportFile/
2021-09-21 18:47:16 -07:00
Chris Lattner
da93829b44 [DialectAsmPrinter] Add missing 'printAttributeWithoutType' member.
DialectAsmParser has a `parseAttribute` member that takes a
contextual type, but DialectAsmPrinter doesn't have the corresponding
member to take advantage of it.  As such, custom attribute
implementations can't really use it.  This adds the obvious missing
method which fills this hole.

Differential Revision: https://reviews.llvm.org/D110211
2021-09-21 18:45:24 -07:00
David Tenty
7a320b279d [libcxx][AIX] Remove locale fallbacks for old OS levels
These routines were add years ago during initial porting attempts to AIX and are mostly build hacks for routines which we're missing at the time, but are available now on recent AIX OS levels.

Thus builds on modern AIX OS levels no longer need these and they cause problems if you try to build the library with a generic triple (i.e. powerpc-ibm-aix) as we'll pull them in and encounter duplicate definitions from the OS.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D110183
2021-09-21 21:14:50 -04:00
Wenlei He
5f187f0afa [SamplePGO] Add switch to honor zero count on block level as accurate
Add a new LLVM switch `-profile-sample-block-accurate` to trust zero block counts for branches. Currently we leave out such zero counts when annotating branch weight metadata, which would lead to weights being considered as unknown.

Differential Revision: https://reviews.llvm.org/D110117
2021-09-21 17:06:37 -07:00
Louis Dionne
f8b1cc3657 [libc++abi] Remove unnecessary atomic_support.h header from libc++abi
The file was a duplicate of atomic_support.h in libc++. Since we now
require the libc++ sources in order to build libc++abi, it's OK to
remove this duplication.

Thanks to @chandlerc for noticing this.

Differential Revision: https://reviews.llvm.org/D110103
2021-09-21 19:55:21 -04:00
Teresa Johnson
56dec4be9b [Sanitizer] Allow setting the report path to create directory
When setting the report path, recursively create the directory as
needed. This brings the profile path support for memprof on par with
normal PGO. The code was largely cloned from __llvm_profile_recursive_mkdir
in compiler-rt/lib/profile/InstrProfilingUtil.c.

Differential Revision: https://reviews.llvm.org/D109794
2021-09-21 16:42:42 -07:00
Toshihito Kikuchi
22ea0cea59 [compiler-rt] [windows] Add more assembly patterns for interception
To intercept the functions in Win11's ntdll.dll, we need to use the trampoline
technique because there are bytes other than 0x90 or 0xcc in the gaps between
exported functions.  This patch adds more patterns that appear in ntdll's
functions.

Bug: https://bugs.llvm.org/show_bug.cgi?id=51721

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D109941
2021-09-21 15:51:58 -07:00
Usman Nadeem
645b8f5365 [AArch64][SVE] Add patterns to generate ADR instruction
Differential Revision: https://reviews.llvm.org/D109665

Change-Id: I9d2928688b80b804a16f52928e2057749ec2c0b2
2021-09-21 15:50:49 -07:00
Yuanfang Chen
45c0ebe00e [libc++] Surpress -Wunused-value warning in variant
The idiom helps with parameter unpacking so the return value is not
important. Make it explicit.
2021-09-21 15:33:10 -07:00
Arthur Eubanks
e42234383e Make DiagnosticInfoResourceLimit's limit param required
And always print it.

This makes some LLVM diagnostics match up better with Clang's diagnostics.

Updated some AMDGPU uses of DiagnosticInfoResourceLimit and now we print
better diagnostics for those.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D110204
2021-09-21 15:27:58 -07:00
Kirill Stoimenov
2649999579 [asan] Fixed a bug causing a crash when redzone optimization kicked in on X86 with -asan-optimize-callbacks flag on.
This change adds the ASan intrinsic to the list whihc are setting hasCopyImplyingStackAdjustment.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D110012
2021-09-21 22:26:03 +00:00
Alex Zinenko
bdaf038266 [mlir] Always create a list of alias scopes when emitting LLVM IR
Previously, the translation to LLVM IR would emit IR that directly uses
a scope metadata node in case only one scope was in use in alias.scopes
or noalias metadata. It should always be a list of scopes. The verifier
change in 8700f2bd36bb9b7d7075ed4dac0aef92b9489237 enforced this and
broke the test. Fix the translation to always create a list of scopes
using a new metadata node, update and reenable the respective test.

Fixes PR51919.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110140
2021-09-22 00:00:46 +02:00
Craig Topper
b81e26c7f4 Recommit "[X86] Clear kill flags when rewriting SETCC uses in flag copy lowering."
This time with the right bug number.

When we rewrite the setcc we replace set old setcc output register
with the new CondReg. But since CondReg can be shared by other
replacements, we don't know if the kill flags for the old register
are valid for CondReg. So be conservative and remove them.

The test case has a SETCCr and a SETCCm on the same condition so
they end up sharing the same CondReg. The SETCCr had one use with
a kill flag. This kill flag isn't valid after the replacement because
CondReg needs a live range extending to the later SETCCm replacment.

Fixes PR51903.
2021-09-21 14:59:25 -07:00
Xu Mingjie
32ab405717 [LTO] Emit DebugLoc for dead function in optimization remarks
Currently, the dead functions information getting from optimizations remarks does not contain debug location, but knowing where these dead functions locate could be useful for debugging or for detecting dead code.

Cause in `LTO::addRegularLTO()` we use `BitcodeModule::getLazyModule()` to read the bitcode module, when we pass Function F to `ore::NV()`, F is not materialized, so `F->getSubprogram()` returns nullptr, and there is no debug location information of dead functions in optimizations remarks.

This patch call `F->materialize()` before we pass Function F to `ore::NV()`, then debug location information will be emitted for dead functions in optimization remarks.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D109737
2021-09-21 14:50:21 -07:00
Joseph Huber
e95731cca7 [OpenMP] Add thread ID function into new RTL
The new device runtime library currently lacks the
`kmpc_get_hardware_thread_id_in_block` function which is currently used
when doing the SPMDzation optimization. This call would be introduced
through the optimization and then cause a linking error because it was
not present. This patch adds support for this runtime call.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D110195
2021-09-21 17:43:50 -04:00
Arthur Eubanks
e1ed02181f [clang] Make -Rpass imply -Rpass=.*
Previously with -Rpass (and friends) we'd have remarks "enabled", but
without an actual regex.

As seen in the test change to line numbers, this can give us better
diagnostics by properly enabling NeedLocTracking with -Rpass.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D110201
2021-09-21 14:35:56 -07:00
Craig Topper
51a82e051e Revert "[X86] Clear kill flags when rewriting SETCC uses in flag copy lowering."
This reverts commit 7550f146ff75667d6e1828d64438dcc23b77f036.

I botched the bug number.
2021-09-21 14:33:44 -07:00
Craig Topper
7550f146ff [X86] Clear kill flags when rewriting SETCC uses in flag copy lowering.
When we rewrite the setcc we replace set old setcc output register
with the new CondReg. But since CondReg can be shared by other
replacements, we don't know if the kill flags for the old register
are valid for CondReg. So be conservative and remove them.

The test case has a SETCCr and a SETCCm on the same condition so
they end up sharing the same CondReg. The SETCCr had one use with
a kill flag. This kill flag isn't valid after the replacement because
CondReg needs a live range extending to the later SETCCm replacment.

Fixes PR51908.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110046
2021-09-21 14:29:46 -07:00
Albion Fung
b93359ea3f [PowerPC] Support for vector bool int128 on vector comparison builtins
This patch implements support for the type vector bool int128
for arguments on vector comparison builtins listed below,
which would otherwise crash due to ambiguity.

The following builtins are added:

vec_all_eq (vector bool __int128, vector bool __int128)
vec_all_ne (vector bool __int128, vector bool __int128)
vec_any_eq (vector bool __int128, vector bool __int128)
vec_any_ne (vector bool __int128, vector bool __int128)
vec_cmpne(vector bool __int128 a, vector bool __int128 b)
vec_cmpeq(vector bool __int128 a, vector bool __int128 b)

Differential revision: https://reviews.llvm.org/D110084
2021-09-21 16:29:37 -05:00
Sanjay Patel
52832cd917 [CodeGen] regenerate test checks; NFC
This broke with 2f6b07316f56 because it wrongly runs the entire LLVM optimizer.
2021-09-21 16:53:41 -04:00
George Burgess IV
cd5f582c3d MemoryBuiltins: update comment; NFC
This comment references behavior that was removed in
ccae43a247b0791f78ea89b9cb7e59fa70f5000d, which is a commit from 5 years
ago. It seems safe to assume that that behavior won't be coming back
soon. If it does, we can readd this part of the comment :)
2021-09-21 13:47:26 -07:00
Giorgis Georgakoudis
ac90dfc43a Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 1d66649adf28d48ae1731516d87fb899426e3349.

Revert to fix AMG GPU issue.
2021-09-21 13:20:39 -07:00
Arthur O'Dwyer
c9af0e61fa [libc++] counting_semaphore should not be default-constructible.
Neither the current C++2b draft, nor any revision of [p1135],
nor libstdc++, claims that `counting_semaphore` should be
default-constructible. I think this was just a copy-paste issue
somehow.

Also, `explicit` was missing from the constructor.

Also, `constexpr` remains missing; but that's probably more of a
technical limitation, since apparently there are some platforms
where we don't (can't??) use the atomic implementation and
have to rely on pthreads, which obviously isn't constexpr.

Differential Revision: https://reviews.llvm.org/D110042
2021-09-21 16:19:31 -04:00
Sanjay Patel
2f6b07316f [InstCombine] fold cast of right-shift if high bits are not demanded
(masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C

Narrowing the shift should be better for analysis and can lead
to follow-on transforms as shown.

Attempt at a general proof in Alive2:
https://alive2.llvm.org/ce/z/tRnnSF

Here are a couple of the specific tests:
https://alive2.llvm.org/ce/z/bCnTp-
https://alive2.llvm.org/ce/z/TfaHnb

Differential Revision: https://reviews.llvm.org/D110170
2021-09-21 16:09:08 -04:00
Antonio Frighetto
43d6991c2a [IR] Look through bitcast in hasFnAttribute()
A logic incompleteness may lead MemorySSA to be too conservative
in its results. Specifically, when dealing with a call of kind
`call i32 bitcast (i1 (i1)* @test to i32 (i32)*)(i32 %1)`, where
the function `test` is declared with readonly attribute, the
bitcast is not looked through, obscuring function attributes. Hence,
some methods of CallBase (e.g., doesNotReadMemory) could provide
suboptimal results.

Differential Revision: https://reviews.llvm.org/D109888
2021-09-21 21:57:02 +02:00
Usman Nadeem
248342b7c7 [OpenMP][OMPD] Fix compile error when OMPD is not supported
Differential Revision: https://reviews.llvm.org/D110120

Change-Id: I9d39dacfab5b7fbab37ee4b4d960d51e0892b24d
2021-09-21 12:45:15 -07:00
Nikita Popov
e4a1af3724 [MergeICmps] Remove unused NumMerged variable 2021-09-21 21:43:25 +02:00
Matheus Izvekov
d9308aa39b [clang] don't mark as Elidable CXXConstruct expressions used in NRVO
See PR51862.

The consumers of the Elidable flag in CXXConstructExpr assume that
an elidable construction just goes through a single copy/move construction,
so that the source object is immediately passed as an argument and is the same
type as the parameter itself.

With the implementation of P2266 and after some adjustments to the
implementation of P1825, we started (correctly, as per standard)
allowing more cases where the copy initialization goes through
user defined conversions.

With this patch we stop using this flag in NRVO contexts, to preserve code
that relies on that assumption.
This causes no known functional changes, we just stop firing some asserts
in a cople of included test cases.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D109800
2021-09-21 21:41:20 +02:00
Dan F-M
33e1713a00 [Bazel] Add support for targeting macOS arm64
In attempting to build JAX on Apple Silicon, we discovered an issue with
the bazel configuration in llvm-project-overlay. This patch fixes the
logic, at least when building JAX. More context is included on the
following GitHub issue: https://github.com/google/jax/issues/5501

Differential Revision: https://reviews.llvm.org/D109839
2021-09-21 12:24:16 -07:00
Nikita Popov
f2fa6ad047 [MergeICmps] Don't reorder unmerged comparisons
MergeICmps will currently sort (by offset) all comparisons in a chain,
including those that do not get merged. This is problematic in two ways:

 * We may end up moving the original first block into the middle of
   the chain, in which case the "extra work" instructions will also
   be in the middle of the chain, resulting in invalid IR
   (reported in https://reviews.llvm.org/D108782#3005583).
 * Reordering branches is generally not legal, because it may
   introduce branch on poison, which is UB (PR51845). The merging
   done by MergeICmps is legal as long as we assume that memcmp()
   works on frozen memory, but the reordering of unmerged comparisons
   is definitely incorrect (without inserting freeze instructions),
   so we should avoid it.

There are easier ways to fix the first issue, but I figured it was
worthwhile to do this properly to also fix the second one. What we
now do is to restore the original relative order of (potentially
merged) comparisons.

I took the liberty of dropping the MERGEICMPS_DOT_ON functionality,
because it would be more awkward to implement now (as the before and
after representation is different) and it doesn't seem terribly
useful nowadays.

Differential Revision: https://reviews.llvm.org/D110024
2021-09-21 21:22:12 +02:00
David Blaikie
40e971a210 nullptr printing - update for a change to clang type printing that now uses "std::nullptr_t" 2021-09-21 11:50:24 -07:00
Christian Sigg
9149ae09bd Support value-typed references in iterator facade's operator->()
Add a PointerProxy similar to the existing iterator_facade_base::ReferenceProxy and return it from the arrow operator. This prevents iterator facades with a reference type that is not a true reference to take the address of a temporary.

Forward the reference type of the mapped_iterator to the iterator adaptor which in turn forwards it to the iterator facade. This fixes mlir::op_iterator::operator->() to take the address of a temporary.

Make some polishing changes to op_iterator and op_filter_iterator.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D109490
2021-09-21 20:42:22 +02:00
David Blaikie
49c519a848 DebugInfo: Rebuild decltype(nullptr) as 'std::nullptr_t'
Now that Clang's been changed to render nullptr types/template
parameters as 'std::nullptr_t' do the same thing down here.

(Clang commit: 131e8786640a49daf533b7ead4d3b5b82e0aea2a )
2021-09-21 11:37:30 -07:00
Michael Liao
2d1ffad010 [IR] Re-group AAMDNodes relevant interfaces. NFC. 2021-09-21 14:29:33 -04:00
David Blaikie
131e878664 Print nullptr_t namespace qualified within std::
This improves diagnostic (& important to me, DWARF) accuracy - otherwise
there could be ambiguities between "std::nullptr_t" and some user-defined
type that's /actually/ "nullptr_t" defined in the global namespace.

Differential Revision: https://reviews.llvm.org/D110044
2021-09-21 11:21:40 -07:00
alex-t
1a33294652 [AMDGPU] Filtering out the inactive lanes bits when lowering copy to SCC
Normally, given that the DA results are kept consistent over the selection DAG, uniform comparisons get selected to S_CMP_* but divergent to V_CMP_*.  Sometimes, for the sake of efficiency,  SSA subgraphs may be converted to VALU to avoid repeatedly copying data back and forth. Hence we have to be able to sustain the correctness passing the i1 from VALU to SALU context and vice versa.

VALU operations only process the active lanes of the VGPR and ignore inactive ones.
Active lanes correspond to 1 bit in the EXEC mask register.
SALU represents i1 as just one bit but VALU as 64bits: 0/1 and 0/(0xffffffffffffffff & EXEC) respectively.
SALU uses one-bit conditional flag SCC but VALU - VCC that is a pair of 32-bit SGPRs

To expose SCC to the VALU context we need to convert the one-bit boolean value to the appropriate 64bit.
To return back to the SALU context we need to do the opposite.

To correctly convert 64bit VALU boolean to either 0 or 1 we need to filter out the bits corresponding to the inactive lanes.

Reviewed By: piotr

Differential Revision: https://reviews.llvm.org/D109900
2021-09-21 21:19:31 +03:00
Owen Anderson
b5fbbdd202 Teach InstCombine to eliminate malloc-realloc-free triplets.
Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D109988
2021-09-21 18:07:49 +00:00
Brendon Cahoon
cbdf624bb8 [AMDGPU] Correctly merge alias.scope and noalias metadata for memops
When adding alias.scope and noalias metadata to a memcpy function,
the alias.scope and noalias metadata from the operands are merged.
The rule for merging alias.scope is to take the intersection of
the domains and the union of the scopes within those domains.
The rule for merging noalias is to take the intersection.

The bug is that AMDGPULowerModuleLDS was using concatenation for
both alias.scope and noalias. For example, when f1 and f2 are added
to the LDS structure and there is a memcpy(f2, f1, sizeof(f1)).
Then, concatenation creates noalias metadata for the memcpy that
includes both {f1, f2}. That means that the memcpy is assumed
not to alias a prior load of f2, which enables the optimizer to
remove a load of f2 that occurs after mempcy.

The function MDNode::getmostGenericAliasScope defines the semantics
for alias.scope. There is a function, combineMetadata in Local.cpp,
that uses intersect for noalias.

Differential Revision: https://reviews.llvm.org/D110049
2021-09-21 13:02:01 -05:00
Craig Topper
7c975665b4 [RISCV] Make some arrays of constants 'static const'. NFC
This helps the compiler generate better code.
2021-09-21 10:52:47 -07:00
Danila Malyutin
78b51c7a2c [LSR] Make sure that Factor fits into Base type
Fixes pr42770

Differential Revision: https://reviews.llvm.org/D108772
2021-09-21 20:50:50 +03:00
Giorgis Georgakoudis
1d66649adf [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6

Differential Revision: https://reviews.llvm.org/D102107
2021-09-21 10:50:04 -07:00
Amy Kwan
2af57b6099 [PowerPC] Add prefix load pattern for fpext to v2f64
This patch adds a prefixed load pattern involving v2f32 fpext v2f64, where we
are dealing with a value with an offset that fits into a 34-bit signed immediate.
A reduced test case is also added to patch that tests the pattern, in which the
pattern is tested in the big endian CHECKs of the newly added test.

Differential Revision: https://reviews.llvm.org/D109887
2021-09-21 12:45:24 -05:00
Ayal Zaks
ab6a69dfea [LV] Fix crash for reverse interleaved loads with gap under fold-tail.
This patch fixes the crash found by PR51614:
whenever doing tail folding, interleave groups must be considered under mask.

Another fix D108900 follows for targets that support masked loads and stores:
when *deciding* to vectorize with masked interleave groups, check if the access
is reverse - which is currently not supported; rather than (only) asserting when
computing cost and generating code.

Differential Revision: https://reviews.llvm.org/D108891
2021-09-21 20:13:32 +03:00
Craig Topper
aeb63d464f [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for and/or/xor.
This requires a minor change to CodeGenPrepare to ensure that
shouldSinkOperands will be called for And.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D110106
2021-09-21 10:07:29 -07:00
Nico Weber
908c115442 [lldb/win] Default to native PDB reader when LLVM_ENABLE_DIA_SDK=NO
Trying to use the DIA SDK reader only to fail with "DIA SDK wasn't enabled"
isn't very useful. The native PDB reader is missing some stuff, but it's still
better than nothing.

Reduces number of lldb-check-shell test failures with LLVM_ENABLE_DIA_SDK=NO
from 27 to 15.

Differential Revision: https://reviews.llvm.org/D110172
2021-09-21 13:02:52 -04:00
LLVM GN Syncbot
a3bb4f1455 [gn build] Port a04a6ce7726b 2021-09-21 16:34:07 +00:00
Mark de Wever
a04a6ce772 [libc++][format] Adds parser std-format-spec.
This implements the generic std.format.spec framework for all types.

The Unicode support will be added in a separate patch.

Implements parts of:
- P0645 Text Formatting

Completes:
- LWG-3242 std::format: missing rules for arg-id in width and precision
- P1892 Extended locale-specific presentation specifiers for std::format

Reviewed By: #libc, ldionne, vitaut

Differential Revision: https://reviews.llvm.org/D103368
2021-09-21 18:29:58 +02:00
cchen
8c68bd480f [OpenMP][NFC] Add declare variant and metadirective to support page 2021-09-21 11:28:13 -05:00
Aaron Ballman
73a8bcd789 Revert "Diagnose -Wunused-value based on CFG reachability"
This reverts commit 63e0d038fc20c894a3d541effa1bc2b1fdea37b9.

It causes test failures:

http://lab.llvm.org:8011/#/builders/119/builds/5612
https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8835548361443044001/+/u/clang/test/stdout
2021-09-21 12:25:13 -04:00
Dávid Bolvanský
c0fdfc9af2 [InstCombine] powi(x, y) * powi(x, z) -> powi(x, y + z)
We already have pow(x, y) * pow(x, z) -> pow(x, y + z) transformation, but we are missing same transformation for powi (power is integer).

Requires reassoc.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D109954
2021-09-21 18:20:46 +02:00