Commit Graph

379573 Commits

Author SHA1 Message Date
Rong Xu
db0d7d0ba9 [SampleFDO][NFC] Refactor SampleProfileLoader to reuse in CodeGen
Break SampleProfileLoader into to a base and a derived class.
Base class (SampleProfileLoaderBaseImpl) includes the common
code for IR and MachineIR (CodeGen) sample loader.
It will be templatelized in the later patch.

Inline and Probe related code will remain in the derived class of
SampleProfileLoader and stays in SampleProfile.cpp.

We need to refactor some functions:
(1) getInstWeight() to enable the code sharing -- put the core into
getInstWeightImpl().
(2) emitAnnotation() and propagateWeights() to carve out the code
specific to SampleProfileLoader.
(3) make getInstWeight() and findFunctionSamples() virtual and override
in SampleProfileLoader as they need to access the fields in the derived
class.

Differential Revision: https://reviews.llvm.org/D95832
2021-02-10 13:29:15 -08:00
Jessica Paquette
9283058abb [AArch64][GlobalISel] Fold G_ADD into the cset for G_ICMP
When we have a G_ADD which is fed by a G_ICMP on one side, we can fold it into
the cset for the G_ICMP.

e.g. Given

```
%cmp = G_ICMP ... %x, %y
%add = G_ADD %cmp, %z
```

We would normally emit a cmp, cset, and add.

However, `%add` is either `%z` or `%z + 1`. So, we can just use `%z` as the
source of the cset rather than wzr, saving an instruction.

This would probably be cleaner in AArch64PostLegalizerLowering, but we'd need
to change the way we represent G_ICMP to do that, I think. For now, it's
easiest to implement in selection.

This is a 0.1% code size improvement on CTMark/pairlocalalign at -Os.

Example: https://godbolt.org/z/7KdrP8

Differential Revision: https://reviews.llvm.org/D96388
2021-02-10 13:28:01 -08:00
Sam McCall
bda5e57742 [clangd] Remove redundant -fno-delayed-template-parsing in tests. NFCI
We now (since a while) turn this off centrally in ParsedAST and CodeComplete.
2021-02-10 22:20:23 +01:00
Sam McCall
4dc8365f80 [clangd] Remove support for pre-standard semanticHighlighting notification
This is obsoleted by the standard semanticTokens request family.
As well as the protocol details, this allows us to remove a bunch of plumbing
around pushing highlights to clients.

This should not land until the new protocol has feature parity, see D77702.

Differential Revision: https://reviews.llvm.org/D95576
2021-02-10 22:09:03 +01:00
Jacques Pienaar
5e77ea04f2 Make gCrashRecoveryEnabled thread local
If context is enabled/disabled and queried concurrently then this
results in a data race/TSAN failure with RunSafely (where boolean
variable was not locked).

There doesn't seem to be a reasonable way to enable threads that enable
and disable recovery in parallel (without also keeping
gCrashRecoveryEnabled's lock held during Fn execution which seems
undesirable). This makes enable checking if enabled thread local and
consistent with other thread local usage of crash context here.

Differential Revision: https://reviews.llvm.org/D93907
2021-02-10 12:44:18 -08:00
Hongtao Yu
1cb47a063e [CSSPGO] Unblock optimizations with pseudo probe instrumentation.
The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking are intentional (such as blocking code merge) for better counts quality while the others are accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include:

1. IR InstCombine, sinking load operation to shorten lifetimes.
2. MIR LiveRangeShrink, similar to #1
3. MIR TwoAddressInstructionPass, i.e, opeq transform
4. MIR function argument copy elision
5. IR stack protection. (though not perf-critical but nice to have).

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D95982
2021-02-10 12:43:17 -08:00
Ilya Tokar
c81d52997a [libc++] Use builtins in more math.h functions.
Not using builtins doesn't always imply worse code,
but for e. g. isinf, this is 30%+ faster.

Before:
name        time/op
BM_isinf     2.14ns ± 2%

After:
name        time/op
BM_isinf     1.33ns ± 2%

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D88854
2021-02-10 15:17:42 -05:00
Adrian Prantl
19fc8eede4 Add missing nullptr check.
salvageDebugInfoImpl() may fail and return a nullptr.
2021-02-10 12:15:24 -08:00
Philip Reames
9bf3cfa77b [SCEV] Add a missing AssumptionCache parameter
The AssumptionCache mechanism is used to feed assumes into known bits computations.  Most places in SCEV passed it in, but one place appears to have been missed.

Spotted via inspection, don't have a test case which actually exercises this, but it seemed like an obvious fixit.
2021-02-10 12:08:55 -08:00
Sanjay Patel
6e2053983e [InstCombine] fold lshr(mul X, SplatC), C2
This is a special-case multiply that replicates bits of
the source operand. We need this fold to avoid regression
if we make canonicalization to `mul` more aggressive for
shl+or patterns.

I did not see a way to make Alive generalize the bit width
condition for even-number-of-bits only, but an example of
the proof is:
  Name: i32
  Pre: isPowerOf2(C1 - 1) && log2(C1) == C2 && (C2 * 2 == width(C2))
  %m = mul nuw i32 %x, C1
  %t = lshr i32 %m, C2
  =>
  %t = and i32 %x, C1 - 2

  Name: i14
  %m = mul nuw i14 %x, 129
  %t = lshr i14 %m, 7
  =>
  %t = and i14 %x, 127

https://rise4fun.com/Alive/e52
2021-02-10 15:02:31 -05:00
Sanjay Patel
6bcc1fd461 [InstCombine] add tests for lshr with mul; NFC 2021-02-10 15:02:31 -05:00
Mehdi Amini
81987396ac Fix StridedMemRefType operator[] SFINAE to allow correctly selecting the int64_t overload for non-container operands 2021-02-10 20:02:11 +00:00
Pavel Labath
d77b04e4ed [lldb/test] Move and improve TestPlatformProcessConnect.py
Although it is located under tools/lldb-server, this test is very
different that other lldb-server tests. The most important distinction
is that it does not test lldb-server directly, but rather interacts with
it through the lldb client. It also tests the relevant client
functionality (the platform connect command, which is even admitted in
the test name). The fact that this test is structured as a lldb-server
test means it cannot access most of the goodies available to the
"normal" lldb tests (the runCmd function, which it reimplements; the
run_break_set_by_symbol utility function; etc.).

This patch makes it a full-fledged lldb this, and rewrites the relevant
bits to make use of the standard features. I also move the test into the
"commands" subtree to better reflect its new status.
2021-02-10 21:01:26 +01:00
Nawrin Sultana
4692bb4a8a [OpenMP] Add lower and upper bound in num_teams clause
This patch adds lower-bound and upper-bound to num_teams clause
according to OpenMP 5.1 specification. The initial number of teams
created is implementation defined, but it will be greater than or
equal to lower-bound and less than or equal to upper-bound. If
num_teams clause is not specified, the number of teams created is
implementation defined, but it will be greater or equal to 1.

Differential Revision: https://reviews.llvm.org/D95820
2021-02-10 13:58:50 -06:00
Jing Pu
544cebd619 Change type constraint of the "index" in "shape.split_at" to Shape_SizeOrIndexType
Make the type contraint consistent with other shape dialect operations.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D96377
2021-02-10 11:58:19 -08:00
Aart Bik
0b1764a3d7 [mlir][sparse] sparse tensor storage implementation
This revision connects the generated sparse code with an actual
sparse storage scheme, which can be initialized from a test file.
Lacking a first-class citizen SparseTensor type (with buffer),
the storage is hidden behind an opaque pointer with some "glue"
to bring the pointer back to tensor land. Rather than generating
sparse setup code for each different annotated tensor (viz. the
"pack" methods in TACO), a single "one-size-fits-all" implementation
has been added to the runtime support library.  Many details and
abstractions need to be refined in the future, but this revision
allows full end-to-end integration testing and performance
benchmarking (with on one end, an annotated Lingalg
op and, on the other end, a JIT/AOT executable).

Reviewed By: nicolasvasilache, bixia

Differential Revision: https://reviews.llvm.org/D95847
2021-02-10 11:57:24 -08:00
Christopher Di Bella
17db24a7a8 [libcxx] adds concepts std::invocable and std::regular_invocable
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Differential Revision: https://reviews.llvm.org/D96235
2021-02-10 19:35:53 +00:00
Christopher Di Bella
c63de225fd [libcxx] adds concept std::derived_from
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D74292
2021-02-10 19:30:08 +00:00
Michael Kruse
d50f92a4f0 [Polly] Added dedicated test for working -O3 pipeline.
Test the NewPM as well as the legacy PM.
2021-02-10 13:25:56 -06:00
Michael Kruse
11511ee343 [Polly] Do not use -O3 pipeline for single pass test. 2021-02-10 13:25:56 -06:00
Jameson Nash
a7db680183 Renovate CMake files in the llvm-exegesis tool.
This attempts to move all tools over to using `add_llvm_library` for
better consistency. After doing this, I noticed it ended up as nearly a
reimplementation of https://reviews.llvm.org/rL342148, which later got
reverted in r342336 (b09a8c9bd9).

With ccache and ninja on a large core machine (40), I haven't run into
build errors, so I'm hopeful it's better now, though it doesn't seem to
be any different / new.

Reviewed By: stephenneuendorffer

Differential Revision: https://reviews.llvm.org/D90970
2021-02-10 14:22:55 -05:00
Arthur Eubanks
5d960cba34 [opt][NewPM] Add a --print-passes flag to print all available passes
It seems nicer to list passes given a flag rather than displaying all
passes in opt --help.

This is awkwardly structured because a PassBuilder is required, but
reusing the PassBuilder in runPassPipeline() doesn't work because we
read the input IR before getting to runPassPipeline(). So printing the
list of passes needs to happen before reading the input IR. If we remove
the legacy PM code in main() and move everything from NewPMDriver.cpp
into opt.cpp, we can create the PassBuilder before reading IR and check
if we should print the list of passes and exit. But until then this hack
seems fine.

Compared to the legacy PM, the new PM passes are lacking descriptions.
We'll need to figure out a way to add descriptions if we think this is
important.

Also, this only works for passes specified in PassRegistry.def. If we
want to print other custom registered passes, we'll need a different
mechanism.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D96101
2021-02-10 11:22:12 -08:00
Craig Topper
fc4d780eaf [RISCV] Remove superfluous semicolon. NFC 2021-02-10 11:20:29 -08:00
Christopher Di Bella
bee7b07f23 [libcxx] adds [concepts.arithmetic]
Implements parts of:
    * P0898R3 Standard Library Concepts
    * P1754 Rename concepts to standard_case for C++20, while we still can

Differential Revision: https://reviews.llvm.org/D88131
2021-02-10 19:04:57 +00:00
Nick Desaulniers
68945a8686 [Thumb2] support movs pc, lr alias for subs pc, lr, #0/eret
This is used by the Linux kernel built with CONFIG_THUMB2_KERNEL.

Because different operands are not permitted to `movs`, the diagnostics now provide multiple suggestions along the lines of using a non-pc destination operand or lr source operand.

Forked from D95586.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D96304
2021-02-10 11:00:42 -08:00
Mehdi Amini
9680ea5c98 Add convenience C++ helper to manipulate ranked strided memref
Reland 11f32a41c2 that was reverted in e49967fbd9 after fixing the build.

Differential Revision: https://reviews.llvm.org/D96192
2021-02-10 18:58:05 +00:00
Arthur Eubanks
c2c977ce50 Specify that some flags are legacy PM-specific
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D96100
2021-02-10 10:53:04 -08:00
Louis Dionne
183b75f667 [libc++] Remove c++98 Lit features in the test suite
We don't populate a Lit feature named c++98 since 31cbe0f240.
2021-02-10 13:32:41 -05:00
Erik Pilkington
1e8afba6f1 [clang] Add support for attribute 'swift_async_error'
This attribute specifies how an error is represented for a swift async method.
rdar://71941280

Differential revision: https://reviews.llvm.org/D96175
2021-02-10 13:18:13 -05:00
Craig Topper
cb161b3a88 [RISCV] Add support for matching .vf forms of fadd/fsub/fmul/fdiv/fma for fixed vectors.
fma+neg will come in a different patch since I haven't done it for .vv
yet either.

Differential Revision: https://reviews.llvm.org/D96375
2021-02-10 10:16:27 -08:00
Tom Stellard
997f6b6f8e [CMake] Remove some dead code in llvm_install_library_symlink()
Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D95666
2021-02-10 10:13:04 -08:00
Mehdi Amini
e49967fbd9 Revert "Add convenience C++ helper to manipulate ranked strided memref"
This reverts commit 11f32a41c2.

The build is broken because this commit conflits with the refactoring of
the DialectRegistry APIs in the context. It'll reland shortly after
fixing the API usage.
2021-02-10 18:09:38 +00:00
Craig Topper
0c254b4a69 [RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles.
The test cases extract a fixed element from a vector and splat it
into a vector. This gets DAG combined into a splat shuffle.

I've used some very wide vectors in the test to make sure we have
at least a couple tests where the element doesn't fit into the
uimm5 immediate of vrgather.vi so we fall back to vrgather.vx.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96186
2021-02-10 10:01:56 -08:00
Christopher Di Bella
2193e8be3e [libcxx] adds concept std::copy_constructible
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Depends on D96230

Differential Revision: https://reviews.llvm.org/D96232
2021-02-10 17:56:42 +00:00
Fangrui Song
04a2e12612 DebugInfo/Symbolize: Retrieve filename from the preceding STT_FILE for .symtab symbolization
The ELF spec says:

> STT_FILE: Conventionally, the symbol's name gives the name of the source file associated with the object file. A file symbol has STB_LOCAL binding, its section index is SHN_ABS, and it precedes the other STB_LOCAL symbols for the file, if it is present.

For a local symbol, the preceding STT_FILE symbol is almost always in the same
file[1]. GNU addr2line uses this heuristic to retrieve the filename associated
with a local symbol (e.g. internal linkage functions in C/C++).

GNU addr2line can assign STT_FILE filename to a non-local symbol, too, but the trick
only works if no regular symbol precede STT_FILE. This patch does not implement this corner case
(not useful for most executables which have more than one files).

In case of filename mismatch between .debug_line & .symtab, arbitrarily make .debug_line win.

[1]: LLD does not synthesize STT_FILE symbols
(https://bugs.llvm.org/show_bug.cgi?id=48023 see also
https://sourceware.org/bugzilla/show_bug.cgi?id=26822).  An assembly file
without `.file` directives can cause mis-attribution. This is an edge case.

Differential Revision: https://reviews.llvm.org/D95927
2021-02-10 09:47:10 -08:00
Fangrui Song
4f30a3d3d2 [llvm-cfi-verify] Set UseSymbolTable to false
parseSectionContents expects to skip regions not described by DWARF.  With my
pending DebugInfo/Symbolize change, the filename can be recovered and there
will be more IndirectInstructions entries.
2021-02-10 09:44:13 -08:00
Mehdi Amini
11f32a41c2 Add convenience C++ helper to manipulate ranked strided memref
Differential Revision: https://reviews.llvm.org/D96192
2021-02-10 17:40:36 +00:00
Christopher Di Bella
2b2f36a8b1 [libcxx] adds concept std::move_constructible
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Depends on D77961

Differential Revision: https://reviews.llvm.org/D96230
2021-02-10 17:33:35 +00:00
Fangrui Song
564788ddce [Polly] Fix -Wunused-lambda-capture 2021-02-10 09:19:05 -08:00
Fangrui Song
89e257bd62 [Polly] Fix -DPOLLY_ENABLE_GPGPU_CODEGEN=off build after 222d380d2f 2021-02-10 09:17:13 -08:00
Mitch Phillips
b93786907c [GWP-ASan] Add back some headers removed by IWYU.
These headers are required for Android.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D96374
2021-02-10 09:13:45 -08:00
Nicolas Vasilache
0ac3d97bf4 [mlir][Linalg] Fix pad hoisting.
This revision fixes the indexing logic into the packed tensor that result from hoisting padding. Previously, the index was incorrectly set to the loop induction variable when in fact we need to compute the iteration count (i.e. `(iv - lb).ceilDiv(step)`).

Differential Revision: https://reviews.llvm.org/D96417
2021-02-10 16:49:38 +00:00
Tom Weaver
b86a763afb Revert "Revert "[clang][driver] Only warn once about invalid library values""
This reverts commit a743702a1f.

Test was fixed in c6a1b16db7
2021-02-10 16:40:07 +00:00
Tom Weaver
a743702a1f Revert "[clang][driver] Only warn once about invalid library values"
This reverts commit a6439b5208.

Caused buildbot failure http://lab.llvm.org:8014/#/builders/125/builds/125
2021-02-10 16:37:34 +00:00
Colin Finck
ebfadd82cb [libc++] Fix copy-paste mistake in __threading_support
Differential Revision: https://reviews.llvm.org/D96115
2021-02-10 11:00:32 -05:00
Jeremy Morse
1d68e0a075 Reland [DWARF] Location-less inlined variables should not have DW_TAG_variable
Originally landed in ddc2f1e3fb and reverted in d32deaab4d because of
a Generic test objecting. That was fixed up in 013613964f. Original
landing commit message follows:

[DWARF] Location-less inlined variables should not have DW_TAG_variable

Discussed in this thread:

  https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html

DwarfDebug::collectEntityInfo accidentally distinguishes between variable
locations that never have a location specified, and variable locations that
have an empty location specified. The latter leads to the creation of an
empty variable referring to the abstract origin.

Fix this by seeking a non-empty location before producing a concrete
entity, to guarantee a DW_AT_location will be produced. Other loops in
collectEntityInfo and endFunctionImpl take care of examining the
retainedNodes collection and ensuring optimised-out variables are created.

Differential Revision: https://reviews.llvm.org/D95617
2021-02-10 15:40:47 +00:00
Paul Robinson
5ea2d4fa48 Avoid conflicts between debug-info and pseudo-probe profiling
After D93264, using both -fdebug-info-for-profiling and
-fpseudo-probe-for-profiling will cause the compiler to crash.
Diagnose these conflicting options in the driver.

Also, the existing CodeGen test was using the driver when it should be
running cc1.

Differential Revision: https://reviews.llvm.org/D96354
2021-02-10 07:09:18 -08:00
Jay Foad
b5f3383152 [AMDGPU] Add another test case for combining DS reads 2021-02-10 14:59:49 +00:00
Jay Foad
2114b458b0 [AMDGPU] Fix comments in SILoadStoreOptimizer::offsetsCanBeCombined 2021-02-10 14:49:33 +00:00
Nico Weber
c6a1b16db7 clang: try to fix Driver/undefined-libs.cpp on non-linux 2021-02-10 09:45:04 -05:00