Commit Graph

381645 Commits

Author SHA1 Message Date
James Henderson
49c91a64fd [llvm-objcopy][test] Improve many-sections object and test case
Additionally do some test tidy-ups and improve coverage of symbol
section indexes where the logical section index >= SHN_LORESERVE.

The symbol and section names in the many-section input object were
mostly shared. This patch changes them to be distinct, enabling
different operations such as --add-symbol, to be more targeted, when
using the object. It also makes the test less confusing and removes some
oddness in the symbol table order, presumably caused by the duplicate
names.

The input object was built from assembly that was of the form:
.section s1
sym1:
.section s2
sym2:
...
with a total of 65536 such occurrences. llvm-objcopy was then used to
remove the empty .text section automatically generated by MC, and
incidentally to move .strtab to the end of the object. This ensured that
the section/symbol indexes matched their name (i.e. section index 1 was
s1, section index 2 was s2 etc, and sym1 was in s1, sym2 in s2 etc).

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D97660
2021-03-04 09:42:43 +00:00
Fraser Cormack
8e7ceffd0b [RISCV] Fix crash when inserting large fixed-length subvectors
This patch addresses a compiler crash resulting from passing a
fixed-length type to one that expects scalable vector types. An
assertion was added to prevent this regressing in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97868
2021-03-04 09:27:16 +00:00
Fraser Cormack
d8e1d2ebf4 [RISCV] Preserve fixed-length VL on insert_vector_elt in more cases
This patch fixes up one case where the fixed-length-vector VL was
dropped (falling back to VLMAX) when inserting vector elements, as the
code would lower via ISD::INSERT_VECTOR_ELT (at index 0) which loses the
fixed-length vector information.

To this end, a custom node, VMV_S_XF_VL, was introduced to carry the VL
operand through to the final instruction. This node wraps the RVV
vmv.s.x and vmv.s.f instructions, which were being selected by
insert_vector_elt anyway.

There should be no observable difference in scalable-vector codegen.

There is still one outstanding drop from fixed-length VL to VLMAX, when
an i64 element is inserted into a vector on RV32; the splat (which is
custom legalized) has no notion of the original fixed-length vector
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97842
2021-03-04 09:21:10 +00:00
Martin Storsjö
1bdb636661 [ARM] Fix linking of the new unittest from a968e7b82e 2021-03-04 11:04:17 +02:00
Petr Hosek
46a3f4ae27 Revert "[XRay][x86_64] Fix CFI directives in assembly trampolines"
This reverts commit 9ee61cf3f6 since
it's failing to compile on Darwin.
2021-03-04 01:03:04 -08:00
David Green
a968e7b82e [ARM] KnownBits for CSINC/CSNEG/CSINV
This adds some simple known bits handling for the three CSINC/NEG/INV
instructions. From the operands known bits we can compute the common
bits of the first operand and incremented/negated/inverted second
operand. The first, especially CSINC ZR, ZR, comes up fair amount in the
tests. The others are more rare so a unit test for them is added.

Differential Revision: https://reviews.llvm.org/D97788
2021-03-04 08:40:20 +00:00
Andy Wingo
e638d8b2bc [lld][WebAssembly] -Bsymbolic creates indirect function table if needed
It can be that while processing relocations, we realize that in the end
we need an indirect function table.  Ensure that one is present, in that
case, to avoid writing invalid object files.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49397.

Differential Revision: https://reviews.llvm.org/D97843
2021-03-04 09:28:21 +01:00
Max Kazantsev
d9e93e8e57 [X86][CodeGenPrepare] Try to reuse IV's incremented value instead of adding the offset, part 1
While optimizing the memory instruction, we sometimes need to add
offset to the value of `IV`. We could avoid doing so if the `IV.next` is
already defined at the point of interest. In this case, we may get two
possible advantages from this:

- If the `IV` step happens to match with the offset, we don't need to add
  the offset at all;
- We reduce overlap of live ranges of `IV` and `IV.next`. They may stop overlapping
  and it will lead to better register allocation. Even if the overlap will preserve,
  we are not introducing a new overlap, so it should be a neutral transform (Disabled
  this patch, will come with follow-up).

Currently I've only added support for IVs that get decremented using `usub`
intrinsic. We could also support `AddInstr`, however there is some weird
interaction with some other transform that may lead to infinite compilation
in this case (seems like same transform is done and undone over and over).
I need to investigate why it happens, but generally we could do that too.

The first part only handles case where this reuse fully elimiates the offset.

Differential Revision: https://reviews.llvm.org/D96399
Reviewed By: reames
2021-03-04 15:22:55 +07:00
Juneyoung Lee
b15ce2f344 [LangRef] remove links to lifetime since use marker intro already has a link 2021-03-04 17:19:23 +09:00
Alex Zinenko
19db802e7b [mlir] make implementations of translation to LLVM IR interfaces private
There is no need for the interface implementations to be exposed, opaque
registration functions are sufficient for all users, similarly to passes.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97852
2021-03-04 09:16:32 +01:00
Juneyoung Lee
2079ea94de [LangRef] fix more undefined label errors 2021-03-04 17:09:03 +09:00
Arpith C. Jacob
4a2930f495 [mlir] Add loop codegen options to some LLVM dialect ops.
Add a Loop Option attribute and generate llvm metadata attached to
branch instructions to control code generation.

Reviewed By: ftynse, mehdi_amini

Differential Revision: https://reviews.llvm.org/D96820
2021-03-04 09:01:57 +01:00
Craig Topper
90b7825598 [LegalizeVectorTypes] Remove a tautological compare. 2021-03-03 23:26:00 -08:00
Jonas Devlieghere
3dcbfa27d4 [debugserver] Fix more compiler warnings on arm64
This fixes the following two warnings in code that's only compiled on
arm64:

 - warning: cast from 'const void *' to 'unsigned char *' drops const
   qualifier [-Wcast-qual]
 - warning: embedding a directive within macro arguments has undefined
   behavior [-Wembedded-directive]
2021-03-03 23:12:12 -08:00
Martin Storsjö
c793f68d9b [libcxx] Don't use dllimport for a static member in a template
This fixes clang warnings (that are treated as errors when running
the test suite):

libcxx/include/string:4409:59: error: definition of dllimport static field [-Werror,-Wdllimport-static-field-def]
               basic_string<_CharT, _Traits, _Allocator>::npos;

The warning is normally not visible as long as the libc++ headers
are treated as system headers.

The same construct is always an error in MSVC.

(One _LIBCPP_FUNC_VIS was added in
2d8f23f571, which broke DLL builds.
59919c4d6b fixed this by adding another
_LIBCPP_FUNC_VIS on the declaration for consistency, but the underlying
issue remained, that one can't use dllimport here.)

Differential Revision: https://reviews.llvm.org/D97168
2021-03-04 08:55:27 +02:00
Hongtao Yu
c75da238b4 [CSSPGO] Deduplicating dangling pseudo probes.
Same dangling probes are redundant since they all have the same semantic that is to rely on the counts inference tool to get reasonable count for the same original block. Therefore, there's no need to keep multiple copies of them. I've seen jump threading created tons of redundant dangling probes that slowed down the compiler dramatically. Other optimization passes can also result in redundant probes though without an observed impact so far.

This change removes block-wise redundant dangling probes specifically introduced by jump threading. To support removing redundant dangling probes caused by all other passes, a final function-wise deduplication is also added.

An 18% size win of the .pseudo_probe section was seen for SPEC2017. No performance difference was observed.

Differential Revision: https://reviews.llvm.org/D97482
2021-03-03 22:44:42 -08:00
Hongtao Yu
8985515822 [CSSPGO] Unblocking optimizations by dangling pseudo probes.
This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments.

The optimizations being unblocked are:

	1. Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted.
	2. Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions.
	3. Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks.

Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D97481
2021-03-03 22:44:42 -08:00
Hongtao Yu
ad2a59f584 [CSSPGO] Introducing dangling pseudo probes.
Dangling probes are the probes associated to an empty block. This usually happens when all real instructions are optimized away from the block. There is a problem with dangling probes during the offline counts processing. The way the sample profiler works is that samples collected on the first physical instruction following a probe will be counted towards the probe. This logically equals to treating the instruction next to a probe as if it is from the same block of the probe. In the dangling probe case, the real instruction following a dangling probe actually starts a new block, and samples collected on the new block may cause issues when counted towards the empty block.

To mitigate this issue, we first try to move around a dangling probe inside its owning block. If there are still native instructions preceding the probe in the same block, we can then use them as a place holder to collect samples for the probe. A pass is added to walk each block backwards looking for probes not followed by any real instruction and moving them before the first real instruction. This is done right before the object emission.

If we are unlucky to find such in-block preceding instructions for a probe, the solution we are taking is to tag such probe as dangling so that the samples reported for them will not be trusted by the compiler. We leave it up to the counts inference algorithm to get such probes a reasonable count. The number `UINT64_MAX` is used to mark sample count as collected for a dangling probe.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D95962
2021-03-03 22:44:41 -08:00
Christopher Di Bella
647af31e74 [libcxx] adds concept std::assignable_from
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Depends on D96660

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D96742
2021-03-03 22:41:55 -08:00
Johannes Doerfert
e04c058798 [Docs] Remove no-aa from the alias analysis documentation
The `no-aa` pass has been removed with 7b560d40bd.

Differential Revision: https://reviews.llvm.org/D95416
2021-03-04 00:35:52 -06:00
Johannes Doerfert
5b70c12f3e [Attributor] Make DepClass a required argument
We often used a sub-optimal dependence class in the past because we
didn't see the argument. Let's make it explicit so we remember to think
about it.
2021-03-04 00:35:52 -06:00
Johannes Doerfert
e592dad82e [Attributor] Fold "TrackDependence" into the DepClassTy enum
We don't need a bool and an enum to express the three options we
currently have. This makes the interface nicer and much easier to
use optional dependencies. Also avoids mistakes where the bool is
false and enum ignored.
2021-03-04 00:35:52 -06:00
Johannes Doerfert
c8c93fdf0a [Attributor] Avoid work for GEPs and wait till the users are visited 2021-03-04 00:35:52 -06:00
Johannes Doerfert
f3f88287c5 [Attributor] Use known alignment as lower bound to avoid work
If we know already more than available from a use, we don't need to
invest time on it.
2021-03-04 00:35:52 -06:00
Johannes Doerfert
c14213e030 [Attributor][NFC] Move some trivial checks up 2021-03-04 00:35:52 -06:00
Johannes Doerfert
09c3eebf5f [Attributor] Use sensible initialization in AANoCaptureCallSiteReturned 2021-03-04 00:35:51 -06:00
Kito Cheng
b46a1b129f [doc] Fix description of _Float16
According to ISO/IEC TS 18661-3:2015 _FloatN is interchange floating
point type, extended floating-point type is _FloatNx.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2342.pdf

Reviewed By: SjoerdMeijer

Differential revision: https://reviews.llvm.org/D97759
2021-03-04 14:17:54 +08:00
Siva Chandra Reddy
35e2e448ce [libc] Remove redundant header files included from internal paths. 2021-03-03 21:29:46 -08:00
Evgeniy Brevnov
e94125f054 [DSE] Add support for not aligned begin/end
This is an attempt to improve handling of partial overlaps in case of unaligned begin\end.

Existing implementation just bails out if it encounters such cases. Even when it doesn't I believe existing code checking alignment constraints is not quite correct. It tries to ensure alignment of the "later" start/end offset while should be preserving relative alignment between earlier and later start/end.

The idea behind the change is simple. When start/end is not aligned as we wish instead of bailing out let's adjust it as necessary to get desired alignment.

I'll update with performance results as measured by the test-suite...it's still running...

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D93530
2021-03-04 12:24:23 +07:00
Alan Baker
21427b8eb8 libclc: Add clspv target to libclc
Add clspv as a new target for libclc. clspv is an open-source compiler that compiles OpenCL C to Vulkan SPIR-V. Compiles for the spir target.

The clspv target differs from the the spirv target in the following ways:
* fma is modified to use uint2 instead of ulong for mantissas. This results in lower performance fma, but provides a implementation that can be used on more Vulkan devices where 64-bit integer support is less common.
* Use of a software implementation of nextafter because the generic implementation depends on nextafter being a defined builtin function for which clspv has no definition.
* Full optimization of the library (-O3) and no conversion to SPIR-V

This library is close to what would be produced by running opt -O3 < builtins.opt.spirv-mesa3d-.bc > builtins.opt.clspv--.bc and continuing the build from that point.

Reviewer: jvesely

Differential Revision: https://reviews.llvm.org/D94013
2021-03-04 00:19:10 -05:00
Siva Chandra Reddy
0106370bee [compiler-rt | interceptors] Provide an intercept override knob.
This knob is useful for downstream users who want that some of their
libc functions to not be intercepted.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D97740
2021-03-03 21:03:46 -08:00
Christopher Di Bella
f893312c1a [libcxx] adds concept std::common_with
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Depends on D96660

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D96683
2021-03-03 20:01:11 -08:00
Serguei Katkov
a0ff0f30df [InstCombine] Move statepoint intrinsic handling from visitCall to visitCallBase
statepoint intrinsic can be used in invoke context,
so it should be handled in visitCallBase to cover both call and invoke.

Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D97833
2021-03-04 11:00:22 +07:00
Wang, Pengfei
e7e67c930a Add Windows ehcont section support (/guard:ehcont).
Add option /guard:ehcont

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D96709
2021-03-04 11:47:29 +08:00
Jason Molenda
266bb78f7d LanguageRuntime for 0th frame unwind, simplify getting pc-for-symbolication
Add calls into LanguageRuntime when finding the unwind method to
use out of the 0th (currently executing) stack frame.

Allow for the LanguageRuntimes to indicate if this stack frames
should be treated like a zeroth-frame -- symbolication should be
done based on the saved pc address, not decremented like normal ABI
function calls.

Add methods to RegisterContext and StackFrame to get a pc value
suitable for symbolication, to reduce the number of places in lldb
where we decrement the saved pc values before symbolication.

<rdar://problem/70398009>
Differential Revision: https://reviews.llvm.org/D97644
2021-03-03 19:29:40 -08:00
Arthur O'Dwyer
09fa1d0e50 [libc++] Introduce __identity_t<T>. NFCI.
This is just a shorter synonym for `__identity<T>::type`.
Use it consistently throughout, where possible.

There is still some metaprogramming in <memory> and <variant>
where `__identity` is being used _without_ immediately calling
`::type` on it; but this is the unusual case, and it will become
even less usual as we start deliberately protecting certain types
against deduction (e.g. D97742).

Differential Revision: https://reviews.llvm.org/D97862
2021-03-03 22:23:14 -05:00
Christopher Di Bella
3f5438c46c [libcxx] adds concept std::common_reference_with
Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Depends on D96657

Reviewed By: ldionne, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D96660
2021-03-03 17:52:41 -08:00
Aart Bik
553cb6d473 [mlir][sparse] fix bug in reduction chain
Found with exhaustive testing, it is possible that a while loop
appears in between chainable for loops. As long as we don't
scalarize reductions in while loops, this means we need to
terminate the chain at the while. This also refactors the
reduction code into more readable helper methods.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97886
2021-03-03 17:38:22 -08:00
Juneyoung Lee
dbf41ddaa3 [LangRef] fix undefined label 2021-03-04 10:12:57 +09:00
Juneyoung Lee
c821ef4513 [LangRef] Make lifetime intrinsic's semantics consistent with StackColoring's comment
This patch is an update to LangRef by describing lifetime intrinsics' behavior
by following the description of MIR's LIFETIME_START/LIFETIME_END markers
at StackColoring.cpp (eb44682d67/llvm/lib/CodeGen/StackColoring.cpp (L163)) and the discussion in llvm-dev.

In order to explicitly define the meaning of an object lifetime, I added 'Object Lifetime' subsection.

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D94002
2021-03-04 09:58:06 +09:00
River Riddle
83ef862fad [mlir] Add support for generating Attribute classes for ODS
The support for attributes closely maps that of Types (basically 1-1) given that Attributes are defined in exactly the same way as Types. All of the current ODS TypeDef classes get an Attr equivalent. The generation of the attribute classes themselves share the same generator as types.

Differential Revision: https://reviews.llvm.org/D97589
2021-03-03 16:41:49 -08:00
Craig Topper
201ebf211f [RISCV] Make use of the required features in BuiltinInfo to store that V extension builtins require 'experimental-v'.
Use that to print the diagnostic in SemaChecking instead of
listing all of the builtins in a switch.

With the required features, IR generation will also be able
to error on this. Checking this here allows us to have a RISCV
focused error message.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D97826
2021-03-03 16:24:08 -08:00
Fangrui Song
584cb67d2d [IRSymTab] Set FB_used on llvm.compiler.used symbols
IR symbol table does not parse inline asm. A symbol only referenced by inline
asm is not in the IR symbol table, so LTO does not know that the definition (in
another translation unit) is referenced and may internalize it, even if that
definition has `__attribute__((used))` (which lowers to `llvm.compiler.used` on
ELF targets since D97446).

```
// cabac.c
__attribute__((used)) const uint8_t ff_h264_cabac_tables[...] = {...};

// h264_cabac.c
  asm("lea ff_h264_cabac_tables(%rip), %0" : ...);
```

`__attribute__((used))` is the recommended way to tell the compiler there may
be inline asm references, so the usage is perfectly fine. This patch
conservatively sets the `FB_used` bit on `llvm.compiler.used` symbols to work
around the IR symbol table limitation. Note: before D97446, Clang never emitted
symbols in the `llvm.compiler.used` list, so this change does not punish any
Clang emitted global object.

Without the patch, `ff_h264_cabac_tables` may be assigned to a non-external
partition and get internalized. Then we will get a linker error because the
`cabac.c` definition is not exposed.

Differential Revision: https://reviews.llvm.org/D97755
2021-03-03 16:22:30 -08:00
Steven Wan
0b274ed499 [AIX] Update default arch on AIX
On AIX, the default arch level should match the minimum supported arch level of the OS version.

Differential Revision: https://reviews.llvm.org/D97823
2021-03-03 19:07:43 -05:00
River Riddle
e07c968a6d [mlir][pdl][NFC] Rename InputOp to OperandOp
This better matches the actual IR concept that is being modeled, and is consistent with how the rest of PDL is structured.

Differential Revision: https://reviews.llvm.org/D95718
2021-03-03 15:48:00 -08:00
River Riddle
55f878bad9 [mlir][pdl] Add a new !pdl.range<> type
This type represents a range of positional values. It will be used in followup revisions to add support for variadic constructs to PDL, such as operand and result ranges.

Differential Revision: https://reviews.llvm.org/D95717
2021-03-03 15:48:00 -08:00
Xun Li
03f668613c [LICM][Coroutine] Don't sink stores from loops with coro.suspend instructions
See pr46990(https://bugs.llvm.org/show_bug.cgi?id=46990). LICM should not sink store instructions to loop exit blocks which cross coro.suspend intrinsics. This breaks semantic of coro.suspend intrinsic which return to caller directly. Also this leads to use-after-free if the coroutine is freed before control returns to the caller in multithread environment.

This patch disable promotion by check whether loop contains coro.suspend intrinsics.
This is a resubmit of D86190.
Disabling LICM for loops with coroutine suspension is a better option not only for correctness purpose but also for performance purpose.
In most cases LICM sinks memory operations. In the case of coroutine, sinking memory operation out of the loop does not improve performance since coroutien needs to get data from the frame anyway. In fact LICM would hurt coroutine performance since it adds more entries to the frame.

Differential Revision: https://reviews.llvm.org/D96928
2021-03-03 15:21:57 -08:00
Fangrui Song
30ad7b5dad [test] Fix profiling.ll
`__llvm_prf_nm` is compressed if zlib is available. In addition, its size may not be that stable.
2021-03-03 15:18:44 -08:00
Jin Lin
7c2192b277 Add the use of register r for outlined function when register r is live in and defined later.
The compiler needs to mark register $x0 as live in for the following case.

    $x1 = ADDXri $sp, 16, 0
    BL @spam, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit killed $x1, implicit-def $sp, implicit-def dead $x0

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D95267
2021-03-03 15:14:11 -08:00
River Riddle
64be3fcb7a Fix flang build after D97804 2021-03-03 15:07:03 -08:00