Commit Graph

25424 Commits

Author SHA1 Message Date
Sanjay Patel
597d03ba90 [SelectionDAG] fold constant with undef vector per element
This makes the SDAG behavior consistent with the way we do this in IR.
It's possible that we were getting the wrong answer before. For example,
'xor undef, undef --> 0' but 'xor undef, C' --> undef. 

But the most practical improvement is likely as shown in the tests here - 
for FP, we were overconstraining undef lanes to NaN, and that can prevent 
vector simplifications/narrowing (see D51553).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348090 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-02 13:48:42 +00:00
Sanjay Patel
9b0a5d5469 [DAGCombiner] guard against an oversized shift crash
This change prevents the crash noted in the post-commit comments 
for rL347478 :
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181119/605166.html

We can't guarantee that an oversized shift amount is folded away, 
so we have to check for it.

Note that I committed an incomplete fix for that crash with:
rL347502

But as discussed here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181126/605679.html
...we have to try harder.

So I'm not sure how to expose the bug now (and apparently no fuzzers have found 
a way yet either).

On the plus side, we have discovered that we're missing real optimizations by 
not simplifying nodes sooner, so the earlier fix still has value, and there's 
likely more value in extending that so we can simplify more opcodes and simplify 
when doing RAUW and/or putting nodes on the combiner worklist.

Differential Revision: https://reviews.llvm.org/D54954


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348089 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-02 13:33:56 +00:00
Simon Pilgrim
b0510431f2 [SelectionDAG] Improve SimplifyDemandedBits to SimplifyDemandedVectorElts simplification
D52935 introduced the ability for SimplifyDemandedBits to call SimplifyDemandedVectorElts through BITCASTs if the demanded bit mask entirely covered the sub element.

This patch relaxes this to demanding an element if we need any bit from it.

Differential Revision: https://reviews.llvm.org/D54761

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348073 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-01 12:08:55 +00:00
Nicolai Haehnle
37b386de21 AMDGPU: Fix various issues around the VirtReg2Value mapping
Summary:
The VirtReg2Value mapping is crucial for getting consistently
reliable divergence information into the SelectionDAG. This
patch fixes a bunch of issues that lead to incorrect divergence
info and introduces tight assertions to ensure we don't regress:

1. VirtReg2Value is generated lazily; there were some cases where
   a lookup was performed before all relevant virtual registers were
   created, leading to an out-of-sync mapping. Those cases were:

  - Complex code to lower formal arguments that generated CopyFromReg
    nodes from live-in registers (fixed by never querying the mapping
    for live-in registers).

  - Code that generates CopyToReg for formal arguments that are used
    outside the entry basic block (fixed by never querying the
    mapping for Register nodes, which don't need the divergence info
    anyway).

2. For complex values that are lowered to a sequence of registers,
   all registers must be reflected in the VirtReg2Value mapping.

I am not adding any new tests, since I'm not actually aware of any
bugs that these problems are causing with trunk as-is. However,
I recently added a test case (in r346423) which fails when D53283 is
applied without this change. Also, the new assertions should provide
most of the effective test coverage.

There is one test change in sdwa-peephole.ll. The underlying issue
is that since the divergence info is now correct, the DAGISel will
select V_OR_B32 directly instead of S_OR_B32. This leads to an extra
COPY which affects the behavior of MachineLICM in a way that ends up
with the S_MOV_B32 with the constant in a different basic block than
the V_OR_B32, which is presumably what defeats the peephole.

Reviewers: alex-t, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D54340

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348049 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 22:55:29 +00:00
Sanjay Patel
8e27bd1b3b [SelectionDAG] fold FP binops with 2 undef operands to undef
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348016 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 18:38:52 +00:00
Yonghong Song
f170fd931e Revert "[BTF] Add BTF DebugInfo"
This reverts commit 9c6b970db8.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348004 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 16:54:43 +00:00
Yonghong Song
9c6b970db8 [BTF] Add BTF DebugInfo
This patch adds BPF Debug Format (BTF) as a standalone
LLVM debuginfo. The BTF related sections are directly
generated from IR. The BTF debuginfo is generated
only when the compilation target is BPF.

What is BTF?
============

First, the BPF is a linux kernel virtual machine
and widely used for tracing, networking and security.
  https://www.kernel.org/doc/Documentation/networking/filter.txt
  https://cilium.readthedocs.io/en/v1.2/bpf/

BTF is the debug info format for BPF, introduced in the below
linux patch
  69b693f0ae (diff-06fb1c8825f653d7e539058b72c83332)
in the patch set mentioned in the below lwn article.
  https://lwn.net/Articles/752047/

The BTF format is specified in the above github commit.
In summary, its layout looks like
  struct btf_header
  type subsection (a list of types)
  string subsection (a list of strings)

With such information, the kernel and the user space is able to
pretty print a particular bpf map key/value. One possible example below:
  Withtout BTF:
    key: [ 0x01, 0x01, 0x00, 0x00 ]
  With BTF:
    key: struct t { a : 1; b : 1; c : 0}
  where struct is defined as
    struct t { char a; char b; short c; };

How BTF is generated?
=====================

Currently, the BTF is generated through pahole.
  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=68645f7facc2eb69d0aeb2dd7d2f0cac0feb4d69
and available in pahole v1.12
  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=4a21c5c8db0fcd2a279d067ecfb731596de822d4

Basically, the bpf program needs to be compiled with -g with
dwarf sections generated. The pahole is enhanced such that
a .BTF section can be generated based on dwarf. This format
of the .BTF section matches the format expected by
the kernel, so a bpf loader can just take the .BTF section
and load it into the kernel.
  8a138aed4a

The .BTF section layout is also specified in this patch:
with file include/llvm/BinaryFormat/BTF.h.

What use cases this patch tries to address?
===========================================

Currently, only the bpf instruction stream is required to
pass to the kernel. The kernel verifies it, jits it if configured
to do so, attaches it to a particular kernel attachment point,
and later executes when a particular event happens.

This patch tries to expand BTF to support two more use cases below:
  (1). BPF supports subroutine calls.
       During performance analysis, it would be good to
       differentiate which call is hot instead of just
       providing a virtual address. This would require to
       pass a unique identifier for each subroutine to
       the kernel, the subroutine name is a natual choice.
  (2). If a particular jitted instruction is hot, we want
       user to know which source line this jitted instruction
       belongs to. This would require the source information
       is available to various profiling tools.

Note that in a single ELF file,
  . there may be multiple loadable bpf programs,
  . for a particular to-be-loaded bpf instruction stream,
    its instructions may come from multiple PROGBITS sections,
    the bpf loader needs to merge them together to a single
    consecutive insn stream before loading to the kernel.
For example:
  section .text: subroutines funcFoo
  section _progA: calling funcFoo
  section _progB: calling funcFoo
The bpf loader could construct two loadable bpf instruction
streams and load them into the kernel:
  . _progA funcFoo
  . _progB funcFoo
So per ELF section function offset and instruction offset
will need to be adjusted before passing to the kernel, and
the kernel essentially expect only one code section regardless
of how many in the ELF file.

What do we propose and Why?
===========================

To support the above two use cases, we propose to
add an additional section, .BTF.ext, to the ELF file
which is the input of the bpf loader. A different section
is preferred since loader may need to manipulate it before
loading part of its data to the kernel.

The .BTF.ext section has a similar header to the .BTF section
and it contains two subsections for func_info and line_info.
  . the func_info maps the func insn byte offset to a func
    type in the .BTF type subsection.
  . the line_info maps the insn byte offset to a line info.
  . both func_info and line_info subsections are organized
    by ELF PROGBITS AX sections.

pahole is not a good place to implement .BTF.ext as
pahole is mostly for structure hole information and more
importantly, we want to pass the actual code to the kernel.
  . bpf program typically is small so storage overhead
    should be small.
  . in bpf land, it is totally possible that
    an application loads the bpf program into the
    kernel and then that application quits, so
    holding debug info by the user space application
    is not practical as you may not even know who
    loads this bpf program.
  . having source codes directly kept by kernel
    would ease deployment since the original source
    code does not need ship on every hosts and
    kernel-devel package does not need to be
    deployed even if kernel headers are used.

LLVM is a good place to implement.
  . The only reliable time to get the source code is
    during compilation time. This will result in both more
    accurate information and easier deployment as
    stated in the above.
  . Another consideration is for JIT. The project like bcc
    (https://github.com/iovisor/bcc)
    use MCJIT to compile a C program into bpf insns and
    load them to the kernel. The llvm generated BTF sections
    will be readily available for such cases as well.

Design and implementation of emiting .BTF/.BTF.ext sections
===========================================================

The BTF debuginfo format is defined. Both .BTF and .BTF.ext
sections are generated directly from IR when both
"-target bpf" and "-g" are specified. Note that
dwarf sections are still generated as dwarf is used
by user space tools like llvm-objdump etc. for BPF target.

This patch also contains tests to verify generated
.BTF and .BTF.ext sections for all supported types, func_info
and line_info subsections. The patch is also tested
against linux kernel bpf sample tests and selftests.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D53736

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347999 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 16:22:59 +00:00
Than McIntosh
40a8867348 [CodeGen] Prefer static frame index for STATEPOINT liveness args
Summary:
If a given liveness arg of STATEPOINT is at a fixed frame index
(e.g. a function argument passed on stack), prefer to use this
fixed location even the address is also in a register. If we use
the register it will generate a spill, which is not necessary
since the fixed frame index can be directly recorded in the stack
map.

Patch by Cherry Zhang <cherryyz@google.com>.

Reviewers: thanm, niravd, reames

Reviewed By: reames

Subscribers: cherryyz, reames, anna, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347998 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 16:22:41 +00:00
Nicolai Haehnle
98272e49b8 TableGen/ISel: Allow PatFrag predicate code to access captured operands
Summary:
This simplifies writing predicates for pattern fragments that are
automatically re-associated or commuted.

For example, a followup patch adds patterns for fragments of the form
(add (shl $x, $y), $z) to the AMDGPU backend. Such patterns are
automatically commuted to (add $z, (shl $x, $y)), which makes it basically
impossible to refer to $x, $y, and $z generically in the PredicateCode.

With this change, the PredicateCode can refer to $x, $y, and $z simply
as `Operands[i]`.

Test confirmed that there are no changes to any of the generated files
when building all (non-experimental) targets.

Change-Id: I61c00ace7eed42c1d4edc4c5351174b56b77a79c

Reviewers: arsenm, rampitec, RKSimon, craig.topper, hfinkel, uweigand

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D51994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347992 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 14:15:13 +00:00
Alex Bradbury
365d19d957 [SelectionDAG] Support result type promotion for FLT_ROUNDS_
For targets where i32 is not a legal type (e.g. 64-bit RISC-V), 
LegalizeIntegerTypes must promote the result of ISD::FLT_ROUNDS_.

Differential Revision: https://reviews.llvm.org/D53820


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347986 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 13:18:33 +00:00
Alex Bradbury
cdc3118297 [SelectionDAG] Support promotion of PREFETCH operands
For targets where i32 is not a legal type (e.g. 64-bit RISC-V), 
LegalizeIntegerTypes must promote the operands of ISD::PREFETCH.

Differential Revision: https://reviews.llvm.org/D53281


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347980 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 10:06:31 +00:00
Alex Bradbury
810e8672a4 [SelectionDAG] Support promotion of FRAMEADDR/RETURNADDR operands
For targets where i32 is not a legal type (e.g. 64-bit RISC-V), 
LegalizeIntegerTypes must promote the operand.

Differential Revision: https://reviews.llvm.org/D53279


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347978 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 10:02:06 +00:00
Alex Bradbury
392b2c8943 [TargetLowering][RISCV] Introduce isSExtCheaperThanZExt hook and implement for RISC-V
DAGTypeLegalizer::PromoteSetCCOperands currently prefers to zero-extend 
operands when it is able to do so. For some targets this is more expensive 
than a sign-extension, which is also a valid choice. Introduce the 
isSExtCheaperThanZExt hook and use it in the new SExtOrZExtPromotedInteger 
helper. On RISC-V, we prefer sign-extension for FromTy == MVT::i32 and ToTy == 
MVT::i64, as it can be performed using a single instruction.

Differential Revision: https://reviews.llvm.org/D52978


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347977 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 09:56:54 +00:00
Hsiangkai Wang
3c67822cb8 [CodeGen] Fix bugs in BranchFolderPass when debug labels are generated.
Skip DBG_VALUE and DBG_LABEL in branch folding algorithms.

The bug is reported in
https://bugs.chromium.org/p/chromium/issues/detail?id=898160.

Differential Revision: https://reviews.llvm.org/D54199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347964 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 08:07:29 +00:00
Hsiangkai Wang
5f388804f3 [NFC] Refine doxygen format.
Differential Revision: https://reviews.llvm.org/D54568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347963 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-30 08:07:24 +00:00
Sanjay Patel
426ddc1933 [DAGCombiner] narrow truncated binops
The motivating case for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=32023
and the corresponding rot16.ll regression tests.

Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc 
sequences that don't get folded in IR.

As the TODO comments suggest, there will be regressions if we extend this (for x86, 
we mostly seem to be missing LEA opportunities, but there are likely vector folds 
missing too). I think those should be considered existing bugs because this is the 
same transform that we do as an IR canonicalization in instcombine. We just need 
more tests to make those visible independent of this patch.

Differential Revision: https://reviews.llvm.org/D54640


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347917 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 20:58:26 +00:00
Alex Bradbury
6edc257cf4 [RISCV] Implement codegen for cmpxchg on RV32IA
Utilise a similar ('late') lowering strategy to D47882. The changes to 
AtomicExpandPass allow this strategy to be utilised by other targets which 
implement shouldExpandAtomicCmpXchgInIR.

All cmpxchg are lowered as 'strong' currently and failure ordering is ignored. 
This is conservative but correct.

Differential Revision: https://reviews.llvm.org/D48131


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347914 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 20:43:42 +00:00
Francis Visoiu Mistrih
50c71c5f06 [MachineScheduler] Order FI-based memops based on stack direction
It makes more sense to order FI-based memops in descending order when
the stack goes down. This allows offsets to stay "consecutive" and allow
easier pattern matching.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347906 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 20:03:19 +00:00
Craig Topper
74e6178bb1 [SelectionDAG][AArch64][X86] Move legalization of vector MULHS/MULHU from LegalizeDAG to LegalizeVectorOps
I believe we should be legalizing these with the rest of vector binary operations. If any custom lowering is required for these nodes, this will give the DAG combine between LegalizeVectorOps and LegalizeDAG to run on the custom code before constant build_vectors are lowered in LegalizeDAG.

I've moved MULHU/MULHS handling in AArch64 from Lowering to isel. Moving the lowering earlier caused build_vector+extract_subvector simplifications to kick in which made the generated code worse.

Differential Revision: https://reviews.llvm.org/D54276

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347902 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 19:36:17 +00:00
Petr Pavlu
659d8117b2 [GlobalISel] Fix insertion of stack-protector epilogue
* Tell the StackProtector pass to generate the epilogue instrumentation
  when GlobalISel is enabled because GISel currently does not implement
  the same deferred epilogue insertion as SelectionDAG.
* Update StackProtector::InsertStackProtectors() to find a stack guard
  slot by searching for the llvm.stackprotector intrinsic when the
  prologue was not created by StackProtector itself but the pass still
  needs to generate the epilogue instrumentation. This fixes a problem
  when the pass would abort because the stack guard AllocInst pointer
  was null when generating the epilogue -- test
  CodeGen/AArch64/GlobalISel/arm64-irtranslator-stackprotect.ll.

Differential Revision: https://reviews.llvm.org/D54518


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347862 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 13:22:53 +00:00
Petr Pavlu
3834f85200 [GlobalISel] Make EnableGlobalISel always set when GISel is enabled
Change meaning of TargetOptions::EnableGlobalISel. The flag was
previously set only when a target switched on GlobalISel but it is now
always set when the GlobalISel pipeline is enabled. This makes the flag
consistent with TargetOptions::EnableFastISel and allows its use in
other parts of the compiler to determine when GlobalISel is enabled.

The EnableGlobalISel flag had previouly only one use in
TargetPassConfig::isGlobalISelAbortEnabled(). The method used its value
to determine if GlobalISel was enabled by a target and returned false in
such a case. To preserve the current behaviour, a new flag
TargetOptions::GlobalISelAbort is introduced to separately record the
abort behaviour.

Differential Revision: https://reviews.llvm.org/D54518


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347861 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 12:56:32 +00:00
Serguei Katkov
1e1f81208a [CGP] Improve compile time for complex addressing mode
This is a fix for PR39625 with improvement the compile time
by reducing the number of intermediate Phi nodes created.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54932


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347839 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-29 06:45:18 +00:00
Francis Visoiu Mistrih
9ba9c03b67 [MachineScheduler] Add support for clustering mem ops with FI base operands
Before this patch, the following stores in `merge_fail` would fail to be
merged, while they would get merged in `merge_ok`:

```
void use(unsigned long long *);
void merge_fail(unsigned key, unsigned index)
{
  unsigned long long args[8];
  args[0] = key;
  args[1] = index;
  use(args);
}
void merge_ok(unsigned long long *dst, unsigned a, unsigned b)
{
  dst[0] = a;
  dst[1] = b;
}
```

The reason is that `getMemOpBaseImmOfs` would return false for FI base
operands.

This adds support for this.

Differential Revision: https://reviews.llvm.org/D54847

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347747 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-28 12:00:28 +00:00
Francis Visoiu Mistrih
83895b33cb [CodeGen][NFC] Make TII::getMemOpBaseImmOfs return a base operand
Currently, instructions doing memory accesses through a base operand that is
not a register can not be analyzed using `TII::getMemOpBaseRegImmOfs`.

This means that functions such as `TII::shouldClusterMemOps` will bail
out on instructions using an FI as a base instead of a register.

The goal of this patch is to refactor all this to return a base
operand instead of a base register.

Then in a separate patch, I will add FI support to the mem op clustering
in the MachineScheduler.

Differential Revision: https://reviews.llvm.org/D54846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347746 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-28 12:00:20 +00:00
Simon Atanasyan
0884e6d64e [DebugInfo] Rename EmitDebugThreadLocal back to EmitDebugValue. NFC
This reverts r294500. DwarfCompileUnit::addAddressExpr uses DIEExpr
for PCOffset. In that case the expression is unrelated to thread locals
and so emitting a value of the DIEExpr does not have to always mean
emit-debug-thread-local.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347744 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-28 11:48:07 +00:00
Craig Topper
689482be85 [LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use SplitVecOp_TruncateHelper for FP_TO_SINT/UINT.
SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86 which doesn't support FP_TO_SINT until AVX512.

This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral.

Differential Revision: https://reviews.llvm.org/D54906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347593 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 21:12:39 +00:00
Craig Topper
406f960c8c [SelectionDAG] Teach BaseIndexOffset::match to unwrap the base after looking through an add/or
We might find a target specific node that needs to be unwrapped after we look through an add/or. Otherwise we get inconsistent results if one pointer is just X86WrapperRIP and the other is (add X86WrapperRIP, C)

Differential Revision: https://reviews.llvm.org/D54818

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347591 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 20:16:33 +00:00
Than McIntosh
4b607e9bf6 [CodeGen] Support custom format of stack maps
Summary:
Add a hook to the GCMetadataPrinter for emitting stack maps in
custom format. The hook will be called at stack map generation
time. The default stack map format is used if there is no hook.

For this to be useful a few data structures and accessors are
exposed from the StackMaps class, so the custom printer can
access the stack map data.

This patch authored by Cherry Zhang <cherryyz@google.com>.

Reviewers: thanm, apilipenko, reames

Reviewed By: reames

Subscribers: reames, apilipenko, nemanjai, javed.absar, kbarton, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D53892

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347584 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 18:43:48 +00:00
Erich Keane
c088edcccf Delete dead code introduced in r347354.
ParentTy is never used other than an assignment, and since it is a
pointer, there is no side effect. Some versions of GCC notice and warn
on this.

Change-Id: I37dc1a18c7b58040419afb803621de13d8904a8f

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347581 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 17:51:27 +00:00
Than McIntosh
b4a52bd514 [CodeGen] Take SPAdj into account for STATEPOINT liveness args
Summary:
STATEPOINT records its args' locations on stack relative to SP.
If the SP is changed, take that into account.

This patch authored by Cherry Zhang <cherryyz@google.com>.

Reviewers: thanm, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D53603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347569 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 16:16:09 +00:00
Aaron Ballman
6de08e4557 Remove an unnecessary file; NFC.
This source file has not been needed since r346522 and was triggering diagnostics in MSVC about an object file which exports no public symbols (LNK4221).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347565 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 15:54:36 +00:00
Diana Picus
2de9fad1ee [ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEF
We can now select CLZ via the TableGen'erated code, so support G_CTLZ
and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32.

Legalizer:
If the CLZ instruction is available, use it for both G_CTLZ and
G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and
lower G_CTLZ in terms of it.

In order to achieve this we need to add support to the LegalizerHelper
for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2).

We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF
if that is supported as a libcall, as opposed to just if it is Legal or
Custom. Due to a minor refactoring of the helper function in charge of
this, we will also allow the same behaviour for G_CTTZ and G_CTPOP.
This is not going to be a problem in practice since we don't yet have
support for treating G_CTTZ and G_CTPOP as libcalls (not even in
DAGISel).

Reg bank select:
Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point.

Instruction select:
Nothing to do.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347545 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 11:07:02 +00:00
Diana Picus
845ecdf06a Fix typo in comment. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347544 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-26 11:06:53 +00:00
Sanjay Patel
d604b8cf93 [x86] limit transform for select-of-fp-constants
This should likely be adjusted to limit this transform
further, but these diffs should be clear wins.

If we have blendv/conditional move, then we should assume 
those are cheap ops. The loads become independent of the
compare, so those can be speculated before we need to use 
the values in the blend/mov.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347526 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-25 17:27:02 +00:00
Sanjay Patel
3ca2553042 [SelectionDAG] move constant or splat functions to common location
rL347502 moved the null sibling, so we should group all of these
together. I'm not sure why these aren't methods of the SDValue
class itself, but that's another patch if that's possible.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347523 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-25 16:09:32 +00:00
Sanjay Patel
703a4c27a1 [DAG] consolidate shift simplifications
...and use them to avoid creating obviously undef values as
discussed in the post-commit thread for r347478.

The diffs in vector div/rem show that we were missing real
optimizations by creating bogus shift nodes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347502 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-23 20:05:12 +00:00
Luke Cheeseman
c8ecb42041 Revert r347490 as it breaks address sanitizer builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347499 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-23 17:13:06 +00:00
Luke Cheeseman
7ef46a70af Revert r343341
- Cannot reproduce the build failure locally and the build logs have
  been deleted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347490 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-23 11:01:47 +00:00
Craig Topper
5a2d06490c [LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading towards scalarizing the type.
This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split and enlarges the elements in the result type before doing the split. Then inserts a follow up truncate or fp_round after concatenating the two halves back together.

But if the input type of the original op is being split on its way to ultimately being scalarized we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. Seems kind of silly to enlarge the result element type of the operation only to end up with scalar code and then building a vector with large elements only to make the elements smaller again in the vector register. Seems better to just try to get away producing smaller result types in the scalarized code.

The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but its not realistic anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347482 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-23 02:32:13 +00:00
Craig Topper
091e955d20 [LegalizeVectorTypes] Have SplitVecOp_TruncateHelper fall back to SplitVecOp_UnaryOp if splitting the output type would be a legal type.
SplitVecOp_TruncateHelper tries to introduce a multilevel truncate to avoid scalarization. But if splitting the result type would still be a legal type we don't need to do that.

The comment block at the top of the function implied that this was already implemented. I looked back through the history and it doesn't look to have ever been checked.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347479 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-22 22:56:52 +00:00
Sanjay Patel
c93c84c762 [DAGCombiner] form 'not' ops ahead of shifts (PR39657)
We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'),
but that would not matter without this patch because DAGCombiner was 
reversing that transform. I think we need this transform in the backend 
regardless of what happens in IR to catch cases where the shift-xor 
is formed late from GEP or other ops.

https://rise4fun.com/Alive/NC1

  Name: shl
  Pre: (-1 << C2) == C1
  %shl = shl i8 %x, C2
  %r = xor i8 %shl, C1
  =>
  %not = xor i8 %x, -1
  %r = shl i8 %not, C2
  
  Name: shr
  Pre: (-1 u>> C2) == C1
  %sh = lshr i8 %x, C2
  %r = xor i8 %sh, C1
  =>
  %not = xor i8 %x, -1
  %r = lshr i8 %not, C2

https://bugs.llvm.org/show_bug.cgi?id=39657


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347478 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-22 19:24:10 +00:00
Reid Kleckner
a71fc88a92 [mingw] Use unmangled name after the $ in the section name
GCC does it this way, and we have to be consistent. This includes
stdcall and fastcall functions with suffixes. I confirmed that a
fastcall function named "foo" ends up in ".text$foo", not
".text$@foo@8".

Based on a patch by Andrew Yohn!

Fixes PR39218.

Differential Revision: https://reviews.llvm.org/D54762

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347431 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 22:01:10 +00:00
Sanjay Patel
39abb33b7d [DAGCombiner] refactor select-of-FP-constants transform
This transform needs to be limited. 

We are converting to a constant pool load very early, and we 
are turning loads that are independent of the select condition 
(and therefore speculatable) into a dependent non-speculatable 
load.

We may also be transferring a condition code from an FP register
to integer to create that dependent load.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347424 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 20:54:47 +00:00
Sanjay Patel
36e75104e1 [DAGCombiner] reduce code duplication; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347410 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 20:00:32 +00:00
Simon Pilgrim
d0658beb2f [TargetLowering] SimplifyDemandedBits - only reduce known bits for integer constants
Avoids fuzzing crash found by Mikael Holmén.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347393 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 14:26:19 +00:00
Nikita Popov
3912ef61b8 Test commit: Delete trailing space in comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347385 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-21 10:57:22 +00:00
Sanjay Patel
bd109e244a [DAGCombiner] look through bitcasts when trying to narrow vector binops
This is another step in vector narrowing - a follow-up to D53784
(and hoping to eventually squash potential regressions seen in
D51553).

The x86 test diffs are wins, but the AArch64 diff is probably not.
That problem already exists independent of this patch (see PR39722), but it
went unnoticed in the previous patch because there were no regression tests
that showed the possibility.

The x86 diff in i64-mem-copy.ll is close. Given the frequency throttling
concerns with using wider vector ops, an extra extract to reduce vector
width is the right trade-off at this level of codegen.

Differential Revision: https://reviews.llvm.org/D54392


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347356 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 22:26:35 +00:00
Zachary Turner
9213bb9bc8 [CodeView] Add support for ref-qualified member functions.
When you have a member function with a ref-qualifier, for example:

struct Foo {
  void Func() &;
  void Func2() &&;
};

clang-cl was not emitting this information. Doing so is a bit
awkward, because it's not a property of the LF_MFUNCTION type, which
is what you'd expect. Instead, it's a property of the this pointer
which is actually an LF_POINTER. This record has an attributes
bitmask on it, and our handling of this bitmask was all wrong. We
had some parts of the bitmask defined incorrectly, but importantly
for this bug, we didn't know about these extra 2 bits that represent
the ref qualifier at all.

Differential Revision: https://reviews.llvm.org/D54667

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347354 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 22:13:43 +00:00
Zachary Turner
da30f51534 [CodeView] Mark this pointers as const.
This is for compatibility with MSVC, which also marks this pointers
as being const-qualified.

Fixes llvm.org/pr36526

Differential Revision: https://reviews.llvm.org/D54736

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347353 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 22:13:23 +00:00
Simon Pilgrim
f1f291e60c [DAGCombine] Add calls to SimplifyDemandedVectorElts from visitINSERT_SUBVECTOR (PR37989)
This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347313 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 15:23:50 +00:00