Commit Graph

6414 Commits

Author SHA1 Message Date
Ayman Musa
eadb58fda7 [X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option.
Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional).

CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).

Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation.

Differential Revision: https://reviews.llvm.org/D32487



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303050 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-15 11:30:54 +00:00
Vivek Pandya
e3abce209b This reverts r302984
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302985 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-13 10:59:05 +00:00
Vivek Pandya
1ec28e35ff Simplify MIR Output used for Codegen Testing
- MIRYamlMapping: Default value provided for fields which have optional
mappings. Implemented == operators for required classes. When a field's value is
same as default value specified YAML IO class will not print it.

- MIRPrinter: Above mentioned behaviour is not on by default. If -simplify-mir
option not specified, then make yaml::Output to print fields with default values
too.

Differential Revision: https://reviews.llvm.org/D32304


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302984 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-13 08:55:43 +00:00
Aditya Nandakumar
f171aff2b8 [GISel]: Add a getConstantFPVRegVal utility
This might be useful across various GISel Passes

https://reviews.llvm.org/D33051

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302964 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 22:54:52 +00:00
Aditya Nandakumar
db0798140d [GISel]: Fix undefined behavior while accessing DefaultAction map
We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
Change the API to return Optional<LLT> and return None when not found.
Also update the callers to handle the None case

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302963 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 22:43:58 +00:00
Dehao Chen
0faf9ed31e Add LiveRangeShrink pass to shrink live range within BB.
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb

Reviewed By: MatzeB, andreadb

Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302938 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 19:29:27 +00:00
Chad Rosier
2d534b8ab2 [AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD.
Differential Revision: http://reviews.llvm.org/D33101.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302822 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-11 20:07:24 +00:00
David L. Jones
538282cc5e Revert "[SDAG] Relax conditions under stores of loaded values can be merged"
This reverts r302712.

The change fails with ASAN enabled:

ERROR: AddressSanitizer: use-after-poison on address ... at ...
READ of size 2 at ... thread T0
  #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42
  #1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3
  #2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17
  #3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7

Reviewers: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302746 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 23:56:21 +00:00
Nirav Dave
a7aa63a594 [SDAG] Relax conditions under stores of loaded values can be merged
Summary:

Allow consecutive stores whose values come from consecutive loads to
merged in the presense of other uses of the loads. Previously this was
disallowed as in general the merged load cannot be shared with the
other uses. Merging N stores into 1 may cause as many as N redundant
loads. However in the context of caching this should have neglible
affect on memory pressure and reduce instruction count making it
almost always a win.

Fixes PR32086.

Reviewers: spatel, jyknight, andreadb, hfinkel, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302712 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 19:53:41 +00:00
Amara Emerson
0dd30f878b Add a late IR expansion pass for the experimental reduction intrinsics.
This pass uses a new target hook to decide whether or not to expand a particular
intrinsic to the shuffevector sequence.

Differential Revision: https://reviews.llvm.org/D32245



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302631 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 09:42:49 +00:00
Ahmed Bougacha
d5c43cc5c9 [CodeGen] Don't require AA in SDAGISel at -O0.
Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.

Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.

But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.

Don't require AA at CodeGenOpt::None, and only use it otherwise.

This does have a functional impact (and one testcase is pessimized
because we can't reuse a load).  But I think that's desirable no matter
what.

Note that this alone doesn't result in less DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG.  That will be
fixed separately.

Differential Revision: https://reviews.llvm.org/D32766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302611 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 00:39:30 +00:00
Serge Pavlov
1f4a80fdc1 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302527 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-09 13:35:13 +00:00
Amara Emerson
8f1f7ce9d1 Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.

Differential Revision: https://reviews.llvm.org/D30086



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302514 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-09 10:43:25 +00:00
Dean Michael Berris
638f2cdc22 [XRay] Custom event logging intrinsic
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302405 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-08 05:45:21 +00:00
Lang Hames
98dab84b5c Fix comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302366 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-07 03:54:53 +00:00
Quentin Colombet
ce22b10a6e [RegisterBankInfo] Uniquely allocate instruction mapping.
This is a step toward having statically allocated instruciton mapping.
We are going to tablegen them eventually, so let us reflect that in
the API.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302316 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 22:48:22 +00:00
Matthias Braun
97beda0626 ARM: Compute MaxCallFrame size early
This exposes a method in MachineFrameInfo that calculates
MaxCallFrameSize and calls it after instruction selection in the ARM
target.

This avoids
ARMBaseRegisterInfo::canRealignStack()/ARMFrameLowering::hasReservedCallFrame()
giving different answers in early/late phases of codegen.

The testcase shows a particular nasty example result of that where we
would fail to properly align an alloca.

Differential Revision: https://reviews.llvm.org/D32622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302303 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 22:04:05 +00:00
Matthias Braun
0cb25a2a10 MIParser/MIRPrinter: Compute block successors if not explicitely specified
- MIParser: If the successor list is not specified successors will be
  added based on basic block operands in the block and possible
  fallthrough.

- MIRPrinter: Adds a new `simplify-mir` option, with that option set:
  Skip printing of block successor lists in cases where the
  parser is guaranteed to reconstruct it. This means we still print the
  list if some successor cannot be determined (happens for example for
  jump tables), if the successor order changes or branch probabilities
  being unequal.

Differential Revision: https://reviews.llvm.org/D31262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302289 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 21:09:30 +00:00
Craig Topper
ace8b39f82 [KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits.
This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.

Differential Revision: https://reviews.llvm.org/D32637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302262 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 17:36:09 +00:00
Quentin Colombet
197e49d664 [RegisterBankInfo] Fix 80-col introduced in r293506.
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302202 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 22:43:08 +00:00
Quentin Colombet
b39d7934a9 [GlobalISel] Add missing doxygen keyword for doxygen groups.
NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302201 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 22:43:04 +00:00
Dean Michael Berris
acf912e5ba [XRay] Create an Index of sleds per function
Summary:
This change adds a new section to the xray-instrumented binary that
stores an index into ranges of the instrumentation map, where sleds
associated with the same function can be accessed as an array. At
runtime, we can get access to this index by function ID offset allowing
for selective patching and unpatching by function ID.

Each entry in this new section (xray_fn_idx) will include two pointers
indicating the start and one past the end of the sleds associated with
the same function. These entries will be 16 bytes long on x86 and
aarch64. On arm, we align to 16 bytes anyway so the runtime has to take
that into consideration.

__{start,stop}_xray_fn_idx will be the symbols that the runtime will
look for when we implement the selective patching/unpatching by function
id APIs. Because XRay synthesizes the function id's in a monotonically
increasing manner at runtime now, implementations (and users) can use
this table to look up the sleds associated with a specific function.
This is useful in implementations that want to do things like:

  - Implement coverage mode for functions by patching everything
    pre-main, then as functions are encountered, the installed handler
    can unpatch the function that's been encountered after recording
    that it's been called.
  - Do "learning mode", so that the implementation can figure out some
    statistical information about function calls by function id for a
    time being, and then determine which functions are worth
    uninstrumenting at runtime.
  - Do "selective instrumentation" where an implementation can
    specifically instrument only certain function id's at runtime
    (either based on some external data, or through some other
    heuristics) instead of patching all the instrumented functions at
    runtime.

Reviewers: dblaikie, echristo, chandlerc, javed.absar

Subscribers: pelikan, aemerson, kpw, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D32693

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302109 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 03:37:57 +00:00
Reid Kleckner
dac7487074 Re-land r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList"
This time, I fixed, built, and tested clang.

This reverts r301712.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301981 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 22:07:37 +00:00
Simon Pilgrim
ed79276e6b [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088)
PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.

This patch adds support for extension/truncation for both integer and float types.

Differential Revision: https://reviews.llvm.org/D32391

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301910 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 10:33:08 +00:00
Matthias Braun
5927be1ab4 MachineFrameInfo: Track whether MaxCallFrameSize is computed yet; NFC
This tracks whether MaxCallFrameSize is computed yet. Ideally we would
assert and fail when the value is queried before it is computed, however
this fails various targets that need to be fixed first.

Differential Revision: https://reviews.llvm.org/D32570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301851 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 22:32:25 +00:00
Amara Emerson
195d3fa988 Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.
This removes BinaryWithFlagsSDNode, and flags are now all passed by value.

Differential Revision: https://reviews.llvm.org/D32527



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301803 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 15:17:51 +00:00
Amaury Sechet
bcb9816097 Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry
Summary: As per discution on how to get better codegen an large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allow for more flexibility.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29872

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301775 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-30 19:24:09 +00:00
Guy Blank
0a4ec8f0b2 [MVT] fix typo in size of v1i8 MVT.
Ths issue was found in the review of another patch https://reviews.llvm.org/D32540



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301770 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-30 12:47:57 +00:00
Daniel Sanders
f31ac9d1e8 [globalisel][tablegen] Compute available feature bits correctly.
Summary:
Predicate<> now has a field to indicate how often it must be recomputed.
Currently, there are two frequencies, per-module (RecomputePerFunction==0)
and per-function (RecomputePerFunction==1). Per-function predicates are
currently recomputed more frequently than necessary since the only predicate
in this category is cheap to test. Per-module predicates are now computed in
getSubtargetImpl() while per-function predicates are computed in selectImpl().

Tablegen now manages the PredicateBitset internally. It should only be
necessary to add the required includes.

Also fixed a problem revealed by the test case where
constrainSelectedInstRegOperands() would attempt to tie operands that
BuildMI had already tied.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32491

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301750 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-29 17:30:09 +00:00
Hans Wennborg
dfc1ffb1c9 Revert r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList"
This broke the Clang build. (Clang-side patch missing?)

Original commit message:

> [IR] Make add/remove Attributes use AttrBuilder instead of
> AttributeList
>
> This change cleans up call sites and avoids creating temporary
> AttributeList objects.
>
> NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301712 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 23:01:32 +00:00
Reid Kleckner
fde3916ada [IR] Make add/remove Attributes use AttrBuilder instead of AttributeList
This change cleans up call sites and avoids creating temporary
AttributeList objects.

NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301697 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 21:48:28 +00:00
Jun Bum Lim
98e89d3bb4 [InlineCost] Improve the cost heuristic for Switch
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp             w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp             w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp             w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add             w8, w19, w8
        cmp             w8, #9          // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and             w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add             w8, w19, w8
        cmp             w8, #2          // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp             w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
  -inline-generic-switch-cost=false

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301649 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 16:04:03 +00:00
Craig Topper
8b430f87e6 [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301620 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 05:31:46 +00:00
Matthias Braun
3e8c96253a MachineFrameInfo.h: Remove unnecessary forward declarations; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301496 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-26 23:37:04 +00:00
Simon Pilgrim
ea5eba046b [SelectionDAG] Added getBuildVector(ArrayRef<SDUse>) helper.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301322 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-25 16:41:28 +00:00
Daniel Sanders
cc3830e7da [globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.

In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.

The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
	InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
	return OptionalComplexRendererFn(
	       [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.

As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.

Depends on D31418

Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31761

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301079 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-22 15:11:04 +00:00
David Blaikie
36e99fa5b0 Avoid using relocations for ref_addr in .dwo files
In dwo files the fixed offset can be used - if the dwos are linked into
a dwp, the dwo consumer must use the dwp tables to find out where the
original range of the debug_info was and resolve the "section relative"
value relative to that original range - effectively
avoiding/reimplementing the relocation handling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301072 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-22 07:53:44 +00:00
David Blaikie
a753d9a103 Remove the unnecessary virtual dtor from the DIEUnit hierarchy (in favor of protected dtor in the base, final derived classes with public non-virtual dtors)
These objects are never polymorphically owned/destroyed, so the virtual
dtor was unnecessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301068 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-22 02:18:00 +00:00
Daniel Sanders
e8660ea63d [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).

Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.

Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab

Reviewed By: rovka

Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D31418

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300993 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 15:59:56 +00:00
Daniel Sanders
9eb6db17b6 Revert r300964 + r300970 - [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
It's causing llvm-clang-x86_64-expensive-checks-win to fail to compile and I
haven't worked out why. Reverting to make it green while I figure it out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300978 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 14:09:20 +00:00
Daniel Sanders
d3ed5b78e5 [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).

Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.

Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab

Reviewed By: rovka

Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D31418

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300964 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 10:27:20 +00:00
Benjamin Kramer
175caa6d02 Remove stray ^S. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300880 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-20 20:03:36 +00:00
Amara Emerson
0f69ba8243 [MVT][SVE] Scalable vector MVTs (3/3)
Adds MVT::ElementCount to represent the length of a
vector which may be scalable, then adds helper functions
that work with it.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32019



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300842 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-20 13:54:09 +00:00
Amara Emerson
0e3700625d [MVT][SVE] Scalable vector MVTs (2/3)
Adds scalable vector machine value types, and updates
the switch statements required for tablegen.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32018



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300840 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-20 13:36:58 +00:00
Amara Emerson
780f89d961 [MVT][SVE] Scalable vector MVTs (1/3)
This patch adds a few helper functions to obtain new vector
value types based on existing ones without needing to care
about whether they are scalable or not.

I've confined their use to a few common locations right now,
and targets that don't have scalable vectors should never
need to care about these.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32017



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300838 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-20 13:08:17 +00:00
Aditya Nandakumar
4925efae1f [GISEL]: Move getConstantVReg to Utils
NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-19 20:48:50 +00:00
Adrian Prantl
b560ea777b PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
  - describe a variable in a register
  - consist of only a DW_OP_reg
2. Memory location descriptions
  - describe the address of a variable
3. Implicit location descriptions
  - describe the value of a variable
  - end with DW_OP_stack_value & friends

The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident.  This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.

This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.

There are two major changes in this patch:

I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.

When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
  DBG_VALUE, [RBP-8]                        --> DW_OP_fbreg -8
  DBG_VALUE, RAX                            --> DW_OP_reg0 +0
  DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0

All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.

https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>

Differential Revision: https://reviews.llvm.org/D31439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300522 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-18 01:21:53 +00:00
Reid Kleckner
1f8f049069 [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300367 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-14 20:19:02 +00:00
Andrew V. Tischenko
3796561c6e This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs.
The details are here: https://reviews.llvm.org/D30941


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300311 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-14 07:44:23 +00:00
Jonas Paulsson
c33bdfa7b1 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300052 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 11:49:08 +00:00