7875 Commits

Author SHA1 Message Date
Austin Kerbow
34e11f6a0a AMDGPU/GlobalISel: Legalize fast unsafe FDIV
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375460 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 22:18:26 +00:00
Quentin Colombet
c1a274e767 [GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors
Teach the CombinerHelper how to turn shuffle_vectors, that
concatenate vectors, into concat_vectors and add this combine
to the AArch64 pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D69149

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375452 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 20:39:58 +00:00
Sander de Smalen
f5e25f84fa Reverted r375425 as it broke some buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375444 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 19:11:40 +00:00
Raphael Isemann
a3e4f5daf6 [NFC] Add missing include to fix modules build
This header doesn't seem to be parsable on its own and breaks the module build therefore with
the following error:

While building module 'LLVM_Backend' imported from llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:14:
In file included from <module-includes>:62:
llvm-project/llvm/include/llvm/CodeGen/MachinePipeliner.h:91:20: error: declaration of 'AAResultsWrapperPass' must be imported from module 'LLVM_Analysis.AliasAnalysis' before it is required
    AU.addRequired<AAResultsWrapperPass>();
                   ^
llvm-project/llvm/include/llvm/Analysis/AliasAnalysis.h:1157:7: note: previous declaration is here
class AAResultsWrapperPass : public FunctionPass {
      ^
llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:14:10: fatal error: could not build module 'LLVM_Backend'
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375433 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 17:43:38 +00:00
Sander de Smalen
4548d296c1 [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2)
Commit message from D66935:

This patch fixes a bug exposed by D65653 where a subsequent invocation
of `determineCalleeSaves` ends up with a different size for the callee
save area, leading to different frame-offsets in debug information.

In the invocation by PEI, `determineCalleeSaves` tries to determine
whether it needs to spill an extra callee-saved register to get an
emergency spill slot. To do this, it calls 'estimateStackSize' and
manually adds the size of the callee-saves to this. PEI then allocates
the spill objects for the callee saves and the remaining frame layout
is calculated accordingly.

A second invocation in LiveDebugValues causes estimateStackSize to return
the size of the stack frame including the callee-saves. Given that the
size of the callee-saves is added to this, these callee-saves are counted
twice, which leads `determineCalleeSaves` to believe the stack has
become big enough to require spilling an extra callee-save as emergency
spillslot. It then updates CalleeSavedStackSize with a larger value.

Since CalleeSavedStackSize is used in the calculation of the frame
offset in getFrameIndexReference, this leads to incorrect offsets for
variables/locals when this information is recalculated after PEI.

This patch fixes the lldb unit tests in `functionalities/thread/concurrent_events/*`

Changes after D66935:

Ensures AArch64FunctionInfo::getCalleeSavedStackSize does not return
the uninitialized CalleeSavedStackSize when running `llc` on a specific
pass where the MIR code has already been expected to have gone through PEI.

Instead, getCalleeSavedStackSize (when passed the MachineFrameInfo) will try
to recalculate the CalleeSavedStackSize from the CalleeSavedInfo. In debug
mode, the compiler will assert the recalculated size equals the cached
size as calculated through a call to determineCalleeSaves.

This fixes two tests:
  test/DebugInfo/AArch64/asan-stack-vars.mir
  test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir
that otherwise fail when compiled using msan.

Reviewed By: omjavaid, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375425 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 17:12:56 +00:00
David Green
4bdbcfddaf [Types] Define a getWithNewBitWidth for Types and make use of it
This is designed to change the bitwidth of a type without altering the number
of vector lanes. Also useful in D68651. Otherwise an NFC.

Differential Revision: https://reviews.llvm.org/D69139


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375417 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 14:51:49 +00:00
Guillaume Chatelet
0e61c30e7d [Alignment][NFC] TargetCallingConv::setByValAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69248

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375410 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 12:05:33 +00:00
Guillaume Chatelet
f69d71733b [Alignment][NFC] TargetCallingConv::setOrigAlign and TargetLowering::getABIAlignmentForCallingConv
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69243

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375407 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 11:01:55 +00:00
Guillaume Chatelet
2bcdb486dd Use Align for TFL::TransientStackAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, sdardis, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375398 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 08:31:25 +00:00
Vladimir Vereschaka
eec7ef7443 Reverted r375254 as it has broken some build bots for a long time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375375 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-20 20:39:33 +00:00
Sanjay Patel
0d04cbb578 [TargetLowering][DAGCombine][MSP430] add/use hook for Shift Amount Threshold (1/2)
Provides a TLI hook to allow targets to relax the emission of shifts, thus enabling
codegen improvements on targets with no multiple shift instructions and cheap selects
or branches.

Contributes to a Fix for PR43559:
https://bugs.llvm.org/show_bug.cgi?id=43559

Patch by: @joanlluch (Joan LLuch)

Differential Revision: https://reviews.llvm.org/D69116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375347 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-19 16:57:02 +00:00
Reid Kleckner
a83ebc9d3b Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each
Now X86ISelLowering doesn't depend on many IR analyses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375320 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-19 01:31:09 +00:00
Reid Kleckner
c4185e8c3c Prune Analysis includes from SelectionDAG.h
Only forward declarations are needed here. Follow-on to r375311.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375319 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-19 01:07:48 +00:00
Reid Kleckner
ccf0b77ef9 Prune two MachineInstr.h includes, fix up deps
MachineInstr.h included AliasAnalysis.h, which includes a world of IR
constructs mostly unneeded in CodeGen. Prune it. Same for
DebugInfoMetadata.h.

Noticed with -ftime-trace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375311 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-19 00:22:07 +00:00
Quentin Colombet
8df88e8ad7 [GISel][CallLowering] Make isIncomingArgumentHandler a pure virtual method
The default implementation of isIncomingArgumentHandler could lead
to generating incorrect code.
Make it a pure virtual method, so that targets know they have to
override it to produce correct code.

NFC

Differential Revision: https://reviews.llvm.org/D69187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375277 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-18 20:13:42 +00:00
Hiroshi Yamauchi
7c4ed1ed66 [PGO][PGSO] SizeOpts changes.
Summary:
(Split of off D67120)

SizeOpts/MachineSizeOpts changes for profile guided size optimization.

Reviewers: davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375254 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-18 16:46:01 +00:00
Graham Hunter
77a2562985 [AArch64][SVE] Add SPLAT_VECTOR ISD Node
Adds a new ISD node to replicate a scalar value across all elements of
a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot
be used.

Fixes up default type legalization for scalable vectors after the
new MVT type ranges were introduced.

At present I only use this node for scalable vectors. A DAGCombine has
been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all
elements are the same, but only if the default operation action of
Expand has been overridden by the target.

I've only added result promotion legalization for scalable vector
i8/i16/i32/i64 types in AArch64 for now.

Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy

Reviewed By: jmolloy

Differential Revision: https://reviews.llvm.org/D47775

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375222 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-18 11:48:35 +00:00
James Molloy
9d008e4cc9 [DFAPacketizer] Use DFAEmitter. NFC.
Summary:
This is a NFC change that removes the NFA->DFA construction and emission logic from DFAPacketizerEmitter and instead uses the generic DFAEmitter logic. This allows DFAPacketizer to use the Automaton class from Support and remove a bunch of logic there too.

After this patch, DFAPacketizer is mostly logic for grepping Itineraries and collecting functional units, with no state machine logic. This will allow us to modernize by removing the 16-functional-unit limit and supporting non-itinerary functional units. This is all for followup patches.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375086 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-17 08:34:29 +00:00
Guillaume Chatelet
2078d8cdd7 [Alignment][NFC] Use Align for TargetFrameLowering/Subtarget
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375084 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-17 07:49:39 +00:00
Marcello Maggioni
8326c41046 Clang-formatting of some files in LiveRangeCalc header (LiveRangeCalc.h)
NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375076 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-17 03:12:58 +00:00
Marcello Maggioni
95a064c8ac Move LiveRangeCalc header to publicily available position. NFC
Differential Revision: https://reviews.llvm.org/D69078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375075 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-17 03:12:51 +00:00
Quentin Colombet
17cdb7fdd5 [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector
Teach the combiner helper how to flatten concat_vectors of build_vectors
into a build_vector.

Add this combine as part of AArch64 pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D69071

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375066 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-17 00:34:32 +00:00
Daniel Sanders
2ab944293c [gicombiner] Hoist pure C++ combine into the tablegen definition
Summary:
This is just moving the existing C++ code around and will be NFC w.r.t
AArch64. Renamed 'CombineBr' to something more descriptive
('ElideByByInvertingCond') at the same time.

The remaining combines in AArch64PreLegalizeCombiner require features that
aren't implemented at this point and will be hoisted as they are added.

Depends on D68424

Reviewers: bogner, volkan

Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375057 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-16 23:53:35 +00:00
Matt Arsenault
2097552a47 GlobalISel: Implement lower for G_SADDO/G_SSUBO
Port directly from SelectionDAG, minus the path using
ISD::SADDSAT/ISD::SSUBSAT.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375042 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-16 20:46:32 +00:00
David Stenberg
633cd24e87 [DebugInfo] Add a DW_OP_LLVM_entry_value operation
Summary:
Internally in LLVM's metadata we use DW_OP_entry_value operations with
the same semantics as DWARF; that is, its operand specifies the number
of bytes that the entry value covers.

At the time of emitting entry values we don't know the emitted size of
the DWARF expression that the entry value will cover. Currently the size
is hardcoded to 1 in DIExpression, and other values causes the verifier
to fail. As the size is 1, that effectively means that we can only have
valid entry values for registers that can be encoded in one byte, which
are the registers with DWARF numbers 0 to 31 (as they can be encoded as
single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte
DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump
will print an operation "correctly", even if the byte size is less than
that, which may make it seem that we emit correct DWARF for registers
with DWARF numbers > 31. If you instead use readelf for such cases, it
will interpret the number of specified bytes as a DWARF expression. This
seems like a limitation in llvm-dwarfdump.

As suggested in D66746, a way forward would be to add an internal
variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand
instead specifies the number of operations that the entry value covers,
and we then translate that into the byte size at the time of emission.

In this patch that internal operation is added. This patch keeps the
limitation that a entry value can only be applied to simple register
locations, but it will fix the issue with the size operand being
incorrect for DWARF numbers > 31.

Reviewers: aprantl, vsk, djtodoro, NikolaPrica

Reviewed By: aprantl

Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67492

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374881 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-15 11:31:21 +00:00
David Stenberg
3804b7fd5f [DebugInfo] Add interface for pre-calculating the size of emitted DWARF
Summary:
DWARF's DW_OP_entry_value operation has two operands; the first is a
ULEB128 operand that specifies the size of the second operand, which is
a DWARF block. This means that we need to be able to pre-calculate and
emit the size of DWARF expressions before emitting them. There is
currently no interface for doing this in DwarfExpression, so this patch
introduces that.

When implementing this I initially thought about running through
DwarfExpression's emission two times; first with a temporary buffer to
emit the expression, in order to being able to calculate the size of
that emitted data. However, DwarfExpression is a quite complex state
machine, so I decided against that, as it seemed like the two runs could
get out of sync, resulting in incorrect size operands. Therefore I have
implemented this in a way that we only have to run DwarfExpression once.
The idea is to emit DWARF to a temporary buffer, for which it is
possible to query the size. The data in the temporary buffer can then be
emitted to DwarfExpression's main output.

In the case of DIEDwarfExpression, a temporary DIE is used. The values
are all allocated using the same BumpPtrAllocator as for all other DIEs,
and the values are then transferred to the real value list. In the case
of DebugLocDwarfExpression, the temporary buffer is implemented using a
BufferByteStreamer which emits to a buffer in the DwarfExpression
object.

Reviewers: aprantl, vsk, NikolaPrica, djtodoro

Reviewed By: aprantl

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67768

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374879 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-15 11:14:35 +00:00
Zi Xuan Wu
704914973a recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not
estimate different register pressure for different register class separately(especially for scalar type,
float type should not be on the same position with int type), so it's not accurate. Specifically,
it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.

So we need classify the register classes in IR level, and importantly these are abstract register classes,
and are not the target register class of backend provided in td file. It's used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.

For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR),
float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled,
and 3 kinds of register class when VSX is NOT enabled.

It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.

Differential revision: https://reviews.llvm.org/D67148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374634 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-12 02:53:04 +00:00
Marcello Maggioni
96335c9f25 [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values.
In GISel we have both G_CONSTANT and G_FCONSTANT, but because
in GISel we don't really have a concept of Float vs Int value
the only difference between the two is where the data originates
from.

What both G_CONSTANT and G_FCONSTANT return is just a bag of bits
with the constant representation in it.

By making getConstantVRegVal() return G_FCONSTANTs bit representation
as well we allow ConstantFold and other things to operate with
G_FCONSTANT.

Adding tests that show ConstantFolding to work on mixed G_CONSTANT
and G_FCONSTANT sources.

Differential Revision: https://reviews.llvm.org/D68739

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374458 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-10 21:46:26 +00:00
David Greene
4f2b50c62d [System Model] [TTI] Move default cache/prefetch implementations
Move the default implementations of cache and prefetch queries to
TargetTransformInfoImplBase and delete them from NoTIIImpl.  This brings these
interfaces in line with how other TTI interfaces work.

Differential Revision: https://reviews.llvm.org/D68804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374446 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-10 20:39:27 +00:00
Guillaume Chatelet
f038ba49f4 [Alignment][NFC] Use llv::Align in GISelKnownBits
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68786

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374369 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-10 15:38:22 +00:00
Oliver Stannard
65b47b2f6b [IfCvt][ARM] Optimise diamond if-conversion for code size
Currently, the heuristics the if-conversion pass uses for diamond if-conversion
are based on execution time, with no consideration for code size. This adds a
new set of heuristics to be used when optimising for code size.

This is mostly target-independent, because the if-conversion pass can
see the code size of the instructions which it is removing. For thumb,
there are a few passes (insertion of IT instructions, selection of
narrow branches, and selection of CBZ instructions) which are run after
if conversion and affect these heuristics, so I've added target hooks to
better predict the code-size effect of a proposed if-conversion.

Differential revision: https://reviews.llvm.org/D67350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374301 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-10 09:58:28 +00:00
Matt Arsenault
64f5ca7e80 GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR
Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374252 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-09 22:44:43 +00:00
Vitaly Buka
9d720c4d8b [System Model] [TTI] Fix virtual destructor warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374221 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-09 20:48:52 +00:00
David Greene
53a00b6727 [System Model] [TTI] Update cache and prefetch TTI interfaces
Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon.  This involved
moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo.

Rework the TTI cache and software prefetching APIs to prepare for the
introduction of a general system model.  Changes include:

- Marking existing interfaces const and/or override as appropriate
- Adding comments
- Adding BasicTTIImpl interfaces that delegate to a subtarget
  implementation
- Moving the default TargetTransformInfoImplBase implementation to a default
  MCSubtarget implementation

Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC
and SystemZ.  AArch64 already has a custom subtarget implementation, so its
custom TTI implementation is migrated to use the new facilities in BasicTTIImpl
to invoke its custom subtarget implementation.  The custom TTI implementations
continue to exist for the other targets with this change.  They are not moved
over to subtarget-based implementations.

The end goal is to have the default subtarget implementation defer to the system
model defined by the target.  With this change, the default MCSubtargetInfo
implementation essentially returns the defaults TargetTransformInfoImplBase used
to return.  Existing users of TTI defaults will hit the defaults now in
MCSubtargetInfo.  Targets that define their own custom TTI implementations won't
use the BasicTTIImpl implementations that route to the subtarget.

Once system models are in place for the targets that use these interfaces, their
custom TTI implementations can be removed.

Differential Revision: https://reviews.llvm.org/D63614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374205 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-09 19:51:48 +00:00
Jinsong Ji
cf65f7210c Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize"
Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"

This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a.
This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04.

The patch is breaking PowerPC internal build, checked with author, reverting
on behalf of him for now due to timezone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374091 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 17:32:56 +00:00
Nikola Prica
68af26d127 [DebugInfo][If-Converter] Update call site info during the optimization
During the If-Converter optimization pay attention when copying or
deleting call instructions in order to keep call site information in
valid state.

Reviewers: aprantl, vsk, efriedma

Reviewed By: vsk, efriedma

Differential Revision: https://reviews.llvm.org/D66955

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374068 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 15:43:12 +00:00
Sebastian Pop
930fdcd02b fix fmls fp16
Tim Northover remarked that the added patterns for fmls fp16
produce wrong code in case the fsub instruction has a
multiplication as its first operand, i.e., all the patterns FMLSv*_OP1:

> define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) {
> ; CHECK-LABEL: test_FMLSv8f16_OP1:
> ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
> entry:
>
>   %mul = fmul fast <8 x half> %c, %b
>   %sub = fsub fast <8 x half> %mul, %a
>   ret <8 x half> %sub
> }
>
> This doesn't look right to me. The exact instruction produced is "fmls
> v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1", but the
> IR is calculating "v2*v1-v0". The equivalent <4 x float> code also
> doesn't emit an fmls.

This patch generates an fmla and negates the value of the operand2 of the fsub.

Inspecting the pattern match, I found that there was another mistake in the
opcode to be selected: matching FMULv4*16 should generate FMLSv4*16
and not FMLSv2*32.

Tested on aarch64-linux with make check-all.

Differential Revision: https://reviews.llvm.org/D67990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374044 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 13:23:57 +00:00
Zi Xuan Wu
ee8d82e802 [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not
estimate different register pressure for different register class separately(especially for scalar type,
float type should not be on the same position with int type), so it's not accurate. Specifically,
it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.

So we need classify the register classes in IR level, and importantly these are abstract register classes,
and are not the target register class of backend provided in td file. It's used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.

For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR),
float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled,
and 3 kinds of register class when VSX is NOT enabled.

It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.

Differential revision: https://reviews.llvm.org/D67148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374017 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 03:28:33 +00:00
Jonas Devlieghere
845ee44381 [AccelTable] Remove stale comment (NFC)
rdar://55857228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373956 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 20:33:20 +00:00
Matt Arsenault
0401164ba7 GlobalISel: Partially implement lower for G_INSERT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373946 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 19:13:27 +00:00
Matt Arsenault
b4bef9ad2f GlobalISel: Add target pre-isel instructions
Allows targets to introduce regbankselectable
pseudo-instructions. Currently the closet feature to this is an
intrinsic. However this requires creating a public intrinsic
declaration. This litters the public intrinsic namespace with
operations we don't necessarily want to expose to IR producers, and
would rather leave as private to the backend.

Use a new instruction bit. A previous attempt tried to keep using enum
value ranges, but it turned into a mess.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373937 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 18:43:29 +00:00
Kevin P. Neal
6c92be1c6c [FPEnv] Add constrained intrinsics for lrint and lround
Earlier in the year intrinsics for lrint, llrint, lround and llround were
added to llvm. The constrained versions are now implemented here.

Reviewed by:	andrew.w.kaylor, craig.topper, cameron.mcinally
Approved by:	craig.topper
Differential Revision:	https://reviews.llvm.org/D64746


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373900 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 13:20:00 +00:00
Simon Pilgrim
2e0c9d36c2 [X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217)
If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element.

This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted.

Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper.

Fixes PR43217

Differential Revision: https://reviews.llvm.org/D68544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373871 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-06 21:11:45 +00:00
Matt Arsenault
97454bf24b GlobalISel: Partially implement lower for G_EXTRACT
Turn into shift and truncate. Doesn't yet handle pointers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373838 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-06 01:37:35 +00:00
Sander de Smalen
3e9f78170e [AArch64] Static (de)allocation of SVE stack objects.
Adds support to AArch64FrameLowering to allocate fixed-stack SVE objects.

The focus of this patch is purely to allow the stack frame to
allocate/deallocate space for scalable SVE objects. More dynamic
allocation (at compile-time, i.e. determining placement of SVE objects
on the stack), or resolving frame-index references that include
scalable-sized offsets, are left for subsequent patches.

SVE objects are allocated in the stack frame as a separate region below
the callee-save area, and above the alignment gap. This is done so that
the SVE objects can be accessed directly from the FP at (runtime)
VL-based offsets to benefit from using the VL-scaled addressing modes.

The layout looks as follows:

     +-------------+
     | stack arg   |   
     +-------------+
     | Callee Saves|
     |   X29, X30  |       (if available)
     |-------------| <- FP (if available)
     |     :       |   
     |  SVE area   |   
     |     :       |   
     +-------------+
     |/////////////| alignment gap.
     |     :       |   
     | Stack objs  |
     |     :       |   
     +-------------+ <- SP after call and frame-setup

SVE and non-SVE stack objects are distinguished using different
StackIDs. The offsets for objects with TargetStackID::SVEVector should be
interpreted as purely scalable offsets within their respective SVE region.

Reviewers: thegameg, rovka, t.p.northover, efriedma, rengolin, greened

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D61437


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373585 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-03 11:33:50 +00:00
Simon Pilgrim
c766f1fc50 [CodeGen] Remove unused MachineMemOperand::print wrappers (PR41772)
As noted on PR41772, the static analyzer reports that the MachineMemOperand::print partial wrappers set a number of args to null pointers that were then dereferenced in the actual implementation.

It turns out that these wrappers are not being used at all (hence why we're not seeing any crashes), so I'd like to propose we just get rid of them.

Differential Revision: https://reviews.llvm.org/D68208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373484 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 16:20:28 +00:00
Hans Wennborg
9a5d29b0bf Reapply r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)"
This was reverted in r373454 due to breaking the expensive-checks bot.
This version addresses that by omitting the addSuccessorWithProb() call
when omitting the range check.

> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373477 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 14:35:06 +00:00
James Molloy
27fd3594bf [ModuloSchedule] Peel out prologs and epilogs, generate actual code
Summary:
This extends the PeelingModuloScheduleExpander to generate prolog and epilog code,
and correctly stitch uses through the prolog, kernel, epilog DAG.

The key concept in this patch is to ensure that all transforms are *local*; only a
function of a block and its immediate predecessor and successor. By defining the problem in this way
we can inductively rewrite the entire DAG using only local knowledge that is easy to
reason about.

For example, we assume that all prologs and epilogs are near-perfect clones of the
steady-state kernel. This means that if a block has an instruction that is predicated out,
we can redirect all users of that instruction to that equivalent instruction in our
immediate predecessor. As all blocks are clones, every instruction must have an equivalent in
every other block.

Similarly we can make the assumption by construction that if a value defined in a block is used
outside that block, the only possible user is its immediate successors. We maintain this
even for values that are used outside the loop by creating a limited form of LCSSA.

This code isn't small, but it isn't complex.

Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet;
I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns
that we don't produce. Those still need a bit more investigation. In the meantime we
(Google) are happy with the code produced by this on our downstream SMS implementation,
and believe it generates correct code.

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373462 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 12:46:44 +00:00
Hans Wennborg
e1e678465b Revert r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)"
This broke http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19967

> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373454 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 12:08:44 +00:00
Hans Wennborg
1256340bcb Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
This is modeled after the same functionality for jump tables, which was
added in r357067.

Differential revision: https://reviews.llvm.org/D68131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373431 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-02 08:32:15 +00:00