Summary:
powerpc64 big-endian is not supported, but I believe that most logic can
be shared, except for xray_powerpc64.cc.
Also add a function InvalidateInstructionCache to xray_util.h, which is
copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest,
and I don't know how.
Reviewers: dberris, echristo, iteratee, kbarton, hfinkel
Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D29742
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294781 91177308-0d34-0410-b5e6-96231b3b80d8
Generally, the ISEL is expanded into if-then-else sequence, in some
cases (like when the destination register is the same with the true
or false value register), it may just be expanded into just the if
or else sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292154 91177308-0d34-0410-b5e6-96231b3b80d8
Generally, the ISEL is expanded into if-then-else sequence, in some
cases (like when the destination register is the same with the true
or false value register), it may just be expanded into just the if
or else sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292128 91177308-0d34-0410-b5e6-96231b3b80d8
This patch corresponds to review:
The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then choses the correct opcode based on the
register that was allocated for the operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283212 91177308-0d34-0410-b5e6-96231b3b80d8
This patch corresponds to review:
https://reviews.llvm.org/D23155
This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:
Int to Fp conversions of 1 or 2-byte values loaded from memory
Building vectors of 1 or 2-byte integers with values loaded from memory
Storing individual 1 or 2-byte elements from integer vectors
This patch implements all of those uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283190 91177308-0d34-0410-b5e6-96231b3b80d8
This patch corresponds to review:
https://reviews.llvm.org/D19825
The new lxvx/stxvx instructions do not require the swaps to line the elements
up correctly. In order to select them over the lxvd2x/lxvw4x instructions which
require swaps, the patterns for the old instruction have a predicate that
ensures they won't be selected on Power9 and newer CPUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282143 91177308-0d34-0410-b5e6-96231b3b80d8
The main change is to return the code size from
InsertBranch/RemoveBranch.
Patch mostly by Tim Northover
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281505 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr* in the PowerPC backend, mainly by preferring MachineInstr&
over MachineInstr* when a pointer isn't nullable and using range-based
for loops.
There was one piece of questionable code in PPCInstrInfo::AnalyzeBranch,
where a condition checked a pointer converted from an iterator for
nullptr. Since this case is impossible (moreover, the code above
guarantees that the iterator is valid), I removed the check when I
changed the pointer to a reference.
Despite that case, there should be no functionality change here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276864 91177308-0d34-0410-b5e6-96231b3b80d8
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274189 91177308-0d34-0410-b5e6-96231b3b80d8
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272512 91177308-0d34-0410-b5e6-96231b3b80d8
Fix PR27943 "Bad machine code: Using an undefined physical register".
SUBFC8 implicitly defines the CR0 register, but this was omitted in
the instruction definition.
Patch by Jameson Nash <jameson@juliacomputing.com>
Reviewers: hfinkel
Differential Revision: http://reviews.llvm.org/D20802
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271425 91177308-0d34-0410-b5e6-96231b3b80d8
Add Word rotates to the list of instructions that are zero extending.
This allows them to be used in dot form to compare with zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264183 91177308-0d34-0410-b5e6-96231b3b80d8
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest). All of these
functions require non-null parameters already, so references are more
clear. As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261605 91177308-0d34-0410-b5e6-96231b3b80d8
Some compilers don't do exhaustive switch checking. For those compilers,
add an initialization to prevent un-initialized variable warnings from
firing. For compilers with exhaustive switch checking, we still get a
guarantee that the switch is exhaustive, and hence the initializations
are redundant, and a non-functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257923 91177308-0d34-0410-b5e6-96231b3b80d8
Only non-weighted predicates were handled in PPCInstrInfo::insertSelect. Handle
the weighted predicates as well.
This latent bug was triggered by r255398, because it added use of the
branch-weighted predicates.
While here, switch over an enum instead of an int to get the compiler to enforce
totality in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257518 91177308-0d34-0410-b5e6-96231b3b80d8
Also, remove an enum hack where enum values were used as indexes into an array.
We may want to make this a real class to allow pattern-based queries/customization (D13417).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252196 91177308-0d34-0410-b5e6-96231b3b80d8
To commute a trivial rlwimi instructions (meaning one with a full mask and zero
shift), we'd need to ability to form an all-zero mask (instead of an all-one
mask) using rlwimi. We can't represent this, however, and we'll miscompile code
if we try.
The code quality problem that this highlights (that SDAG simplification can
lead to us generating an ISD::OR node with a constant zero LHS) will be fixed
as a follow-up.
Fixes PR24719.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246937 91177308-0d34-0410-b5e6-96231b3b80d8
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246376 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for MIR serialization of PowerPC-specific operand target flags
(based on the generic infrastructure added in r244185 and r245383).
I won't even pretend that this is good test coverage, but this includes the
regression test associated with r246372. Adding an MIR test for that fix is far
superior to adding an IR-level test because particular instruction-scheduling
decisions are necessary in order to expose the bug, and using an MIR test we
can start the pipeline post-scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246373 91177308-0d34-0410-b5e6-96231b3b80d8
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.
This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.
This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.
This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.
Reviewers: Akira Hatanaka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244693 91177308-0d34-0410-b5e6-96231b3b80d8
This is a direct port of the code from the X86 backend (r239486/r240361), which
uses the MachineCombiner to reassociate (floating-point) adds/muls to increase
ILP, to the PowerPC backend. The rationale is the same.
There is a lot of copy-and-paste here between the X86 code and the PowerPC
code, and we should extract at least some of this into CodeGen somewhere.
However, I don't want to do that until this code is enhanced to handle FMAs as
well. After that, we'll be in a better position to extract the common parts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242279 91177308-0d34-0410-b5e6-96231b3b80d8
PowerPC uses itineraries to describe processor pipelines (and dispatch-group
restrictions for P7/P8 cores). Unfortunately, the target-independent
implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that
looks for the largest cycle count in the pipeline for any given instruction.
This, however, yields the wrong answer for the PPC itineraries, because we
don't encode the full pipeline. Because the functional units are fully
pipelined, we only model the initial stages (there are no relevant hazards in
the later stages to model), and so the technique employed by getStageLatency
does not really work. Instead, we should take the maximum output operand
latency, and that's what PPCInstrInfo::getInstrLatency now does.
This caused some test-case churn, including two unfortunate side effects.
First, the new arrangement of copies we get from function parameters now
sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the
test cases), and we have one significant test-suite regression:
SingleSource/Benchmarks/BenchmarkGame/spectral-norm
56.4185% +/- 18.9398%
In this benchmark we have a loop with a vectorized FP divide, and it with the
new scheduling both divides end up in the same dispatch group (which in this
case seems to cause a problem, although why is not exactly clear). The grouping
structure is hard to predict from the bottom of the loop, and there may not be
much we can do to fix this.
Very few other test-suite performance effects were really significant, but
almost all weakly favor this change. However, in light of the issues
highlighted above, I've left the old behavior available via a
command-line flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242188 91177308-0d34-0410-b5e6-96231b3b80d8
This patch corresponds to review:
http://reviews.llvm.org/D9440
It adds a new register class to the PPC back end to contain single precision
values in VSX registers. Additionally, it adds scalar loads and stores for
VSX registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236755 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], but currently only the
'PowerPC HTM Low Level Built-in Function' are implemented.
The HTM instructions follows the RC ones and the transaction initiation result
is set on RC0 (with exception of tcheck). Currently approach is to create a
register copy from CR0 to GPR and comapring. Although this is suboptimal, since
the branch could be taken directly by comparing the CR0 value, it generates code
correctly on both test and branch and just return value. A possible future
optimization could be elimitate the MFCR instruction to branch directly.
The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.
This is send along a clang patch to enabled the builtins and option switch.
[1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html
Phabricator Review: http://reviews.llvm.org/D8247
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233204 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the QPX vector instruction set, which is used by the
enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bytes
wide, holding 4 double-precision floating-point values. Boolean values, modeled
here as <4 x i1> are actually also represented as floating-point values
(essentially { -1, 1 } for { false, true }). QPX shares many features with
Altivec and VSX, but is distinct from both of them. One major difference is
that, instead of adding completely-separate vector registers, QPX vector
registers are extensions of the scalar floating-point registers (lane 0 is the
corresponding scalar floating-point value). The operations supported on QPX
vectors mirrors that supported on the scalar floating-point values (with some
additional ones for permutations and logical/comparison operations).
I've been maintaining this support out-of-tree, as part of the bgclang project,
for several years. This is not the entire bgclang patch set, but is most of the
subset that can be cleanly integrated into LLVM proper at this time. Adding
this to the LLVM backend is part of my efforts to rebase bgclang to the current
LLVM trunk, but is independently useful (especially for codes that use LLVM as
a JIT in library form).
The assembler/disassembler test coverage is complete. The CodeGen test coverage
is not, but I've included some tests, and more will be added as follow-up work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230413 91177308-0d34-0410-b5e6-96231b3b80d8