Enable "Remove Redundant LEAs" part of the LEA optimization pass for -O2.
This gives 6.4% performance improve on Broadwell on nnet benchmark from Coremark-pro.
There is no significant effect on other benchmarks (Geekbench, Spec2000, Spec2006).
Differential Revision: http://reviews.llvm.org/D19659
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270036 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Implement guard widening in LLVM. Description from GuardWidening.cpp:
The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM
transform it so that it fails more often that it did before the
transform. This optimization is called "widening" and can be used hoist
and common runtime checks in situations like these:
```
%cmp0 = 7 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
%cmp1 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
...
```
to
```
%cmp0 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
...
```
If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back
to a generic implementation of the same function, which will have the
correct semantics from that point onward. It is always _legal_ to
deoptimize (so replacing `%cmp0` with false is "correct"), though it may
not always be profitable to do so.
NB! This pass is a work in progress. It hasn't been tuned to be
"production ready" yet. It is known to have quadriatic running time and
will not scale to large numbers of guards
Reviewers: reames, atrick, bogner, apilipenko, nlewycky
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20143
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269997 91177308-0d34-0410-b5e6-96231b3b80d8
Having an enum member named Default is quite confusing: Is it distinct
from the others?
This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269988 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, we didn't add their and their operands cost, which could've
resulted in unrolling loops for no actual benefit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269985 91177308-0d34-0410-b5e6-96231b3b80d8
When looking for an available spill slot, the register scavenger would stop
after finding the first one with no register assigned to it. That slot may
have size and alignment that do not meet the requirements of the register
that is to be spilled. Instead, find an available slot that is the closest
in size and alignment to one that is needed to spill a register from RC.
Differential Revision: http://reviews.llvm.org/D20295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269969 91177308-0d34-0410-b5e6-96231b3b80d8
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269949 91177308-0d34-0410-b5e6-96231b3b80d8
Don't expand divisions by constants if it would require multiple instructions.
The current assumption is that engines will perform the desired optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269930 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The ordering of registers in BinaryRRF instructions are wrong, and
affects the copysign instruction (CPSDR). This results in the wrong
magnitude and sign being set.
Author: zhanjunl
Reviewers: kbarton, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20308
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269922 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction execution
and enter an implementation-dependent optimized state until occurrence of a
class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper, RKSimon
Subscribers: RKSimon, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19795
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269911 91177308-0d34-0410-b5e6-96231b3b80d8
MC only needs to know if the output is PIC or not. It never has to
decide about creating GOTs and PLTs for example. The only thing that
MC itself uses this information for is expanding "macros" in sparc and
mips. The rest I am pretty sure could be moved to CodeGen.
This is a cleanup and isolates the code from future changes to
Reloc::Model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269909 91177308-0d34-0410-b5e6-96231b3b80d8
Restrict the creation of compact branches so that they meet the ISA encoding
requirements. Notably do not permit $zero to be used as a operand for compact
branches and ensure that some other branches fulfil the requirement that
rs != rt.
Fixup cases where $rs > $rt for bnec and beqc.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D20284
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269893 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds support for software floating point operations for Sparc targets.
This is the first in a set of patches to enable software floating point on Sparc. The next patch will enable the option to be used with Clang.
Differential Revision: http://reviews.llvm.org/D19265
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269892 91177308-0d34-0410-b5e6-96231b3b80d8
Coverage mapping data is organized in a sequence of blocks, each of which is expected
to be aligned by 8 bytes. This feature is used when reading those blocks, see
VersionedCovMapFuncRecordReader::readFunctionRecords(). If a misaligned covearge
mapping data has more than one block, it causes llvm-cov to fail.
Differential Revision: http://reviews.llvm.org/D20285
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269887 91177308-0d34-0410-b5e6-96231b3b80d8
Very few things in MC itself use the option. Most of the code that that
uses it could be move to CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269871 91177308-0d34-0410-b5e6-96231b3b80d8
structs"
This reverts commits r269845, r269846, and r269850 as they
introduce a crash in obj2yaml when trying to do a roundtrip.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269865 91177308-0d34-0410-b5e6-96231b3b80d8
I don't yet fully understand the meaning of these data strcutures,
but at least it seems that their sizes and types are correct.
With this change, we can read publics streams till end.
Differential Revision: http://reviews.llvm.org/D20343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269861 91177308-0d34-0410-b5e6-96231b3b80d8
We currently don't represent get_local and set_local explicitly; they
are just implied by virtual register use and def. This avoids a lot of
clutter, but it does complicate stackifying: get_locals read their
operands at their position in the stack evaluation order, rather than
at their parent instruction. This patch adds code to walk the stack to
determine the precise ordering, when needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269854 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves the expansion of WIN_ALLOCA pseudo-instructions
into a separate pass that walks the CFG and lowers the instructions
based on a conservative estimate of the offset between the stack
pointer and the lowest accessed stack address.
The goal is to reduce binary size and run-time costs by removing
calls to _chkstk. While it doesn't fix all the code quality problems
with inalloca calls, it's an incremental improvement for PR27076.
Differential Revision: http://reviews.llvm.org/D20263
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269828 91177308-0d34-0410-b5e6-96231b3b80d8