1130 Commits

Author SHA1 Message Date
Craig Topper
b6d6904481 [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275045 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 05:36:48 +00:00
Craig Topper
c09e328b81 [X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are marked for CanFoldAsLoad.
I don't really know how to test this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 05:36:41 +00:00
Duncan P. N. Exon Smith
4383a516d1 CodeGen: Use MachineInstr& in LiveVariables API, NFC
Change all the methods in LiveVariables that expect non-null
MachineInstr* to take MachineInstr& and update the call sites.  This
clarifies the API, and designs away a class of iterator to pointer
implicit conversions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274319 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-01 01:51:32 +00:00
Duncan P. N. Exon Smith
567409db69 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274189 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 00:01:54 +00:00
Rafael Espindola
99b487713f Drop support for creating $stubs.
They are created by ld64 since OS X 10.5.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274130 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-29 14:59:50 +00:00
Dehao Chen
527613bc4e Relax the clearance calculating for breaking partial register dependency.
Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency.

Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274068 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 21:19:34 +00:00
Rafael Espindola
aac123825e Convert a few more comparisons to isPositionIndependent(). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273945 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 21:33:08 +00:00
Igor Breger
a8482b2070 [AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering.
Differential Revision: http://reviews.llvm.org/D20897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273138 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 07:05:43 +00:00
Benjamin Kramer
13c42d2b20 Run clang-tidy's performance-unnecessary-copy-initialization over LLVM.
No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272516 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 17:30:47 +00:00
Benjamin Kramer
af18e017d2 Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272512 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 15:39:02 +00:00
Craig Topper
1b683873d6 [X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272249 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-09 07:06:38 +00:00
Rafael Espindola
6a70b9b746 Simplify handling of hidden stub.
Since r207518 they are printed exactly like non-hidden stubs on x86 and
since r207517 on ARM.

This means we can use a single set for all stubs in those platforms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269776 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-17 16:01:32 +00:00
David L Kreitzer
b672687bed Fix for PR27750. Correctly handle the case where the fallthrough block and
target block are the same in getFallThroughMBB.

Differential Revision: http://reviews.llvm.org/D20288


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269760 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-17 12:47:46 +00:00
Quentin Colombet
5e0f621a7b [X86] Properly check that EAX is dead when copying EFLAGS.
This fixes a bug introduced in r267623, where we got smarter and avoided to save
EAX before using it. However, we failed to check if any of the subregister of
EAX were alive and thus, missed cases where we have to save EAX before using it.

The problem may happen on every X86/i386/... platform.

This fixes llvm.org/PR27624


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 20:49:46 +00:00
Jonas Paulsson
32db7c31b2 [foldMemoryOperand()] Pass LiveIntervals to enable liveness check.
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.

In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.

Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269026 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 08:09:37 +00:00
Craig Topper
9abf0e829e [X86][AVX512] Strengthen the assertions from r269001. We need VLX to use the 128/256-bit move opcodes for extended registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269019 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 05:28:04 +00:00
Quentin Colombet
c2cb1b7b04 [X86][AVX512] Use the proper load/store for AVX512 registers.
When loading or storing AVX512 registers we were not using the AVX512
variant of the load and store for VR128 and VR256 like registers.
Thus, we ended up with the wrong encoding and actually were dropping the
high bits of the instruction. The result was that we load or store the
wrong register. The effect is visible only when we emit the object file
directly and disassemble it. Then, the output of the disassembler does
not match the assembly input.

This is related to llvm.org/PR27481.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269001 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 01:09:14 +00:00
Craig Topper
dba67a4fdb [AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268884 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-08 21:33:53 +00:00
Matthias Braun
02073cb41c livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 00:24:32 +00:00
Matthias Braun
b4756d6b2d LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion,
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268336 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 00:08:46 +00:00
David L Kreitzer
9135c2ad84 Enable the X86 call frame optimization for the 64-bit targets that allow it.
Fixes PR27241.

Differential Revision: http://reviews.llvm.org/D19688


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268227 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-02 13:45:25 +00:00
Igor Breger
dcb96be9b4 Change AVX512 braodcastsd/ss patterns interaction with spilling . New implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
Differential Revision: http://reviews.llvm.org/D19579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268190 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-01 08:40:00 +00:00
Craig Topper
9f60ac413e [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268164 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-30 17:59:49 +00:00
Craig Topper
dc03033554 [X86] Remove unused operand from a function and all its callers. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267854 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 05:58:46 +00:00
Quentin Colombet
316e8d944d [X86] Teach the expansion of copy instructions how to do proper liveness.
When the simple analysis provided by MachineBasicBlock::computeRegisterLiveness
fails, fall back on the LivePhysReg utility.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267623 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 23:14:32 +00:00
Andrew Kaylor
a5c0577492 Optimization bisect support in X86-specific passes
Differential Revision: http://reviews.llvm.org/D19439



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 21:44:24 +00:00
Mehdi Amini
f6071e14c5 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266595 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-18 09:17:29 +00:00
Aaron Ballman
59a288b05f Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264929 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-30 21:30:00 +00:00
Hans Wennborg
6a62eecdb6 X86: Use push-pop for materializing 8-bit immediates for minsize (take 2)
This is the same as r255936, with added logic for avoiding clobbering of the
red zone (PR26023).

Differential Revision: http://reviews.llvm.org/D18246

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264375 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 01:10:56 +00:00
Simon Pilgrim
1ae39fa1a5 [X86][XOP] Fixed instruction postfixes to more closely match operands
Suggested by Sanjay in D18189 as the multiple folding options in XOP instructions can be tricky

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264305 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 16:31:30 +00:00
Cong Hou
58aed69858 Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.
Currently, AnalyzeBranch() fails non-equality comparison between floating points
on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this
function can modify the branch by reversing the conditional jump and removing
unconditional jump if there is a proper fall-through. However, in the case of
non-equality comparison between floating points, this can turn the branch
"unanalyzable". Consider the following case:

jne.BB1
jp.BB1
jmp.BB2
.BB1:
...
.BB2:
...

AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be
removed:

jne.BB1
jnp.BB2
.BB1:
...
.BB2:
...

However, AnalyzeBranch() cannot analyze this branch anymore as there are two
conditional jumps with different targets. This may disable some optimizations
like block-placement: in this case the fall-through behavior is enforced even if
the fall-through block is very cold, which is suboptimal.

Actually this optimization is also done in block-placement pass, which means we
can remove this optimization from AnalyzeBranch(). However, currently
X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined
negation conditions for them.

In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP
and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them.
Here only the second conditional jump is reversed. This is valid as we only need
them to do this "unconditional jump removal" optimization.


Differential Revision: http://reviews.llvm.org/D11393




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264199 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-23 21:45:37 +00:00
Chad Rosier
cd3a68c781 [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.
http://reviews.llvm.org/D17967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263021 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 16:00:35 +00:00
Craig Topper
a28c17a212 [X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262458 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 04:42:31 +00:00
Duncan P. N. Exon Smith
5b9b80ea30 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261605 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-23 02:46:52 +00:00
Ahmed Bougacha
9867695c88 [X86] Remove the now-unused X86ISD::PSIGN. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261025 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-16 22:14:12 +00:00
Igor Breger
0dd3e9d55e AVX512: Change store size of kmask. Store size of v8i1, v4i1 , v2i1 and i1 are changed to 16 bits.
If KMOVB not supported (require AVX512DQ) only KMOVW can be used so store size should be 2 bytes.

Differential Revision: http://reviews.llvm.org/D17138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260878 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 08:25:28 +00:00
Simon Pilgrim
8464bb8c41 [X86][SSE1] Add MOVLHPS/MOVHLPS lowering and memory folding support
As discussed on PR26491, this patch adds support for lowering v4f32 shuffles to the MOVLHPS/MOVHLPS instructions. It also adds support for memory folding with their MOVLPS/MOVHPS load equivalents.

This first patch only really helps SSE1 targets as SSE2+ targets will widen the shuffle mask and use v2f64 equivalents (although they still combine to MOVLHPS/MOVHLPS for v2f64 splats). This will have to be addressed in a future patch, most likely when we add support for binary target shuffle combines.

Differential Revision: http://reviews.llvm.org/D16956

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260168 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-08 23:03:46 +00:00
Sanjoy Das
d1f5b8553b [X86] Fix a bug in getMemOpBaseRegImmOfs
Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of
`MemOp` is a frame index memory operand.  The fix is to have
`getMemOpBaseRegImmOfs` bail out in such cases.  We can possibly be more
clever here, if needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259456 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 02:32:43 +00:00
Benjamin Kramer
25ed974814 Revert "Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed."
and "Add a missing test case for r258847."

This reverts commit r258847, r258848. Causes miscompilations and backend
errors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258927 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 12:44:12 +00:00
Cong Hou
c207a75af0 Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.
Currently, AnalyzeBranch() fails non-equality comparison between floating points
on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this
function can modify the branch by reversing the conditional jump and removing
unconditional jump if there is a proper fall-through. However, in the case of
non-equality comparison between floating points, this can turn the branch
"unanalyzable". Consider the following case:

jne.BB1
jp.BB1
jmp.BB2
.BB1:
...
.BB2:
...

AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be
removed:

jne.BB1
jnp.BB2
.BB1:
...
.BB2:
...

However, AnalyzeBranch() cannot analyze this branch anymore as there are two
conditional jumps with different targets. This may disable some optimizations
like block-placement: in this case the fall-through behavior is enforced even if
the fall-through block is very cold, which is suboptimal.

Actually this optimization is also done in block-placement pass, which means we
can remove this optimization from AnalyzeBranch(). However, currently
X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined
negation conditions for them.

In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP
and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them.
Here only the second conditional jump is reversed. This is valid as we only need
them to do this "unconditional jump removal" optimization.


Differential Revision: http://reviews.llvm.org/D11393




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258847 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 20:08:01 +00:00
Simon Pilgrim
841d92c472 [X86][AVX] Add commutation support for VPERM2X128 instructions
Its main use is to allow memory folding of the 1st operand

Differential Revision: http://reviews.llvm.org/D16521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258726 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-25 21:51:34 +00:00
Craig Topper
59b312cfd5 [X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. It was only needed for rematerialization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 07:44:14 +00:00
David Majnemer
7f4aa70ca5 Revert "[X86] Use push-pop for materializing small constants under 'minsize'"
The red zone consists of 128 bytes beyond the stack pointer so that the
allocation of objects in leaf functions doesn't require decrementing
rsp.  In r255656, we introduced an optimization that would cheaply
materialize certain constants via push/pop.  Push decrements the stack
pointer and stores it's result at what is now the top of the stack.
However, this means that using push/pop would encroach on the red zone.
PR26023 gives an example where this corrupts an object in the red zone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 02:32:06 +00:00
Matthias Braun
96b46d19c0 MachineInstrBundle: Fix reversed isSuperRegisterEq() call
Unfortunately this fix had the effect of exposing the
-verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for
which I disabled it for now.
Two testcases also have additional pushq/popq where the corrected code
cannot prove that %rax is dead any longer. Looking at the examples, this
could potentially be fixed by improving computeRegisterLiveness() to check
the live-in lists of the successors blocks when reaching the end of a
block.

This fixes http://llvm.org/PR25951.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256799 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 00:45:35 +00:00
David Majnemer
f124911f7a [X86] Make hasFP constant time
We need a frame pointer if there is a push/pop sequence after the
prologue in order to unwind the stack.  Scanning the instructions to
figure out if this happened made hasFP not constant-time which is a
violation of expectations.  Let's compute this up-front and reuse that
computation when we need it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256730 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 04:49:41 +00:00
Sanjay Patel
3bb88046da use range-based for-loops; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256573 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-29 19:14:23 +00:00
Sanjay Patel
4c32013a9a tidy up; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 18:18:22 +00:00
David Majnemer
1342075082 [X86, Win64] Use a frame pointer if pushf is emitted
A frame pointer must be used if stack pointer is modified after the
prologue.  LLVM will emit pushf/popf if we need to save/restore the
FLAGS register, requiring us to have a frame pointer for the function.

There is a small twist: this sequence might exist in user code via
inline-assembly.  For now, conservatively assume that such functions
require a frame pointer.  For real world justification, please see
clang's implementation of __readeflags.

This fixes PR25945.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256456 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 06:07:26 +00:00
Craig Topper
5f471068e4 [X86] Replace MVT::SimpleValueType in the AsmParser library and getX86SubSuperRegister with just an unsigned representing size.
This a is step towards fixing a layering violation so the X86 AsmParser won't depending on CodeGen types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256425 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-25 22:09:45 +00:00
Elena Demikhovsky
52ebd43338 AVX-512: Kreg set 0/1 optimization
The patterns that set a mask register to 0/1
KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn
are replaced with
KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn - AVX-512 targets optimization.

KNL does not recognize dependency-breaking idioms for mask registers,
so kxnor %k1, %k1, %k2 has a RAW dependence on %k1.
Using %k0 as the undef input register is a performance heuristic based
on the assumption that %k0 is used less frequently than the other mask
registers, since it is not usable as a write mask.

Differential Revision: http://reviews.llvm.org/D15739



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256365 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 08:12:22 +00:00