15393 Commits

Author SHA1 Message Date
Derek Schuff
d9b4137f9f [WebAssembly] Support combining GEP and FrameIndex offsets in memory operand offset field
Previously we only supported putting the FI into memory operand offset
fields if there was nothing there already. Now combine them.

Differential Revision: http://reviews.llvm.org/D15941

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257084 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 18:55:52 +00:00
Dan Gohman
181f7cc0f3 [WebAssembly] Use the default private label prefixes.
The MC assembler doesn't like using the empty string as a private label
prefix because then it treats all labels as private. This commit reverts
back to the default prefix, which is .L, which is common in ELF targets
and consistent with the LLVM name mangler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 18:49:53 +00:00
Nicolai Haehnle
702b589510 AMDGPU/SI: Fold operands with sub-registers
Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.

Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.

Some tests are updated, note that the fsub.ll test explicitly checks that
the move is elided.

With the IR generated by current Mesa, the changes are obviously relatively
minor:

7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)

Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)

Reviewers: tstellarAMD, arsenm, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15875

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 17:10:29 +00:00
Nicolai Haehnle
64f913f14f AMDGPU/SI: xnack_mask is always reserved on VI
Summary:
Somehow, I first interpreted the docs as saying space for xnack_mask is only
reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and
went back to actually test what is happening, and it turns out that xnack_mask
is always reserved at least on Tonga and Carrizo, in the sense that flat_scr
is always fixed below the SGPRs that are used to implement xnack_mask, whether
or not they are actually used.

I confirmed this by writing a shader using inline assembly to tease out the
aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where
we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so
xnack_mask is s[76:77] and vcc is s[78:79]).

This patch changes both the calculation of the total number of SGPRs and the
various register reservations to account for this.

It ought to be possible to use the gap left by xnack_mask when the feature
isn't used, but this patch doesn't try to do that. (Note that the same applies
to vcc.)

Note that previously, even before my earlier change in r256794, the SGPRs that
alias to xnack_mask could end up being used as well when flat_scr was unused
and the total number of SGPRs happened to fall on the right alignment
(e.g. highest regular SGPR being used s29 and VCC used would lead to number
of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there
were some conflict due to such aliasing, we should have noticed that already.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15898

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257073 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 17:10:20 +00:00
Michael Zuckerman
496a771bba [avx512] Fix test avx512bw-intrinsics.ll
Change the CHECK lablel into AVX512BW 
And fix declare lable of llvm.x86.avx512.mask.psrav32_hi 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257071 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 16:25:42 +00:00
Michael Zuckerman
6c7a788883 [AVX512] add PSLLW and PSLLV Intrinsic
Differential Revision: http://reviews.llvm.org/D15889


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257070 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 16:02:51 +00:00
Nico Weber
0a765136e6 Revert r257055, it caused PR26064.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257066 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 15:01:46 +00:00
Michael Zuckerman
83fc76e8eb [AVX512] add PSRAV Intrinsic
Differential Revision: http://reviews.llvm.org/D15856


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257063 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 14:42:20 +00:00
Michael Zuckerman
699e85dc45 [AVX512] add PSHUFHW and PSHUFLW Intrinsic
Differential Revision: http://reviews.llvm.org/D15925


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257056 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 12:35:43 +00:00
Simon Pilgrim
9233e73bf3 [X86][AVX] Match broadcast loads through a bitcast
AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through bitcasts to check for a load node to allow broadcasts to occur.

Follow up to D15310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257055 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 11:34:27 +00:00
Simon Pilgrim
ce13714bfc [X86][SSE} Add INSERTPS as a target shuffle
Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257046 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 10:24:19 +00:00
Michael Zuckerman
00e4aed86a [AVX512] add PSHUFD Intrinsic
Differential Revision: http://reviews.llvm.org/D15934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 09:24:12 +00:00
Tim Northover
928410cd12 ARM: support TLS accesses on Darwin platforms
Darwin TLS accesses most closely resemble ELF's general-dynamic situation,
since they have to be able to handle all possible situations. The descriptors
and so on are obviously slightly different though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257039 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 09:03:03 +00:00
NAKAMURA Takumi
cb57061984 llvm/test/CodeGen/X86/statepoint-vector.ll REQUIRES asserts due to a debug option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257031 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 05:40:37 +00:00
Philip Reames
5736485a06 One more attempt at stablizing a test on all platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257026 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 04:20:52 +00:00
Philip Reames
f6cfcfb3c8 [Statepoints] Add test cases around vectors and stablize test
Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them.  Add a couple of tests which show this working.

Also, add a triple to the same test file to hopefully fix a failing bot.

It turns out we do han



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257025 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 04:15:31 +00:00
Haicheng Wu
e6f663968c [AArch64 MachineCombine] Enhance/Add support for general reassociation to reduce the critical path
Allow fadd/fmul to be reassociated in aarch64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257024 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 04:01:02 +00:00
Philip Reames
82a04b17ae [Statepoints] Initial support for relocating vectors of pointers
Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong.

The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate.

Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases. 

Differential Revision: http://reviews.llvm.org/D15632



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 03:32:11 +00:00
Dan Gohman
3d5f22734f [WebAssembly] Add -m:e to the target triple.
This enables ELF-style name mangling, which primarily means using ".L" for
private symbols.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257020 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 03:19:23 +00:00
Quentin Colombet
55c7a22c04 [ShrinkWrapping] Give up on irreducible CFGs.
We need to know whether or not a given basic block is in a loop for the analysis
to be correct.
Loop information may be incomplete on irreducible CFGs, therefore we may
generate incorrect code if we use it in those situations.

This fixes PR25988.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257012 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 01:23:49 +00:00
Nicolai Haehnle
bd0b681bbd AMDGPU/SI: Fix crash when inline assembly is used in a graphics shader
Summary:
This is admittedly something that you could only run into by manually
playing around with shader assembly because the SITypeWriter pass is
skipped for compute.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256980 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 22:01:04 +00:00
Quentin Colombet
4377f132ae [X86] Correctly model TLS calls w.r.t. frame requirements.
TLS calls need the stack frame to be properly set up and this
implies that such calls need ADJUSTSTACK_xxx markers.

Fixes PR25820.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256959 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 19:09:26 +00:00
Michael Kuperstein
f8ab3ef1eb [ShrinkWrap] Fix FindIDom to only have one kind of failure.
FindIDom() can fail in two different ways - it can either return nullptr or the
block itself, depending on the circumstances. Some users of FindIDom() check
one error condition, while others check the other.

Change it to always return nullptr on failure.
This fixes PR26004.

Differential Revision: http://reviews.llvm.org/D15847



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256955 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 18:40:11 +00:00
Dan Gohman
c7e3f5ac69 [WebAssembly] Don't use range-based loop for a list that's being modified
The first instruction in a block is what the rend() iterator points to, so
if it moves, we need to re-evaluate rend() so that we continue to iterate
through the rest of the instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256953 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 18:29:35 +00:00
Geoff Berry
823b3b2fdb ScheduleDAGInstrs: Bug fix for missed memory dependency.
Summary:
In buildSchedGraph(), when adding memory dependencies for loads, move
the call to adjustChainDeps() after the call to
addChainDependency(AliasChain) to handle the case where
addChainDependency(AliasChain) ends up not adding a dependency and
instead putting the SU on the RejectMemNodes list.  The call to
adjustChainDeps() must be done after the call to addChainDependency() in
order to process the SU added to the RejectMemNodes list to create
memory dependencies for it.

Reviewers: hfinkel, atrick, jonpa, resistor

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256950 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 18:14:26 +00:00
Dan Gohman
0e8649c604 [WebAssembly] Add -asm-verbose=false to llc tests.
In general, disabling comments in the output reduces the chances of a
CHECK line accidentally matching a comment instead of its intended text.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256946 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 16:45:05 +00:00
Artyom Skrobov
e4ee51a005 PR25754: avoid generating UDIVREM8_ZEXT_HREG nodes with i64 result
Reviewers: spatel, srking

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15331

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256924 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 09:41:10 +00:00
Simon Pilgrim
2d3ec5706a [X86][SSE] There is no zmm addsubpd/addsubps instruction.
Replace the assert in combineShuffleToAddSub with an early out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256922 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 09:08:49 +00:00
Dan Gohman
0fcad92ee3 [SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets.
In an inbounds getelementptr, when an index produces a constant non-negative
offset to add to the base, the add can be assumed to not have unsigned overflow.

This relies on the assumption that addresses can't occupy more than half the
address space, which isn't possible in C because it wouldn't be possible to
represent the difference between the start of the object and one-past-the-end
in a ptrdiff_t.

Setting the NoUnsignedWrap flag is theoretically useful in general, and is
specifically useful to the WebAssembly backend, since it permits stronger
constant offset folding.

Differential Revision: http://reviews.llvm.org/D15544


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256890 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 00:43:06 +00:00
Nicolai Haehnle
cb2ad1e249 AMDGPU/SI: Do not move scratch resource register on Tonga & Iceland
Due to the SGPR init bug, every program claims to use the same number
of SGPRs anyway, so there's no point in trying to shift those registers
down from their initial spot of reservation.

Add a test that uses VGPR spilling and blocks most SGPRs from being used for
the scratch resource register. Previously, this would run into an assertion.

Differential Revision: http://reviews.llvm.org/D15724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256870 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 20:42:49 +00:00
Michael Zuckerman
024ff64164 [AVX512] add PSLLD and PSLLQ Intrinsic
Differential Revision: http://reviews.llvm.org/D15885


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256840 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 15:17:39 +00:00
MinSeong Kim
eaca36fc81 [AArch64] Add support for Samsung Exynos-M1
Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A).

Differential Revision: http://reviews.llvm.org/D15663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256828 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 12:51:59 +00:00
Tom Stellard
b5659367ca AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA
Summary: This fixes a regression caused by r256282.

Reviewers: arsenm, cfang

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15736

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256810 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 03:40:16 +00:00
David Majnemer
7f4aa70ca5 Revert "[X86] Use push-pop for materializing small constants under 'minsize'"
The red zone consists of 128 bytes beyond the stack pointer so that the
allocation of objects in leaf functions doesn't require decrementing
rsp.  In r255656, we introduced an optimization that would cheaply
materialize certain constants via push/pop.  Push decrements the stack
pointer and stores it's result at what is now the top of the stack.
However, this means that using push/pop would encroach on the red zone.
PR26023 gives an example where this corrupts an object in the red zone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 02:32:06 +00:00
Matthias Braun
8eb12ac26e X86: Add a testcase for PR25951
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256801 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 00:48:16 +00:00
Matthias Braun
96b46d19c0 MachineInstrBundle: Fix reversed isSuperRegisterEq() call
Unfortunately this fix had the effect of exposing the
-verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for
which I disabled it for now.
Two testcases also have additional pushq/popq where the corrected code
cannot prove that %rax is dead any longer. Looking at the examples, this
could potentially be fixed by improving computeRegisterLiveness() to check
the live-in lists of the successors blocks when reaching the end of a
block.

This fixes http://llvm.org/PR25951.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256799 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 00:45:35 +00:00
Nicolai Haehnle
fac1bfe37d AMDGPU: add +xnack feature
Summary:
Enabling this feature will account for the two SGPRs used by the hardware
to store the XNACK_MASK physically.

The hardware only requires this reservation when the XNACK feature is
explicitly enabled. At some point, HSA will probably want to do that, but
it does increase SGPR register pressure, so leave it disabled by default
for now (but do add a small test).

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256794 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 23:35:53 +00:00
Simon Pilgrim
472704020c [X86][SSE] Ensure BLENDPD/BLENDPS/PBLEND inputs are both of the correct input type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 21:41:11 +00:00
Geoff Berry
a1c12525dc [AArch64] Optimize some simple TBZ/TBNZ cases.
Summary:
Add some AArch64 dag combines to optimize some simple TBZ/TBNZ cases:

 (tbz (and x, m), b) -> (tbz x, b)
 (tbz (shl x, c), b) -> (tbz x, b-c)
 (tbz (shr x, c), b) -> (tbz x, b+c)
 (tbz (xor x, -1), b) -> (tbnz x, b)

Reviewers: jmolloy, mcrosier, t.p.northover

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D15702

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 18:55:47 +00:00
Joseph Tremoulet
7f410b17a7 [WinEH] Update CoreCLR EH state numbering
Summary:
Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:
 1) Each state has a "HandlerParentState", which is the state of the next
    outer handler enclosing this state's handler (same as nearest ancestor
    per the ParentPad linkage on EH pads, but skipping over catchswitches).
 2) Each state has a "TryParentState", which:
    a) for a catchpad that's not the last handler on its catchswitch, is
       the state of the next catchpad on that catchswitch.
    b) for all other pads, is the state of the pad whose try region is the
       next outer try region enclosing this state's try region.  The "try
       regions are not present as such in the IR, but will be inferred
       based on the placement of invokes and pads which reach each other
       by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D15325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256760 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 16:16:01 +00:00
Michael Zuckerman
09816bb549 [AVX512] add PSRAD and PSRAQ Intrinsic
Differential Revision: http://reviews.llvm.org/D15851



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256754 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 13:45:45 +00:00
Michael Zuckerman
ab0aa0e9d9 [AVX512] add PSRAW Intrinsic
Differential Revision: http://reviews.llvm.org/D15850



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256751 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 12:50:36 +00:00
Michael Zuckerman
5b63de585b [AVX512] add PSRLV Intrinsic
Differential Revision: http://reviews.llvm.org/D15838



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 11:39:06 +00:00
Simon Pilgrim
5da99c7bd4 [X86][MMX] Regenerated vector insertion test.
Shows the true horror of what is going on....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256713 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-03 19:17:37 +00:00
Simon Pilgrim
0dd2ca765d [X86][SSE] Added tests for insertion of zero elements into vectors
Many of these could be much better if we just lowered them all as shuffles - especially for the 256-bit vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256708 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-03 17:33:32 +00:00
Dimitry Andric
ac6a87b06e Fix several accidental DOS line endings in source files
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings.  Convert these to native line endings.

There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.

Reviewers: joerg, aaron.ballman

Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15848

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256707 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-03 17:22:03 +00:00
Simon Pilgrim
c88a575ea5 [X86][SSE41] Added test cases for improving insertps shuffles
As mentioned on D14261, an upcoming patch will improve combines of insertps instructions. 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256706 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-03 17:14:15 +00:00
Simon Pilgrim
85b6d5e1dd [X86][SSE] Added v4f32 shuffle with zero tests
This is mainly test cases for improvements to insertps matching, but pre-SSE41 shuffles could be improved as well

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256705 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-03 17:02:56 +00:00
Joseph Tremoulet
d5ab13d966 [WinEH] Update catchrets with cloned successors
Summary:
Add a pass to update catchrets when their successors get cloned; the
existing pass doesn't catch these because it walks the funclet whose
blocks are being cloned but the catchret is in a child funclet.

Also update the test for removing incoming PHI values; when the
predecessor is a catchret, the relevant color is the catchret's parentPad,
not its block's color.


Reviewers: andrew.w.kaylor, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15840

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256689 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-02 15:22:36 +00:00
David Majnemer
19d1ebc725 [X86] Add intrinsics for reading and writing to the flags register
LLVM's targets need to know if stack pointer adjustments occur after the
prologue.  This is needed to correctly determine if the red-zone is
appropriate to use or if a frame pointer is required.

Normally, LLVM can figure this out very precisely by reasoning about the
contents of the MachineFunction.  There is an interesting corner case:
inline assembly.

The vast majority of inline assembly which will perform a push or pop is
done so to pair up with pushf or popf as appropriate.  Unfortunately,
this inline assembly doesn't mark the stack pointer as clobbered
because, well, it isn't.  The stack pointer is decremented and then
immediately incremented.  Because of this, LLVM was changed in r256456
to conservatively assume that inline assembly contain a sequence of
stack operations.  This is unfortunate because the vast majority of
inline assembly will not end up manipulating the stack pointer in any
way at all.

Instead, let's provide a more principled solution: an intrinsic.
FWIW, other compilers (MSVC and GCC among them) also provide this
functionality as an intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256685 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-01 06:50:01 +00:00