Commit Graph

29883 Commits

Author SHA1 Message Date
Bill Schmidt
fa840e7dfb Correct faulty merge of r214923 due to echristo's subversion changes in trunk
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 21:15:38 +00:00
Bill Schmidt
e342c688a1 Merging r214923:
------------------------------------------------------------------------
r214923 | wschmidt | 2014-08-05 15:47:25 -0500 (Tue, 05 Aug 2014) | 12 lines

[PowerPC] Swap arguments and adjust shift count for vsldoi on little endian

Commits r213915 and r214718 fix recognition of shuffle masks for vmrg*
and vpku*um instructions for a little-endian target, by swapping the
input arguments.  The vsldoi instruction requires similar treatment,
and also needs its shift count adjusted for little endian.

Reviewed by Ulrich Weigand.

This is a bug fix candidate for release 3.5 (and hopefully the last of
those for PowerPC).

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 20:59:06 +00:00
Tom Stellard
97fd0f6a14 Merging r214865:
------------------------------------------------------------------------
r214865 | thomas.stellard | 2014-08-05 10:40:52 -0400 (Tue, 05 Aug 2014) | 5 lines

R600/SI: Avoid generating REGISTER_LOAD instructions.

SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214895 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:38:25 +00:00
Tom Stellard
cd4bff761f Merging r214463:
------------------------------------------------------------------------
r214463 | thomas.stellard | 2014-07-31 20:32:28 -0400 (Thu, 31 Jul 2014) | 7 lines

R600/SI: Fix incorrect commute operation in shrink instructions pass

We were commuting the instruction by still shrinking it using the
original opcode.

NOTE: This is a candidate for the 3.5 branch.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214894 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:38:23 +00:00
Bill Wendling
169c2e7a89 Merging r213799:
------------------------------------------------------------------------
r213799 | grosbach | 2014-07-23 13:41:38 -0700 (Wed, 23 Jul 2014) | 5 lines

X86: restrict combine to when type sizes are safe.

The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214841 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:20:22 +00:00
Bill Schmidt
81e74bc0ca Fix incorrectly resolved merge conflict
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214822 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:47:21 +00:00
Bill Schmidt
710e7192ab Merging r214800:
------------------------------------------------------------------------
r214800 | wschmidt | 2014-08-04 18:21:01 -0500 (Mon, 04 Aug 2014) | 13 lines

[PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi

My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for Clang correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214821 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:44:59 +00:00
Bill Schmidt
b777868f63 Merging r214718:
------------------------------------------------------------------------
r214718 | uweigand | 2014-08-04 08:53:40 -0500 (Mon, 04 Aug 2014) | 12 lines

[PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian

In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl*
by swapping the input arguments.  As it turns out, the exact same fix
is also required for the vpkuhum/vpkuwum patterns.

This fixes another regression in llvmpipe when vector support is
enabled.

Reviewed by Bill Schmidt.


------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214819 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:42:09 +00:00
Bill Schmidt
c5ca284e85 Merging r214716:
------------------------------------------------------------------------
r214716 | uweigand | 2014-08-04 08:27:12 -0500 (Mon, 04 Aug 2014) | 9 lines

[PowerPC] MULHU/MULHS are not legal for vector types

I ran into some test failures where common code changed vector division
by constant into a multiply-high operation (MULHU).  But these are not
implemented by the back-end, so we failed to recognize the insn.

Fixed by marking MULHU/MULHS as Expand for vector types.


------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214818 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:41:27 +00:00
Bill Schmidt
39f807fc9f Merging r214714:
------------------------------------------------------------------------
r214714 | uweigand | 2014-08-04 08:13:57 -0500 (Mon, 04 Aug 2014) | 19 lines

[PowerPC] Fix and improve vector comparisons

This patch refactors code generation of vector comparisons.

This fixes a wrong code-gen bug for ISD::SETGE for floating-point types,
and improves generated code for vector comparisons in general.

Specifically, the patch moves all logic deciding how to implement vector
comparisons into getVCmpInst, which gets two extra boolean outputs
indicating to its caller whether its needs to swap the input operands
and/or negate the result of the comparison.  Apart from implementing
these two modifications as directed by getVCmpInst, there is no need
to ever implement vector comparisons in any other manner; in particular,
there is never a need to perform two separate comparisons (e.g. one for
equal and one for greater-than, as code used to do before this patch).

Reviewed by Bill Schmidt.


------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214817 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:37:33 +00:00
Bill Wendling
7f154753ce Merging r213665:
------------------------------------------------------------------------
r213665 | tnorthover | 2014-07-22 08:47:09 -0700 (Tue, 22 Jul 2014) | 11 lines

X86: drop relocations on __eh_frame sections globally.

Without this, we produce non-extern relocations when targeting older OS X
versions that ld64 can't cope with in the particular context of __eh_frame
sections (who'd want generic relocation-processing anyway?).

This means that an updated linker (ld64 from Xcode 3.2.6 or later) may be
needed when targeting such platforms with a modern version of LLVM, but this is
probably the case anyway and a reasonable requirement.

PR20212, rdar://problem/17544795
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214689 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 04:29:47 +00:00
Bill Wendling
ed44a9d567 Merging r213896:
------------------------------------------------------------------------
r213896 | compnerd | 2014-07-24 15:09:06 -0700 (Thu, 24 Jul 2014) | 6 lines

Target: invert condition for Windows

The Microsoft ABI and MSVCRT are considered the canonical C runtime and ABI.
The long double routines are not part of this environment.  However, cygwin and
MinGW both provide supplementary implementations.  Change the condition to
reflect this reality.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214687 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 04:28:05 +00:00
Bill Wendling
7eef54e612 Merging r213883:
------------------------------------------------------------------------
r213883 | compnerd | 2014-07-24 10:46:36 -0700 (Thu, 24 Jul 2014) | 5 lines

X86: correct library call setup for Windows itanium

This target is identical to the Windows MSVC (and follows Microsoft ABI for C).
Correct the library call setup for this target.  The same set of library calls
are missing on this environment.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214686 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 04:27:37 +00:00
Bill Wendling
0275b43b19 Merging r213899:
------------------------------------------------------------------------
r213899 | joerg | 2014-07-24 15:20:10 -0700 (Thu, 24 Jul 2014) | 2 lines

Don't use 128bit functions on PPC32.

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214685 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 04:25:53 +00:00
Justin Holewinski
031d076156 Merging r213793:
------------------------------------------------------------------------
r213793 | jholewinski | 2014-07-23 16:23:47 -0400 (Wed, 23 Jul 2014) | 4 lines

[NVPTX] Silence a GCC warning found by the buildbots

The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 14:53:00 +00:00
Justin Holewinski
100e892cf5 Merging r213773:
------------------------------------------------------------------------
r213773 | jholewinski | 2014-07-23 13:40:45 -0400 (Wed, 23 Jul 2014) | 5 lines

[NVPTX] Make sure we do not generate MULWIDE ISD nodes when optimizations are disabled

With optimizations disabled, we disable the isel patterns for mul.wide; but we
were still generating MULWIDE ISD nodes.  Now, we only try to generate MULWIDE
ISD nodes in DAGCombine if the optimization level is not zero.
------------------------------------------------------------------------



git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214309 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 14:49:09 +00:00
Daniel Sanders
29750a28c3 Merging r214180:
------------------------------------------------------------------------
r214180 | sstankovic | 2014-07-29 15:39:24 +0100 (Tue, 29 Jul 2014) | 5 lines

[mips] Don't use odd-numbered single precision registers for fastcc calling
convention if -mno-odd-spreg is used.

Differential Revision: http://reviews.llvm.org/D4682

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@214304 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 12:39:37 +00:00
Bill Wendling
15a35f9e32 Merging r213915:
------------------------------------------------------------------------
r213915 | wschmidt | 2014-07-24 18:55:55 -0700 (Thu, 24 Jul 2014) | 21 lines

[PATCH][PPC64LE] Correct little-endian usage of vmrgh* and vmrgl*.

Because the PowerPC vmrgh* and vmrgl* instructions have a built-in
big-endian bias, it is necessary to swap their inputs in little-endian
mode when using them to implement a vector shuffle.  This was
previously missed in the vector LE implementation.

There was already logic to distinguish between unary and "normal"
vmrg* vector shuffles, so this patch extends that logic to use a third
option:  "swapped" vmrg* vector shuffles that are used for little
endian in place of the "normal" ones.

I've updated the vec-shuffle-le.ll test to check for the expected
register ordering on the generated instructions.

This bug was discovered when testing the LE and ELFv2 patches for
safety if they were backported to 3.4.  A different vectorization
decision was made in 3.4 than on mainline trunk, and that exposed the
problem.  I've verified this fix takes care of that issue.


------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@213961 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 17:47:30 +00:00
Filipe Cabecinhas
0fd6bdb766 Merge r213826
------------------------------------------------------------------------
r213826 | filcab | 2014-07-23 18:28:21 -0700 (Wed, 23 Jul 2014) | 7 lines

Fixed PR20411 - bug in getINSERTPS()

When we had a vector_shuffle where we had an input from each vector, we
could miscompile it because we were assuming the input from V2 wouldn't
be moved from where it was on the vector.

Added a test case.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@213911 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 00:00:11 +00:00
Daniel Sanders
dfd65928bd Merging r213847:
------------------------------------------------------------------------
r213847 | dsanders | 2014-07-24 10:47:14 +0100 (Thu, 24 Jul 2014) | 8 lines

[mips] Fix ll and sc instructions

Summary: The ll and sc instructions for r6 and non-r6 are misplaced. This patch fixes that.

Patch by Jyun-Yan You

Differential Revision: http://reviews.llvm.org/D4578

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@213848 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 09:48:54 +00:00
Daniel Sanders
6c00f26957 Merging r213653:
------------------------------------------------------------------------
r213653 | sstankovic | 2014-07-22 14:36:02 +0100 (Tue, 22 Jul 2014) | 7 lines

[mips] Fix two patterns that select i32's (for MIPS32r6) / i64's (for MIPS64r6)
from setne comparison with an i32.

The patterns that are fixed:
  * (select (i32 (setne i32, immZExt16)), i32, i32) (for MIPS32r6)
  * (select (i32 (setne i32, immZExt16)), i64, i64) (for MIPS64r6)

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@213746 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:45:33 +00:00
Saleem Abdulrasool
7ed655da8c R600: silence GCC warning
GCC believes it may be possible to not return a value from the switch:
  lib/Target/R600/SIRegisterInfo.cpp:187:1: warning: control reaches end of non-void function [-Wreturn-type]

Add an unreachable label to indicate that this is not possible and still permit
switch coverage checking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213572 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:52:00 +00:00
Tom Stellard
163d8ce61f R600/SI: Refactor VOP3 instruction definitions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213571 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:44:29 +00:00
Tom Stellard
3ee2c33655 R600/SI: Separate encoding and operand definitions into their own classes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:44:28 +00:00
Tom Stellard
d7858afe79 R600/SI: Initailize encoding fields of unused VOP3 modifiers to 0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:12:40 +00:00
Tom Stellard
0794af86a1 R600/SI: Initialize unused VOP3 sources to 0 instead of SIOperand.ZERO
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213563 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 17:12:37 +00:00
Tom Stellard
9787e8c76b R600/SI: Add instruction shrinking pass
This pass converts 64-bit instructions to 32-bit when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213561 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:55:33 +00:00
Tom Stellard
df99a7f5dc R600/SI: VOPC instructions explicitly define VCC
Therefore we don't need to add it to the implict defs list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213558 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 16:27:24 +00:00
Tom Stellard
05388f25d7 R600/SI: Clean up some of the unused REGISTER_{LOAD,STORE} code
There are a few more cleanups to do, but I ran into some problems
with ext loads and trunc stores, when I tried to change some of the
vector loads and stores from custom to legal, so I wasn't able to
get rid of everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213552 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:45:06 +00:00
Tom Stellard
3280804237 R600/SI: Use scratch memory for large private arrays
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213551 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:45:01 +00:00
Tom Stellard
c912b101d2 R600/SI: Specify wavefront size for SI and CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213550 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:44:58 +00:00
Tom Stellard
59b8363f8a R600/SI: Remove vaddr operand from BUFFER_LOAD_*_OFFSET instructions
This operand is never used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213549 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:44:55 +00:00
Daniel Sanders
2479def756 [mips] Do not emit '.module fp=...' unless we really need to.
We now emit this value when we need to contradict the default value. This
restores support for binutils 2.24.

When a suitable binutils has been released we can resume unconditionally
emitting .module directives. This is preferable to omitting the .module
directives since the .module directives protect against, for example,
accidentally assembling FP32 code with -mfp64 and producing an unusuable object.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213548 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 15:25:24 +00:00
Robert Khasanov
aac33cfc08 [SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features.
Enabling HasAVX512{DQ,BW,VL} predicates.
Adding VK2, VK4, VK32, VK64 masked register classes.
Adding new types (v64i8, v32i16) to VR512.
Extending calling conventions for new types (v64i8, v32i16)

Patch by Zinovy Nis <zinovy.y.nis@intel.com>
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:54:21 +00:00
Tom Stellard
b664d47cb0 R600/SI: Store constant initializer data in constant memory
This implements a solution for constant initializers suggested
by Vadim Girlin, where we store the data after the shader code
and then use the S_GETPC instruction to compute its address.

This saves use the trouble of creating a new buffer for constant data
and then having to pass the pointer to the kernel via user SGPRs or the
input buffer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:14 +00:00
Tom Stellard
b97240dba8 R600/SI: Add isCFDepth0 Predicate to SALU addc pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213529 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:12 +00:00
Tom Stellard
54a2540fee R600/SI: Use VALU for i1 XOR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:10 +00:00
Tom Stellard
ad9769a118 R600/SI: Use a custom encoding method for simm16 in SOPP branch instructions
This allows us to explicitly define the type of fixup that is needed,
so we can distinguish this from future fixup types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:08 +00:00
Tom Stellard
efb733cbba R600/SI: Rename SOPP operands to match the encoding fields
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213526 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:01:05 +00:00
Daniel Sanders
6816d66d99 [mips] Add MipsOptionRecord abstraction and use it to implement .reginfo/.MIPS.options
This abstraction allows us to support the various records that can be placed in
the .MIPS.options section in the future. We currently use it to record register
usage information (the ODK_REGINFO record in our ELF64 spec).

Each .MIPS.options record should subclass MipsOptionRecord and provide an
implementation of EmitMipsOptionRecord.

Patch by Matheus Almeida and Toma Tabacu



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213522 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 13:30:55 +00:00
Daniel Sanders
34e658840e [mips] Try to fix the test/ExecutionEngine tests on a MIPS host.
Fix a dangerous default case that caused MipsCodeEmitter to discard pseudo
instructions it didn't recognize. It will now call llvm_unreachable() for
unrecognized pseudo's and explicitly handles PseudoReturn, PseudoReturn64,
PseudoIndirectBranch, PseudoIndirectBranch64, CFI_INSTRUCTION, IMPLICIT_DEF,
and KILL.

There may be other pseudos that need handling but this was enough for the
ExecutionEngine tests to pass on my test system.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213513 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 12:25:34 +00:00
Daniel Sanders
8ecdc6d34f [mips] Do not emit '.module [no]oddspreg' unless we really need to.
We now emit this directive when we need to contradict the default value (e.g.
-mno-odd-spreg is given) or an option changed the default value (e.g. -mfpxx
is given).

This restores support for the currently available head of binutils. However,
at this point binutils 2.24 is still not sufficient since it does not support
'.module fp=...'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 10:45:47 +00:00
Tim Northover
f8d927f22b CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext
This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
path is now uniform, regardless of the input IR:

  fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
  fpext -> FP16_TO_FP (if f16 illegal) -> libcall

Each target should be able to select the version that best matches its
operations and not be required to duplicate patterns for both fptrunc
and FP_TO_FP16 (for example).

As a result we can remove some redundant AArch64 patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213507 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 09:13:56 +00:00
Andrea Di Biagio
3d1975d44b [DAG] Refactor some logic. No functional change.
This patch removes function 'CommuteVectorShuffle' from X86ISelLowering.cpp
and moves its logic into SelectionDAG.cpp as method 'getCommutedVectorShuffles'.
This refactoring is in preperation of an upcoming change to the DAGCombiner.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213503 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 07:28:51 +00:00
Ulrich Weigand
d4542a8cdc [PowerPC] ELFv2 aggregate passing support
This patch adds infrastructure support for passing array types
directly.  These can be used by the front-end to pass aggregate
types (coerced to an appropriate array type).  The details of the
array type being used inform the back-end about ABI-relevant
properties.  Specifically, the array element type encodes:
- whether the parameter should be passed in FPRs, VRs, or just
  GPRs/stack slots  (for float / vector / integer element types,
  respectively)
- what the alignment requirements of the parameter are when passed in
  GPRs/stack slots  (8 for float / 16 for vector / the element type
  size for integer element types) -- this corresponds to the
  "byval align" field

Using the infrastructure provided by this patch, a companion patch
to clang will enable two features:
- In the ELFv2 ABI, pass (and return) "homogeneous" floating-point
  or vector aggregates in FPRs and VRs (this is similar to the ARM
  homogeneous aggregate ABI)
- As an optimization for both ELFv1 and ELFv2 ABIs, pass aggregates
  that fit fully in registers without using the "byval" mechanism

The patch uses the functionArgumentNeedsConsecutiveRegisters callback
to encode that special treatment is required for all directly-passed
array types.  The isInConsecutiveRegs / isInConsecutiveRegsLast bits set
as a results are then used to implement the required size and alignment
rules in CalculateStackSlotSize / CalculateStackSlotAlignment etc.

As a related change, the ABI routines have to be modified to support
passing floating-point types in GPRs.  This is necessary because with
homogeneous aggregates of 4-byte float type we can now run out of FPRs
*before* we run out of the 64-byte argument save area that is shadowed
by GPRs.  Any extra floating-point arguments that no longer fit in FPRs
must now be passed in GPRs until we run out of those too.

Note that there was already code to pass floating-point arguments in
GPRs used with vararg parameters, which was done by writing the argument
out to the argument save area first and then reloading into GPRs.  The
patch re-implements this, however, in favor of code packing float arguments
directly via extension/truncation, BITCAST, and BUILD_PAIR operations.

This is required to support the ELFv2 ABI, since we cannot unconditionally
write to the argument save area (which the caller might not have allocated).
The change does, however, affect ELFv1 varags routines too; but even here
the overall effect should be advantageous: Instead of loading the argument
into the FPR, then storing the argument to the stack slot, and finally
reloading the argument from the stack slot into a GPR, the new code now
just loads the argument into the FPR, and subsequently loads the argument
into the GPR (via BITCAST).  That BITCAST might imply a save/reload from
a stack temporary (in which case we're no worse than before); but it
might be implemented more efficiently in some cases.

The final part of the patch enables up to 8 FPRs and VRs for argument
return in PPCCallingConv.td; this is required to support returning
ELFv2 homogeneous aggregates.  (Note that this doesn't affect other ABIs
since LLVM wil only look for which register to use if the parameter is
marked as "direct" return anyway.)

Reviewed by Hal Finkel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213493 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 00:13:26 +00:00
Ulrich Weigand
970c019d02 [PowerPC] ELFv2 explicit CFI for CR fields
This is a minor improvement in the ELFv2 ABI.   In ELFv1, DWARF CFI
would represent a saved CR word (holding CR fields CR2, CR3, and CR4)
using just a single CFI record refering to CR2.   In ELFv2 instead,
each of the CR fields is represented by its own CFI record.  The
advantage is that the compiler can now chose to save just a single
(or two) CR fields instead of all of them, if those are the only ones
that actually need saving.  That can lead to more efficient code using
mf(o)crf instead of the (slow) mfcr instruction.

Note that this patch does not (yet) implement this more efficient
code generation, but it does implement the part that is required to
be ABI compliant: creating multiple CFI records if multiple CR fields
are saved.

Reviewed by Hal Finkel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 00:03:18 +00:00
Ulrich Weigand
7fc5011e8d [PowerPC] ELFv2 stack space reduction
The ELFv2 ABI reduces the amount of stack required to implement an
ABI-compliant function call in two ways:
* the "linkage area" is reduced from 48 bytes to 32 bytes by
  eliminating two unused doublewords
* the 64-byte "parameter save area" is now optional and need not be
  present in certain cases (it remains mandatory in functions with
  variable arguments, and functions that have any parameter that is
  passed on the stack)

The following patch implements this required changes:
- reducing the linkage area, and associated relocation of the TOC save
  slot, in getLinkageSize / getTOCSaveOffset (this requires updating all
  callers of these routines to pass in the isELFv2ABI flag).
- (partially) handling the case where the parameter save are is optional

This latter part requires some extra explanation:  Currently, we still
always allocate the parameter save area when *calling* a function.
That is certainly always compliant with the ABI, but may cause code to
allocate stack unnecessarily.  This can be addressed by a follow-on
optimization patch.

On the *callee* side, in LowerFormalArguments, we *must* track
correctly whether the ABI guarantees that the caller has allocated
the parameter save area for our use, and the patch does so. However,
there is one complication: the code that handles incoming "byval"
arguments will currently *always* write to the parameter save area,
because it has to force incoming register arguments to the stack since
it must return an *address* to implement the byval semantics.

To fix this, the patch changes the LowerFormalArguments code to write
arguments to a freshly allocated stack slot on the function's own stack
frame instead of the argument save area in those cases where that area
is not present.

Reviewed by Hal Finkel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213490 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:43:15 +00:00
Ulrich Weigand
edfd4f18bc [PowerPC] ELFv2 function call changes
This patch builds upon the two preceding MC changes to implement the
basic ELFv2 function call convention.  In the ELFv1 ABI, a "function
descriptor" was associated with every function, pointing to both the
entry address and the related TOC base (and a static chain pointer
for nested functions).  Function pointers would actually refer to that
descriptor, and the indirect call sequence needed to load up both entry
address and TOC base.

In the ELFv2 ABI, there are no more function descriptors, and function
pointers simply refer to the (global) entry point of the function code.
Indirect function calls simply branch to that address, after loading it
up into r12 (as required by the ABI rules for a global entry point).
Direct function calls continue to just do a "bl" to the target symbol;
this will be resolved by the linker to the local entry point of the
target function if it is local, and to a PLT stub if it is global.
That PLT stub would then load the (global) entry point address of the
final target into r12 and branch to it.  Note that when performing a
local function call, r2 must be set up to point to the current TOC
base: if the target ends up local, the ABI requires that its local
entry point is called with r2 set up; if the target ends up global,
the PLT stub requires that r2 is set up.

This patch implements all LLVM changes to implement that scheme:
- No longer create a function descriptor when emitting a function
  definition (in EmitFunctionEntryLabel)
- Emit two entry points *if* the function needs the TOC base (r2)
  anywhere (this is done EmitFunctionBodyStart; note that this cannot
  be done in EmitFunctionBodyStart because the global entry point
  prologue code must be *part* of the function as covered by debug info).
- In order to make use tracking of r2 (as needed above) work correctly,
  mark direct function calls as implicitly using r2.
- Implement the ELFv2 indirect function call sequence (no function
  descriptors; load target address into r12).
- When creating an ELFv2 object file, emit the .abiversion 2 directive
  to tell the linker to create the appropriate version of PLT stubs.  

Reviewed by Hal Finkel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213489 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:31:44 +00:00
Ulrich Weigand
76fcace66e [MC] Pass MCSymbolData to needsRelocateWithSymbol
As discussed in a previous checking to support the .localentry
directive on PowerPC, we need to inspect the actual target symbol
in needsRelocateWithSymbol to make the appropriate decision based
on that symbol's st_other bits.

Currently, needsRelocateWithSymbol does not get the target symbol.
However, it is directly available to its sole caller.  This patch
therefore simply extends the needsRelocateWithSymbol by a new
parameter "const MCSymbolData &SD", passes in the target symbol,
and updates all derived implementations.

In particular, in the PowerPC implementation, this patch removes
the FIXME added by the previous checkin.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213487 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:15:06 +00:00
Ulrich Weigand
5ee5fc4c47 [PowerPC] ELFv2 MC support for .localentry directive
A second binutils feature needed to support ELFv2 is the .localentry
directive.  In the ELFv2 ABI, functions may have two entry points:
one for calling the routine locally via "bl", and one for calling the
function via function pointer (either at the source level, or implicitly
via a PLT stub for global calls).  The two entry points share a single
ELF symbol, where the ELF symbol address identifies the global entry
point address, while the local entry point is found by adding a delta
offset to the symbol address.  That offset is encoded into three
platform-specific bits of the ELF symbol st_other field.

The .localentry directive instructs the assembler to set those fields
to encode a particular offset.  This is typically used by a function
prologue sequence like this:

func:
        addis r2, r12, (.TOC.-func)@ha
        addi r2, r2, (.TOC.-func)@l
        .localentry func, .-func

Note that according to the ABI, when calling the global entry point,
r12 must be set to point the global entry point address itself; while
when calling the local entry point, r2 must be set to point to the TOC
base.  The two instructions between the global and local entry point in
the above example translate the first requirement into the second.

This patch implements support in the PowerPC MC streamers to emit the
.localentry directive (both into assembler and ELF object output), as
well as support in the assembler parser to parse that directive.

In addition, there is another change required in MC fixup/relocation
handling to properly deal with relocations targeting function symbols
with two entry points: When the target function is known local, the MC
layer would immediately handle the fixup by inserting the target
address -- this is wrong, since the call may need to go to the local
entry point instead.  The GNU assembler handles this case by *not*
directly resolving fixups targeting functions with two entry points,
but always emits the relocation and relies on the linker to handle
this case correctly.  This patch changes LLVM MC to do the same (this
is done via the processFixupValue routine).

Similarly, there are cases where the assembler would normally emit a
relocation, but "simplify" it to a relocation targeting a *section*
instead of the actual symbol.  For the same reason as above, this
may be wrong when the target symbol has two entry points.  The GNU
assembler again handles this case by not performing this simplification
in that case, but leaving the relocation targeting the full symbol,
which is then resolved by the linker.  This patch changes LLVM MC
to do the same (via the needsRelocateWithSymbol routine).
NOTE: The method used in this patch is overly pessimistic, since the
needsRelocateWithSymbol routine currently does not have access to the
actual target symbol, and thus must always assume that it might have
two entry points.  This will be improved upon by a follow-on patch
that modifies common code to pass the target symbol when calling
needsRelocateWithSymbol.

Reviewed by Hal Finkel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213485 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:06:03 +00:00