15319 Commits

Author SHA1 Message Date
JF Bastien
c2a3785141 WebAssembly: remove 'external' from test
Summary: Linker testing was sad at seeing an unresolved external symbol. For now don't do that: it's valid but we're not playing with multi-file linking yet, and the LLVM tests are used as hacky sanity tests for single-file linking (the GCC torture tests are much better for this purpose). Another solution would be to use '.extern' to make the intent explicit (don't simple-file link this, there's an unresolved symbol), some assemblers use '.extern' while others ignore it, so we wouldn't really be inventing anything new.

Reviewers: sunfish, kripken

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D15753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256353 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 23:56:13 +00:00
Philip Reames
da801219ba [Statepoints] Use Indirect operands for spill slots
Teach the statepoint lowering code to emit Indirect stackmap entries for spill inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632 which extends the lowering to handle vector-of-pointers since only Indirect references can encode a variable sized slot.

To implement this, I introduced a new flag on the StackObject class used to maintian information about stack slots. I original considered (and prototyped in http://reviews.llvm.org/D15632), the idea of using the existing isSpillSlot flag, but end up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us - in the future - to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from. (deopt, gc, regalloc).

Differential Revision: http://reviews.llvm.org/D15759



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256352 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 23:44:28 +00:00
Simon Pilgrim
023935c3d9 [X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151.

As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops.

Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well.

Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).

Differential Revision: http://reviews.llvm.org/D15477

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256332 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 13:10:07 +00:00
Igor Breger
d34ae248c0 AVX512BW: Enable packed word shift for 512bit vector. Enable lowering scalar immidiate shift v64i8 .Fix predicate for AVX1/2 shifts.
Differential Revision: http://reviews.llvm.org/D15713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 08:06:50 +00:00
David Majnemer
0f16f3c826 [WinEH] Don't visit the same catchswitch twice
We visited the same catchswitch twice because it was both the child of
another funclet and the predecessor of a cleanuppad.

Instead, change the numbering algorithm to only recurse if the unwind
destination of the inner funclet agrees with the unwind destination of
the catchswitch.

This fixes PR25926.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256317 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 03:59:04 +00:00
Changpeng Fang
89e60598f6 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 20:55:23 +00:00
Rafael Espindola
a00544a653 Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256275 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:46:44 +00:00
Changpeng Fang
808f9643e6 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256273 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:32:28 +00:00
Jun Bum Lim
3ff9f0996f [AArch64] Promote loads from stored
This is a recommit of r256004 which was reverted in r256160. The issue was the
incorrect promotion for half and byte loads transformed into mov instructions.
This fix will replace half and byte type loads only with bit field extracts.

Original commit message:

This change promotes load instructions which directly read from stored by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example :
  STRWui %W1, %X0, 1
  %W0 = LDRHHui %X0, 3
becomes
  STRWui %W1, %X0, 1
  %W0 = UBFMWri %W1, 16, 31

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 16:36:16 +00:00
Asaf Badouh
4025d5b8e1 [X86][AVX512] Add rcp14 and rsqrt14 intrinsics
Differential Revision: http://reviews.llvm.org/D15414



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256237 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 11:40:04 +00:00
Keno Fischer
f580a6229f [ASMPrinter] Fix missing handling of DW_OP_bit_piece
In r256077, I added printing for DIExpressions in DEBUG_VALUE comments,
but neglected to handle DW_OP_bit_piece operands. Thanks to
Mikael Holmen and Joerg Sonnenberger for spotting this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256236 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 07:14:50 +00:00
David Majnemer
f31d06a13f [MC] Don't use the architecture to govern which object file format to use
InitMCObjectFileInfo was trying to override the triple in awkward ways.
For example, a triple specifying COFF but not Windows was forced as ELF.
This makes it easy for internal invariants to get violated, such as
those which triggered PR25912.

This fixes PR25912.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256226 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 01:39:04 +00:00
Jun Bum Lim
e7f0062250 Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
This is recommit of r256028 with minor fixes in unittests:
  CodeGen/Mips/eh.ll
  CodeGen/Mips/insn-zero-size-bb.ll

Original commit message:

When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256202 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 22:00:51 +00:00
Cong Hou
0d04e87535 [X86][SSE] Transform truncations between vectors of integers into X86ISD::PACKUS/PACKSS operations during DAG combine.
This patch transforms truncation between vectors of integers into
X86ISD::PACKUS/PACKSS operations during DAG combine. We don't do it in
lowering phase because after type legalization, the original truncation
will be turned into a BUILD_VECTOR with each element that is extracted
from a vector and then truncated, and from them it is difficult to do
this optimization. This greatly improves the performance of truncations
on some specific types.

Cost table is updated accordingly.


Differential revision: http://reviews.llvm.org/D14588




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256194 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 20:42:43 +00:00
Adrian Prantl
6932e6dcc1 Convert the CodeGen/ARM/sched-it-debug-nodes.ll testcase from IR -> MIR.
NFC
PR24563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256187 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 19:44:42 +00:00
Adrian Prantl
7deb7ebf88 Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when merging
instructions.

As noted in PR24563.
rdar://problem/23963293

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 19:25:03 +00:00
Matthew Simpson
0ce5d69ee3 [AArch64] Add additional extract-extend patterns for smov
This patch adds to the target description two additional patterns for matching
extract-extend operations to SMOV. The patterns catch the v16i8-to-i64 and
v8i16-to-i64 cases. The existing patterns miss these cases because the
extracted elements must first be legalized to i32, resulting in any_extend
nodes.

This was originally implemented as a DAG combine (r255895), but was reverted
due to failing out-of-tree tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256176 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 18:31:25 +00:00
Jun Bum Lim
35733957df Revert "[AArch64] Promote loads from stores"
This reverts commit r256004 due to a failure in cortex-a53.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256160 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 15:36:49 +00:00
Chad Rosier
2835413343 [AArch64] Enable PostRAScheduler for AArch64 generic build.
Disable post-ra scheduler for perturbed tests to appease the bots and to
preserve the history of the tests.

http://reviews.llvm.org/D15652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256158 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 14:43:45 +00:00
Igor Breger
3bb8dc4979 AVX512BW: Enable AND/OR/XOR vector byte/word paked operation by promoting to qword that natively suppored.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256157 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 14:40:36 +00:00
Amjad Aboud
9889174ead Implemented Support of IA interrupt and exception handlers:
http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html

Differential Revision: http://reviews.llvm.org/D15567

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256155 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-21 14:07:14 +00:00
Craig Topper
a1d9ba7078 [X86] Prevent constant hoisting for a couple compare immediates that the selection DAG knows how to optimize into a shift.
This allows "icmp ugt %a, 4294967295" and "icmp uge %a, 4294967296" to be optimized into right shifts by 32 which can fold the immediate into the shift instruction. These patterns show up with some regularity in real code.

Unfortunately, since getImmCost can't see the icmp predicate we can't be tell if we're only catching these specific cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-20 18:41:54 +00:00
Weiming Zhao
116cc72ed4 Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions for Thumb2
Summary:
r250697 fixed the mapping for ARM mode. We have to do the same for Thumb2 otherwise the same llvm.arm.ssat() will generate different saturating amount for ARM and Thumb.

r250697: http://reviews.llvm.org/rL250697

Reviewers: rmaprath

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D15653

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256115 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-20 06:41:44 +00:00
JF Bastien
25b4ccf2a6 WebAssembly: add vtable test
The test will mainly be useful to check that the .s file assembles and relocates properly because vtables reference functions in their data section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256102 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 18:55:18 +00:00
Keno Fischer
5c2f1020cc Hopefully fix debug-info-blocks.ll test on win32 bot
llc_dwarf adds an mtriple, which forces this to use COFF, causing
the test to fail. Hopefully using regular llc without the triple
will work fine everywhere

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256084 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 03:32:23 +00:00
Keno Fischer
bae9026417 Clean up the processing of dbg.value in various places
Summary:
First up is instcombine, where in the dbg.declare -> dbg.value conversion,
the llvm.dbg.value needs to be called on the actual loaded value, rather
than the address (since the whole point of this transformation is to be
able to get rid of the alloca). Further, now that that's cleaned up, we
can remove a hack in the backend, that would add an implicit OP_deref if
the argument to dbg.value was an alloca. This stems from before the
existence of DIExpression and is no longer necessary since the deref can
be expressed explicitly.

Now, in order to make sure that the tests pass with this change, we need to
correct the printing of DEBUG_VALUE comments to take into account the
expression, which wasn't taken into account before.

Unfortunately, for both these changes, there were a number of incorrect
test cases (mostly the wrong number of DW_OP_derefs, but also a couple
where the test itself was broken more badly). aprantl and I have gone
through and adjusted these test case in order to make them pass with
these fixes and in some cases to make sure they're actually testing
what they are meant to test.

Reviewers: aprantl

Subscribers: dsanders

Differential Revision: http://reviews.llvm.org/D14186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256077 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 02:02:44 +00:00
Matt Arsenault
7aed0ccd46 AMDGPU: Switch barrier intrinsics to using convergent
noduplicate prevents unrolling of small loops that happen to have
barriers in them. If a loop has a barrier in it, it is OK to duplicate
it for the unroll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 01:46:41 +00:00
Matt Arsenault
4b9d868cc7 Fix broken type legalization of min/max
This was using an anyext when promoting the type
when zext/sext is required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 01:39:48 +00:00
Nicolai Haehnle
710bb5a598 AMDGPU: fix overlapping copies in copyPhysReg
Summary:
When copying aggregate registers within the same register class, there may
be an overlap between source and destination that forces us to do the copy
backwards.

Do the simplest possible thing that guarantees the correct order of moves
when there are overlaps, and does whatever when there is no overlap. (The
last part forces some trivial adjustments to test cases.)

Together with r255906, this fixes a VM fault in Unreal Elemental Demo.

While at it, change the generation of kill and def flags to something that
looks more reasonable. This method is used very late during compilation, so
it probably doesn't matter in practice, and to be honest, I don't know if
this change is actually correct because the semantics in connection with
aggregate registers vs. sub-registers are not clear to me.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93264

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256072 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 01:16:06 +00:00
Rafael Espindola
a4c2a53b4b Revert "Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst"
This reverts commit r256028.

It broke:

    LLVM :: CodeGen/Mips/eh.ll
    LLVM :: CodeGen/Mips/insn-zero-size-bb.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256032 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-18 21:23:32 +00:00
Jun Bum Lim
ec63ff81dd Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256028 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-18 20:53:47 +00:00
Krzysztof Parzyszek
311fb07b3b [Hexagon] Add PIC support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256025 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-18 20:19:30 +00:00
Jun Bum Lim
cb2bad780b [AArch64] Promote loads from stores
This change promotes load instructions which directly read from stores by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example :
  STRWui %W1, %X0, 1
  %W0 = LDRHHui %X0, 3
becomes
  STRWui %W1, %X0, 1
  %W0 = UBFMWri %W1, 16, 31

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-18 18:08:30 +00:00
Hans Wennborg
449ad4ca3b [X86] Use push-pop for materializing small constants under 'minsize'
Use the 3-byte (4 with REX prefix) push-pop sequence for materializing
small constants. This is smaller than using a mov (5, 6 or 7 bytes
depending on size and REX prefix), but it's likely to be slower, so
only used for 'minsize'.

This is a follow-up to r255656.

Differential Revision: http://reviews.llvm.org/D15549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255936 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 23:18:39 +00:00
Matthew Simpson
30e21e7109 Revert "[AArch64] Add DAG combine for extract extend pattern"
This reverts commit r255895. The patch breaks internal tests. Reverting until a
fix is ready.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255928 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 21:29:47 +00:00
Dan Gohman
adfb389708 [WebAssembly] Switch WebAssemblyMCAsmInfo.h from MCAsmInfo to MCAsmInfoELF.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255925 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 20:50:45 +00:00
Tom Stellard
55618d8f5c AMDGPU/SI: Reserve appropriate number of sgprs for flat scratch init.
Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15583

Patch by: Changpeng Fang

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255908 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 17:05:09 +00:00
Matthew Simpson
dbf81cc034 [AArch64] Add DAG combine for extract extend pattern
This patch adds a DAG combine for (any_extend (extract_vector_elt v, i)) ->
(extract_vector_elt v, i). The combine enables us to better match some SMOV
patterns.

Differential Revision: http://reviews.llvm.org/D15515

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255895 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 14:30:55 +00:00
Alexey Bataev
d5572b6d8d [X86] Add option for enabling LEA optimization pass, by Andrey Turetsky
Add option to enable/disable LEA optimization pass. By default the pass is disabled.
Differential Revision: http://reviews.llvm.org/D15573


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255881 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 07:34:39 +00:00
Cong Hou
51a9d17bfc Fix PR25838.
This is a quick fix to PR25838. The issue comes from the restriction that we
cannot normalize probabilities containing both known and unknown ones. A patch
that removes this restriction is under the review now:

http://reviews.llvm.org/D15548



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255867 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 01:29:08 +00:00
Dan Gohman
f2640be5df [WebAssembly] Fix legalization of shift operators on large integer types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255847 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 23:25:51 +00:00
Derek Schuff
6b90fe4207 [WebAssembly] Implement eliminateCallFramePseudo
Summary:
Implement eliminateCallFramePsuedo to handle ADJCALLSTACKUP/DOWN
pseudo-instructions. Add a test calling a vararg function which causes non-0
adjustments. This revealed an issue with RegisterCoalescer wherein it
eliminates a COPY from SP32 to a vreg but failes to update the live ranges
of EXPR_STACK, causing a machineinstr verifier failure (so this test
is commented out).

Also add a dynamic alloca test, which causes a callseq_end dag node with
a 0 (instead of undef) second argument to be generated. We currently fail to
select that, so adjust the ADJCALLSTACKUP tablegen code to handle it.

Differential Revision: http://reviews.llvm.org/D15587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255844 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 23:21:30 +00:00
Eric Christopher
2f9965d134 Fix funciton->function typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255841 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 23:10:53 +00:00
Manman Ren
3559ef2a36 CXX_FAST_TLS calling convention: performance improvement for AArch64.
The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.

We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.

Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.

We add CSRsViaCopy, it will be explicitly handled during lowering.

1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
   supports it for the given machine function and the function has only return
   exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
   virtual registers at beginning of the entry block and copies from virtual
   registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.

The target independent portion was committed as r255353.
rdar://problem/23557469

Differential Revision: http://reviews.llvm.org/D15341


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255821 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 21:04:19 +00:00
Derek Schuff
30d7ede265 [WebAssembly] Print an extra local decl when the user stack pointer is used
Differential Revision: http://reviews.llvm.org/D15546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255815 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 20:43:06 +00:00
Dan Gohman
fd7b160a14 [WebAssembly] Fix the CFG Stackifier to handle unoptimized branches
If a branch both branches to and falls through to the same block, treat it as
an explicit branch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255803 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 19:06:41 +00:00
Dan Gohman
c1fb525cc5 [WebAssembly] Use the new offset syntax for memory operands in inline asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255788 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 18:14:49 +00:00
Ulrich Weigand
328f32455c [SystemZ] Fix assertion failure in adjustSubwordCmp
When comparing a zero-extended value against a constant small enough to
be in range of the inner type, it doesn't matter whether a signed or
unsigned compare operation (for the outer type) is being used.  This is
why the code in adjustSubwordCmp had this assertion:

    assert(C.ICmpType == SystemZICMP::Any &&
           "Signedness shouldn't matter here.");

assuming the the caller had already detected that fact.  However, it
turns out that there cases, in particular with always-true or always-
false conditions that have not been eliminated when compiling at -O0,
where this is not true.

Instead of failing an assertion if C.ICmpType is not SystemZICMP::Any
here, we can simply *set* it safely to SystemZICMP::Any, however.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 18:04:06 +00:00
Tobias Edler von Koch
1a519052e2 [Hexagon] Make memcpy lowering thread-safe
This removes an unpleasant hack involving a global variable for special
lowering of certain memcpy calls. These are now lowered as intended in
EmitTargetCodeForMemcpy in the same way that other targets do it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255785 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 17:29:37 +00:00
Dan Gohman
3de3334800 [WebAssembly] Support more kinds of inline asm operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255782 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 17:15:17 +00:00