34485 Commits

Author SHA1 Message Date
Zia Ansari
5ed116b04b Implemented stack symbol table ordering/packing optimization to improve data locality and code size from SP/FP offset encoding.
Differential Revision: http://reviews.llvm.org/D15393



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260917 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 23:44:13 +00:00
Simon Pilgrim
2367de18f3 [X86][SSE2] Regenerated sse2 tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260900 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 17:57:40 +00:00
Krzysztof Parzyszek
aec17f6b38 [Hexagon] Missed testcase update in r260895
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260897 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 16:15:02 +00:00
Scott Egerton
4f17f73d87 [mips] Implemented the .hword directive.
Summary:
In order to pass the tests, this required marking R_MIPS_16 relocations
as needing to point to the symbol and not the section.

Reviewers: vkalintiris, dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260896 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 16:11:51 +00:00
Krzysztof Parzyszek
5e17ebd723 [Hexagon] Use zero-extending loads for anyext
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260895 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 16:01:01 +00:00
Silviu Baranga
23340531a1 [LV] Add support for insertelt/extractelt processing during type truncation
Summary:
While shrinking types according to the required bits, we can
encounter insert/extract element instructions. This will cause us to
reach an llvm_unreachable statement.

This change adds support for truncating insert/extract element
operations, and adds a regression test.

Reviewers: jmolloy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260893 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 15:38:17 +00:00
Simon Pilgrim
6cde049097 [X86] More thorough partial-register division checks
For when grep counts are just not enough...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260891 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 14:09:35 +00:00
Simon Pilgrim
f1de1fab94 [X86] Regenerated 64/128 bit multiply tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260890 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 14:04:05 +00:00
Simon Pilgrim
b5a2cb5b20 [X86][SSE] More thorough testing of all-ones vectors re-materialization
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260889 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 13:50:48 +00:00
Simon Pilgrim
2ad8deec9c [X86][SSE] Regenerated uint2fp special case tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260888 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 13:41:41 +00:00
NAKAMURA Takumi
a8f262c12e Make llvm/test/tools/llvm-symbolizer/pdb/pdb.test Py3-compatible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260887 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 13:19:13 +00:00
Simon Pilgrim
a6e564a058 [X86][SSE] Regenerated fast isel intrinsics tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260885 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 12:32:16 +00:00
Scott Egerton
894a6f0e19 Reverted r260879 as it caused test failures in lld.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260880 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 10:04:38 +00:00
Scott Egerton
77405b1467 [mips] Removed the SHF_ALLOC flag from the .pdr section.
Summary:
This section is used for debug information and has no need to be
in memory at runtime. With this patch, LLVM now emits the same flags as
the GNU assembler. This patch also fixes an error when compiling
the Linux kernel: there are relocations within the .pdr section
in a VDSO.

Reviewers: vkalintiris, dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D17199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260879 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 09:34:15 +00:00
Igor Breger
0dd3e9d55e AVX512: Change store size of kmask. The store sizes of v8i1, v4i1, v2i1 and i1 are changed to 16 bits.
If KMOVB is not supported (it requires AVX512DQ), only KMOVW can be used, so the store size should be 2 bytes.

Differential Revision: http://reviews.llvm.org/D17138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260878 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 08:25:28 +00:00
Simon Pilgrim
71895b613a [X86][AVX] Fixed copy+paste typo in shuffle test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260852 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-14 18:11:52 +00:00
Chandler Carruth
6f603acaaa [PM/AA] Wire BasicAA's new pass manager class up to the pass registry.
This ensures that all of the various pieces are working. The next patch
will wire up commandline-driven alias analysis chain building and allow
BasicAA to work with the AAManager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260838 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 23:46:24 +00:00
Chandler Carruth
26bf6ea447 [PM/AA] Actually wire the AAManager I built for the new pass manager
into the new pass manager and fix the latent bugs there.

This lets everything live together nicely, but it isn't really useful
yet. I never finished wiring the AA layer up for the new pass manager,
and so subsequent patches will change this to do that wiring and get AA
stuff more fully integrated into the new pass manager. Turns out this is
necessary even to get functionattrs ported over. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260836 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 23:32:00 +00:00
Simon Pilgrim
34963650b8 [X86][AVX] Lower shuffles as repeated lane shuffles then lane-crossing shuffles
This patch attempts to represent a shuffle as a repeating shuffle (recognisable by is128BitLaneRepeatedShuffleMask) with the source input(s) in their original lanes, followed by a single permutation of the 128-bit lanes to their final destinations.

On AVX2 we can additionally attempt to match using 64-bit sub-lane permutation. AVX2 can also now match a similar 'broadcasted' repeating shuffle.

This patch has several benefits:

 * Avoids prematurely matching with lowerVectorShuffleByMerging128BitLanes which can require both inputs to have their input lanes permuted before shuffling.
 * Can replace PERMPS/PERMD instructions - although these are useful for cross-lane unary shuffling, they require their shuffle mask to be pre-loaded (and increase register pressure).
 * Matching the repeating shuffle makes use of a lot of existing shuffle lowering.

There is an outstanding minor AVX1 regression (combine_unneeded_subvector1 in vector-shuffle-combining.ll): a previously 128-bit shuffle + subvector splat is converted to a subvector splat + (2 instruction) 256-bit shuffle. I intend to fix this in a follow-up patch for review.

Differential Revision: http://reviews.llvm.org/D16537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260834 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 21:54:04 +00:00
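The decomposition described in the commit above (a repeated in-lane shuffle followed by a permutation of whole 128-bit lanes) can be illustrated with a small Python model. This is only a sketch under stated assumptions: it handles a unary shuffle of 32-bit elements (4 per lane), ignores undef mask entries, and the function names `split_lane_shuffle` and `apply_split` are hypothetical, not LLVM functions.

```python
from typing import List, Optional, Tuple

LANE = 4  # assumption: 32-bit elements, so 4 elements per 128-bit lane


def split_lane_shuffle(mask: List[int]) -> Optional[Tuple[List[int], List[int]]]:
    """Try to split a unary shuffle mask into
    (repeated in-lane mask, lane permutation). Returns None if impossible."""
    if len(mask) % LANE != 0:
        return None
    num_lanes = len(mask) // LANE
    repeated: List[Optional[int]] = [None] * LANE
    lane_perm: List[int] = []
    for dst_lane in range(num_lanes):
        chunk = mask[dst_lane * LANE:(dst_lane + 1) * LANE]
        # Every element of a destination lane must come from one source lane.
        src_lanes = {m // LANE for m in chunk}
        if len(src_lanes) != 1:
            return None
        lane_perm.append(src_lanes.pop())
        # The in-lane offsets must be identical for every destination lane.
        for pos, m in enumerate(chunk):
            offset = m % LANE
            if repeated[pos] is None:
                repeated[pos] = offset
            elif repeated[pos] != offset:
                return None
    return [r for r in repeated if r is not None], lane_perm


def apply_split(values: List[int], repeated: List[int], lane_perm: List[int]) -> List[int]:
    """Apply the two-step form: in-lane shuffle first, then lane permutation."""
    num_lanes = len(values) // LANE
    step1 = [values[lane * LANE + repeated[pos]]
             for lane in range(num_lanes) for pos in range(LANE)]
    return [step1[src * LANE + pos] for src in lane_perm for pos in range(LANE)]
```

For the mask `[6, 5, 4, 7, 2, 1, 0, 3]` the sketch finds the repeated mask `[2, 1, 0, 3]` and the lane permutation `[1, 0]`; a mask that mixes source lanes inside one destination lane is rejected.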
Sanjay Patel
5fb1d32848 [x86-64] allow mfence even with -mno-sse (PR23203)
As shown in:
https://llvm.org/bugs/show_bug.cgi?id=23203
...we currently die because lowering believes that mfence is allowed without SSE2 on x86-64,
but the instruction def doesn't know that.

I don't know if allowing mfence without SSE is right, but if not, at least now it's consistently wrong. :)

Differential Revision: http://reviews.llvm.org/D17219



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260828 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 17:26:29 +00:00
Chandler Carruth
2c37b35a91 [attrs] Move the norecurse deduction to operate on the node set rather
than the SCC object, and have it scan the instruction stream directly
rather than relying on call records.

This makes the behavior of this routine consistent between libc routines
and LLVM intrinsics for libc routines. We can go and start teaching it
about those being norecurse, but we should behave the same for the
intrinsic and the libc routine rather than differently. I chatted with
James Molloy and the inconsistency doesn't seem intentional and likely
is due to intrinsic calls not being modelled in the call graph analyses.

This also fixes a bug where we would deduce norecurse on optnone
functions, when generally we try to handle optnone functions as-if they
were replaceable and thus unanalyzable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260813 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 08:47:51 +00:00
Matt Arsenault
626ceb277f AMDGPU: Prepare for reducing private element size.
Tests for the new scalarize all private access options will be
included with a future commit.

The only functional change is to make the split/scalarize behavior
for private access of > 4 element vectors consistent
with the flat/global handling. This makes the spilling worse
in the two changed tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 04:18:53 +00:00
Tom Stellard
224ee47ca1 AMDGPU/SI: Add llvm.amdgcn.mov.dpp intrinsic
This intrinsic will be used to expose dpp functionality to higher-level
languages. It will map to the dpp version of v_mov_b32.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260792 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 02:09:49 +00:00
Davide Italiano
2ad5a3cea4 [llvm-size] Make error handling uniform.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260786 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 01:38:16 +00:00
Matt Arsenault
53ea122b3d AMDGPU: Add intrinsics for sin/cos
These provide direct access to the hardware instruction without
the unit conversion that the llvm.sin/llvm.cos lowering requires.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 01:19:56 +00:00
Matt Arsenault
a4c1dc826a AMDGPU: Rename intrinsic to better match instruction name
Also fixes missing f32 test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260780 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 01:03:00 +00:00
Pirama Arumuga Nainar
d313df118a Don't combine fp_round (fp_round x) if f80 to f16 is generated
Summary:
This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2.  This prevents selection of
native f16 conversion instructions from f32 or f64.  Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Reviewers: ab

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D17221

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 00:08:05 +00:00
Tom Stellard
98ef447825 AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
Reviewers: arsenm

Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 23:45:29 +00:00
Yunzhong Gao
6784b3ca2d Disable the vzeroupper insertion pass on PS4.
Differential Revision: http://reviews.llvm.org/D16837



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260764 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 23:37:57 +00:00
Krzysztof Parzyszek
b762ae1118 [Hexagon] Optimize stack slot spills
Replace spills to memory with spills to registers, if possible. This
applies mostly to predicate registers (both scalar and vector), since
they are very limited in number. A spill of a predicate register may
happen even if there is a general-purpose register available. In cases
like this the stack spill/reload may be eliminated completely.

This optimization will consider all stack objects, regardless of where
they came from and try to match the live range of the stack slot with
a dead range of a register from an appropriate register class.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260758 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 22:53:35 +00:00
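The matching step in the commit above (fit the live range of a stack slot inside a dead range of a suitable register) can be pictured as an interval-containment check. The following Python sketch is an illustration only, not the Hexagon pass: ranges are modelled as half-open `(start, end)` pairs of instruction indices, and the register names and the helper `find_register_for_slot` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]  # half-open interval [start, end) of instruction indices


def find_register_for_slot(slot_live: Range,
                           dead_ranges: Dict[str, List[Range]]) -> Optional[str]:
    """Return the first register that is dead for the whole time the
    stack slot is live, or None if no such register exists."""
    start, end = slot_live
    for reg, ranges in dead_ranges.items():
        for dead_start, dead_end in ranges:
            # The slot's live range must fit entirely inside a dead range.
            if dead_start <= start and end <= dead_end:
                return reg
    return None


# Hypothetical example: r5 is unused between instructions 8 and 40.
candidates = {"r4": [(0, 12)], "r5": [(8, 40)]}
chosen = find_register_for_slot((10, 30), candidates)
```

In this model, when a register is found the spill and reload through memory can be replaced by register copies; when none is found the stack slot stays as it is.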
David Majnemer
ffc8ad133a [llvm-pdbdump] Start to decode some streams
We can decode a little bit of the first stream now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260754 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 22:27:44 +00:00
Sanjay Patel
12e0a699e7 fix test to use FileCheck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260751 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 22:07:54 +00:00
Reid Kleckner
43c4ddff1c [codeview] Describe local variables in registers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260746 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:48:30 +00:00
Dan Gohman
f97f532728 [WebAssembly] Fix byval for empty types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260740 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:30:18 +00:00
Dan Gohman
73cd89a6f2 [WebAssembly] Fix insertion of a BLOCK in a loop header that also ends a BLOCK.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260737 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:19:25 +00:00
Andrew Kaylor
4caa75fcde [WinEH] Prevent EH state numbering from skipping nested cleanup pads that never return
Differential Revision: http://reviews.llvm.org/D17208



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260733 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:10:16 +00:00
Chad Rosier
676d257cf2 [LIR] Allow merging of memsets in negatively strided loops.
Last part of PR25166.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260732 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:03:23 +00:00
Justin Lebar
d7521eeb5d [SimplifyCFG] Don't fold conditional branches that contain calls to convergent functions.
Summary:
Performing this optimization duplicates the call to the convergent
function and adds new control-flow dependencies, which is a no-no.

Reviewers: jingyue

Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260730 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:01:36 +00:00
Justin Lebar
2109a5cbf9 [LoopRotate] Don't perform loop rotation if the loop header calls a convergent function.
Summary:
Calls to convergent functions can be duplicated, but only if the
duplicates are not control-flow dependent on any additional values.
Loop rotation doesn't meet the bar.

Reviewers: jingyue

Subscribers: mzolotukhin, llvm-commits, arsenm, joker.eph, resistor, tra, hfinkel, broune

Differential Revision: http://reviews.llvm.org/D17127

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260729 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 21:01:33 +00:00
Benjamin Kramer
06016cc257 Remove LLVMGetTargetMachineData leftovers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260720 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 20:26:46 +00:00
Philip Reames
2f2bedcc23 [LVI] Exploit nsw/nuw when computing constant ranges
As the title says. Modelled after similar code in SCEV.

This is useful when analysing induction variables in loops which have been canonicalized by other passes. I wrote the tests as non-loops specifically to avoid the generality introduced in http://reviews.llvm.org/D17174. While that can handle many induction variables without *needing* to exploit nsw, there's no reason not to use it if we've already proven it.

Differential Revision: http://reviews.llvm.org/D17177



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260705 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 19:05:16 +00:00
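To see why a no-wrap flag helps range computation, as in the commit above, consider an unsigned `add` of a constant. The Python sketch below is a simplified model and not LLVM's ConstantRange API: ranges are inclusive `(lo, hi)` pairs, the constant is assumed to fit in the bit width, and the function name `add_range` is hypothetical.

```python
from typing import Optional, Tuple


def add_range(lo: int, hi: int, c: int, bits: int,
              nuw: bool) -> Optional[Tuple[int, int]]:
    """Inclusive unsigned range of x + c for x in [lo, hi] at the given width.
    Returns None when nuw is set and every input would wrap."""
    max_val = (1 << bits) - 1
    if nuw:
        # With nuw the add may not wrap, so the result is at least lo + c
        # and at most the largest representable value.
        if lo + c > max_val:
            return None
        return (lo + c, min(hi + c, max_val))
    if hi + c <= max_val:
        return (lo + c, hi + c)          # no input wraps
    if lo + c > max_val:
        return ((lo + c) & max_val, (hi + c) & max_val)  # every input wraps
    # Some inputs wrap and some do not: fall back to the full range.
    return (0, max_val)
```

With an unknown 8-bit `x`, `add_range(0, 255, 10, 8, nuw=False)` can only report the full range, while the `nuw` variant narrows the result to values of at least 10.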
Krzysztof Parzyszek
c006b1e101 [Hexagon] Replace expansion of spill pseudo-instructions in frame lowering
Rewrite the code to handle all pseudo-instructions in a single pass.

This temporarily reverts spill slot optimization that used general-
purpose registers to hold values of spilled predicate registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260696 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 18:19:53 +00:00
David Majnemer
e049aa6ae3 [InstCombine] Don't aggressively replace xor with icmp
In some cases, InstCombine replaces a sequence of an xor/sub instruction
followed by a cmp instruction with a single cmp instruction.

However, this replacement may produce a suboptimal result, especially when
the xor/sub has more than one use, as discussed in
bug 26465 (https://llvm.org/bugs/show_bug.cgi?id=26465).

This patch makes the replacement happen only when the xor/sub has only one
use.

Differential Revision: http://reviews.llvm.org/D16915

Patch by Taewook Oh!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260695 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 18:12:38 +00:00
Tom Stellard
abf168408a [AMDGPU] Assembler: Swap operands of flat_store instructions to match AMD assembler
Historically, AMD's internal sp3 assembler has used the flat_store* addr, data
format. To match existing code and to enable reuse, change the LLVM
definitions to match.  Also update MC and CodeGen tests.

Differential Revision: http://reviews.llvm.org/D16927

Patch by: Nikolay Haustov

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260694 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 17:57:54 +00:00
Changpeng Fang
3a0161ac77 AMDGPU/SI: Annotate Loops with Constant Condition in SIAnnotateControlFlow pass.
Summary:
  It is possible that the loop condition can be a boolean constant (infinite loop,
for example), so we should handle a constant condition when annotating a loop. This
patch adds support for annotating loops with a constant condition.

Reviewers: tstellarAMD, arsenm

Subscribers: llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D15093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260692 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 17:11:04 +00:00
Krzysztof Parzyszek
b92f69441a [Hexagon] Eliminate pseudo instructions for circ/brev loads and stores
We can generate the actual instructions from the intrinsics without the
need for pseudo-instructions. Also, since the intrinsics have a
side-effect in the form of a store, attempt to optimize away loads from
the store location.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260690 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 17:01:51 +00:00
Geoff Berry
83d0e325e4 [AArch64] Reduce number of callee-save save/restores.
Summary:
Before this change, callee-save registers would be rounded up to even
pairs of GPRs and FPRs.  This change eliminates these extra padding
load/stores, though it does keep the stack allocation the same size
unless both the GPR and FPR sets have an odd size, in which case one
full pair stack slot (16 bytes) is saved.

This optimization cannot currently be done for MachO targets since they
rely on a fast-path .debug_frame equivalent that can only encode
callee-save registers as pairs.

Reviewers: t.p.northover, rengolin, mcrosier, jmolloy

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17000

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260689 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 16:31:41 +00:00
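The size arithmetic in the commit above can be checked with a few lines of Python. This is a sketch under assumptions that the message implies but does not spell out: each callee-saved GPR or FPR occupies an 8-byte slot, and the total save area stays 16-byte aligned. The function names are hypothetical.

```python
def round_up(n: int, multiple: int) -> int:
    """Round n up to the next multiple of `multiple`."""
    return (n + multiple - 1) // multiple * multiple


def csr_bytes_before(gprs: int, fprs: int) -> int:
    # Old behaviour: each register set is padded to an even count (full pairs).
    return 8 * (round_up(gprs, 2) + round_up(fprs, 2))


def csr_bytes_after(gprs: int, fprs: int) -> int:
    # New behaviour: no per-set padding; only the total is kept 16-byte aligned.
    return round_up(8 * (gprs + fprs), 16)


# Both sets odd: one full pair slot (16 bytes) is saved.
saved_both_odd = csr_bytes_before(3, 3) - csr_bytes_after(3, 3)
# Only one set odd: the allocation size is unchanged.
saved_one_odd = csr_bytes_before(3, 2) - csr_bytes_after(3, 2)
```

In this model the allocation shrinks only when both counts are odd, which matches the message; in the other cases the gain is the removed padding load/stores, not stack space.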
Chad Rosier
1f88b2d0b7 [AArch64] Add support for Qualcomm Kryo CPU.
Machine model description by Dave Estes <cestes@codeaurora.org>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260686 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 15:51:51 +00:00
Jun Bum Lim
844cafe1c8 [AArch64] Merge two adjacent str WZR into str XZR
Summary:
This change merges adjacent 32 bit zero stores into a 64 bit zero store.
e.g.,
  str wzr, [x0]
  str wzr, [x0, #4]
becomes
  str xzr, [x0]

Therefore, four adjacent 32 bit zero stores will be a single stp.
e.g.,
  str wzr, [x0]
  str wzr, [x0, #4]
  str wzr, [x0, #8]
  str wzr, [x0, #12]
becomes
  stp xzr, xzr, [x0]

Reviewers: mcrosier, jmolloy, gberry, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16933

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260682 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 15:25:39 +00:00
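The two examples in the commit above suggest a two-step merge: pairs of adjacent 32-bit zero stores become a 64-bit zero store, and pairs of adjacent 64-bit zero stores become an stp. The Python sketch below models only that offset arithmetic. It assumes all stores use the same base register, ignores alignment and encoding limits, and is not the actual AArch64 pass; `merge_zero_stores` is a hypothetical name.

```python
from typing import List, Tuple


def merge_zero_stores(offsets: List[int]) -> List[Tuple[str, int]]:
    """offsets: byte offsets of `str wzr` stores to one base register.
    Returns (mnemonic, offset) pairs after merging, sorted by offset."""
    remaining = sorted(set(offsets))
    wide: List[int] = []    # offsets that became 64-bit zero stores
    narrow: List[int] = []  # 32-bit stores left unmerged
    i = 0
    while i < len(remaining):
        # Two 32-bit stores 4 bytes apart merge into one 64-bit store.
        if i + 1 < len(remaining) and remaining[i + 1] == remaining[i] + 4:
            wide.append(remaining[i])
            i += 2
        else:
            narrow.append(remaining[i])
            i += 1
    out = [("str wzr", off) for off in narrow]
    i = 0
    while i < len(wide):
        # Two 64-bit stores 8 bytes apart merge into one store pair.
        if i + 1 < len(wide) and wide[i + 1] == wide[i] + 8:
            out.append(("stp xzr, xzr", wide[i]))
            i += 2
        else:
            out.append(("str xzr", wide[i]))
            i += 1
    return sorted(out, key=lambda pair: pair[1])
```

Feeding it the offsets from the message, `[0, 4]` yields a single `str xzr` at offset 0 and `[0, 4, 8, 12]` yields a single `stp xzr, xzr` at offset 0.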
Krzysztof Parzyszek
9f34dc17fe [Hexagon] Specify vector alignment in DataLayout string
The DataLayout can calculate alignment of vectors based on the alignment
of the element type and the number of elements. In fact, it is the product
of these two values. The problem is that for vectors of N x i1, this will
return the alignment of N bytes, since the alignment of i1 is 8 bits. The
vector types of vNi1 should be aligned to N bits instead. Provide explicit
alignment for HVX vectors to avoid such complications.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260678 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 14:47:38 +00:00
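The alignment problem described in the commit above comes down to simple arithmetic, sketched here in Python. The element count of 512 is only an illustrative choice, and the function names are hypothetical; the 8-bit alignment of i1 and the "element alignment times element count" rule are taken from the message.

```python
I1_ALIGN_BITS = 8  # per the commit message: i1 is aligned to 8 bits


def computed_vector_align_bytes(num_elts: int, elt_align_bits: int) -> int:
    # Rule described in the message: product of element alignment and count.
    return num_elts * elt_align_bits // 8


def wanted_i1_vector_align_bytes(num_elts: int) -> int:
    # A vector of N x i1 should be aligned to N bits.
    return num_elts // 8


computed = computed_vector_align_bytes(512, I1_ALIGN_BITS)  # 512 bytes
wanted = wanted_i1_vector_align_bytes(512)                  # 64 bytes
```

The computed value is eight times larger than the wanted one, which is why the commit gives the vector types an explicit alignment in the DataLayout string instead of relying on the computed default.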