RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-02-04 01:26:41 +00:00

Author	SHA1	Message	Date
Rafael Espindola	aeb12e91d1	Convert another CodeGen test into a MC test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204412 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 23:35:00 +00:00
Weiming Zhao	4eb2d228e9	Fix PR19136: [ARM] Fix Folding SP Update into vpush/vpop Since MBB->computeRegisterLiveness() returns Dead for sub regs like s0, d0 is used in vpop instead of updating sp, which causes s0 to be dead before its use. This patch checks the liveness of each subreg to make sure the reg is actually dead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204411 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 23:28:16 +00:00
Rafael Espindola	e316e00f16	Remove unused options from test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204401 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 21:38:04 +00:00
Juergen Ributzka	ee3242ed0b	Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass." I will break this up into smaller pieces for review and recommit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204393 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 20:17:13 +00:00
Juergen Ributzka	228c72a841	[Constant Hoisting] Extend coverage of the constant hoisting pass. This commit extends the coverage of the constant hoisting pass, adds additional debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204389 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 19:55:52 +00:00
Kai Nacke	ebf9f0c6cb	[MIPS] Add cpu octeon and some instructions The Octeon cpu from Cavium Networks is mips64r2 based and has an extended instruction set. In order to utilize this with LLVM, a new cpu feature "octeon" and a subtarget feature "cnmips" is added. A small set of new instructions (baddu, dmul, pop, dpop, seq, sne) is also added. LLVM generates dmul, pop and dpop instructions with option -mcpu=octeon or -mattr=+cnmips. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204337 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 11:51:58 +00:00
Hao Liu	19a3e9aabe	[ARM]Fix an assertion failure in A15SDOptimizer about DPair reg class by treating DPair as QPR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204304 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 05:36:59 +00:00
Matt Arsenault	e3620da269	R600/SI: Add support for 64-bit LDS writes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204274 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-19 22:19:54 +00:00
Matt Arsenault	62b3e22092	R600/SI: Add support for 64-bit LDS loads. v2: -Use correct opcode for DS_READ_64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204273 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-19 22:19:52 +00:00
Matt Arsenault	6eaa49233f	R600/SI: Match i16 immediate offset of LDS instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204272 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-19 22:19:49 +00:00
Matt Arsenault	adf5141ecd	R600/SI: Fix test checking wrong instruction operand. The source and destination happen to be the same register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204271 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-19 22:19:45 +00:00
Matt Arsenault	9c0b2d08d3	R600/SI: Don't display the GDS bit. It isn't actually used now, and probably never will be, plus it makes tests less annoying. I also think SC prints GDS instructions as a separate instruction name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204270 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-19 22:19:43 +00:00
Eli Bendersky	21354ec60d	Expose "noduplicate" attribute as a property for intrinsics. The "noduplicate" function attribute exists to prevent certain optimizations from duplicating calls to the function. This is important on platforms where certain function call duplications are unsafe (for example execution barriers for CUDA and OpenCL). This patch makes it possible to specify intrinsics as "noduplicate" and translates that to the appropriate function attribute. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204200 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 23:51:07 +00:00
Hans Wennborg	523f800e90	X86 memcpy lowering: use "rep movs" even when esi is used as base pointer For functions where esi is used as base pointer, we would previously fall back from lowering memcpy with "rep movs" because that clobbers esi. With this patch, we just store esi in another physical register, and restore it afterwards. This adds a little bit of register pressure, but the more efficient memcpy should be worth it. Differential Revision: http://llvm-reviews.chandlerc.com/D2968 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204174 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 20:04:34 +00:00
Michael Zolotukhin	50e4d56b9f	Fix test lsr-normalization.ll broken in r204161. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204166 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 18:17:59 +00:00
Raul E. Silvera	370981ad17	Add support for scalarizing/splitting vector bswap. Summary: SLP Vectorization of intrinsics (r203707) has exposed cases where the expansion of vector bswap is failing (PR19151). Reviewers: hfinkel CC: chandlerc Differential Revision: http://llvm-reviews.chandlerc.com/D3104 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204163 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 17:49:12 +00:00
Michael Zolotukhin	13ca05e2b8	Add stride normalization to SCEV Normalize/Denormalize transformation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204161 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 17:34:03 +00:00
Andrea Di Biagio	6077ca9abb	[DAGCombiner] teach how to simplify xor/and/or nodes according to the following rules: 1) (AND (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (AND (A, B), C, Mask) 2) (OR (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (OR (A, B), C, Mask) 3) (XOR (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (XOR (A, B), V_0, Mask) 4) (AND (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (C, AND (A, B), Mask) 5) (OR (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (C, OR (A, B), Mask) 6) (XOR (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (V_0, XOR (A, B), Mask) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204160 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 17:12:59 +00:00
Bill Schmidt	d4585b941a	Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0. When converting a signed 32-bit integer to double-precision floating point on hardware without a lfiwax instruction, we have to instead use a lfd followed by fcfid. We were erroneously offsetting the address by 4 bytes in preparation for either a lfiwax or lfiwzx when generating the lfd. This fixes that silly error. This was not caught in the test suite since the conversion tests were run with -mcpu=pwr7, which implies availability of lfiwax. I've added another test case for older hardware that checks the code we expect in the absence of lfiwax and other flavors of fcfid. There are fewer tests in this test case because we punt to DAG selection in more cases on older hardware. (We must generate complex fiddly sequences in those cases, and there is marginal benefit in duplicating that logic in fast-isel.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204155 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 14:32:50 +00:00
NAKAMURA Takumi	3bdef4b6dc	CodeGen/R600/v_cndmask.ll: Relax an expression to unbreak msvcrt. V_CNDMASK_B32_e64 v0, v0, -1.#QNAN0e+00, s[2:3], 0, 0, 0, 0 FIXME: We really need to implement our formatter... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204118 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 06:17:22 +00:00
Kevin Enderby	c153e49aa9	Making a guess to fix the test case with r204056 to get the build bot working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204073 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 19:00:03 +00:00
Matt Arsenault	2683baa8ac	R600: Match sign_extend_inreg to BFE instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204072 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 18:58:11 +00:00
Matt Arsenault	94bdb453a4	Make DAGCombiner work on vector bitshifts with constant splat vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204071 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 18:58:01 +00:00
Adam Nemet	8c8fe42a0d	[VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16 Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get promoted to fp_to_sint v8f32->v8i32. This is a legal operation on AVX. For that to work properly, we also need to teach the legalizer about the specific promotion required here. The default vector promotion uses bitcasting to a vector type of the same total size. We want to promote the vector element type, effectively widening the operation and then truncating the result. This is analogous to the current logic of how int_to_fp is promoted. The change also factors out some code from the int_to_fp promotion code to ValueType::widenIntegerVectorElementType. This is now shared between int_to_fp and fp_to_int. There is no longer need for the custom lowering of fp_to_sint f32->v8i16 in X86. It can now go through the new target-independent fp_to_*int promotion logic. I also checked that no other target uses Promote for these ops yet, so there shouldn't be any unexpected change in behavior. Fixes <rdar://problem/16202247> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 17:06:14 +00:00
Tom Stellard	ad52f4f70c	R600/SI: Fix implementation of isInlineConstant() used by the verifier The type of the immediates should not matter as long as the encoding is equivalent to the encoding of one of the legal inline constants. Tested-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204056 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 17:03:52 +00:00
Tom Stellard	eb7876083d	R600/SI: Use correct dest register class for V_READFIRSTLANE_B32 This instruction writes to a 32-bit SGPR. This change required adding the 32-bit VCC_LO and VCC_HI registers, because the full VCC register is 64 bits. This fixes verifier errors on several of the indirect addressing piglit tests. Tested-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204055 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 17:03:51 +00:00
Lang Hames	3dd951e842	[X86] New and improved VZeroUpperInserter optimization. - Adds support for inserting vzerouppers before tail-calls. This is enabled implicitly by having MachineInstr::copyImplicitOps preserve regmask operands, which allows VZeroUpperInserter to see where tail-calls use vector registers. - Fixes a bug that caused the previous version of this optimization to miss some vzeroupper insertion points in loops. (Loops-with-vector-code that followed loops-without-vector-code were mistakenly overlooked by the previous version). - New algorithm never revisits instructions. Fixes <rdar://problem/16228798> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204021 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 01:22:54 +00:00
Adrian Prantl	2110a0d07b	Re-add checks that were in this testcase before it was converted to dwarfdump. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203981 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-14 23:08:21 +00:00
Ulrich Weigand	0951eecae4	[ppc64] Avoid copy relocs in named rodata sections Commit r181723 introduced code to avoid placing initialized variables needing relocations into the .rodata section, which avoids copy relocs that do not work as expected on ppc64 function references. The same treatment is also needed for named .rodata.XXX sections. This patch changes PPC64LinuxTargetObjectFile::SelectSectionForGlobal to modify "Kind" before calling the default SelectSectionForGlobal routine, instead of first calling the default routine and then just checking for the (main) .rodata section afterwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203921 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-14 12:45:22 +00:00
Rafael Espindola	1f21e0dd0d	Remove the linker_private and linker_private_weak linkages. These linkages were introduced some time ago, but it was never very clear what exactly their semantics were or what they should be used for. Some investigation found these uses: * utf-16 strings in clang. * non-unnamed_addr strings produced by the sanitizers. It turns out they were just working around a more fundamental problem. For some sections a MachO linker needs a symbol in order to split the section into atoms, and llvm had no idea that was the case. I fixed that in r201700 and it is now safe to use the private linkage. When the object ends up in a section that requires symbols, llvm will use a 'l' prefix instead of a 'L' prefix and things just work. With that, these linkages were already dead, but there was a potential future user in the objc metadata information. I am still looking at CGObjcMac.cpp, but at this point I am convinced that linker_private and linker_private_weak are not what they need. The objc uses are currently split in * Regular symbols (no '\01' prefix). LLVM already directly provides whatever semantics they need. * Uses of a private name (start with "\01L" or "\01l") and private linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm agrees with clang on L being ok or not for a given section. I have two patches in code review for this. * Uses of private name and weak linkage. The last case is the one that one could think would fit one of these linkages. That is not the case. The semantics are * the linker will merge these symbols by name. * the linker will hide them in the final DSO. Given that the merging is done by name, any of the private (or internal) linkages would be a bad match. They allow llvm to rename the symbols, and that is really not what we want. From the llvm point of view, these objects should really be (linkonce|weak)(_odr)?. For now, just keeping the "\01l" prefix is probably the best for these symbols.
If we one day want to have a more direct support in llvm, IMHO what we should add is not a linkage, it is just a hidden_symbol attribute. It would be applicable to multiple linkages. For example, on weak it would produce the current behavior we have for objc metadata. On internal, it would be equivalent to private (and we should then remove private). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203866 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 23:18:37 +00:00
Kevin Enderby	c5888b8d1b	Add -mtriple=x86_64-linux to this test case to fix the build bots. The original commit was r203829. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203844 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 20:31:19 +00:00
Ekaterina Romanova	ed2ca70ccf	Fix for http://llvm.org/bugs/show_bug.cgi?id=18590 This patch fixes the bug in peephole optimization that folds a load which defines one vreg into the one and only use of that vreg. With debug info, a DBG_VALUE that referenced the vreg was considered to be a use, preventing the optimization. The fix is to ignore DBG_VALUE's during the optimization, and undef a DBG_VALUE that references a vreg that gets removed. Patch by Trevor Smigiel! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203829 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 18:47:12 +00:00
Tom Stellard	47feea0802	R600: LDS instructions shouldn't implicitly define OQAP LDS instructions are pseudo instructions which model the OQAP defs and uses within a single instruction. This fixes a hang in the opencv MedianFilter tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203818 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 17:13:04 +00:00
Mark Seaborn	d2a816fe10	Cleanup: Remove use of old "-enable-correct-eh-support" option from a test This option enables LowerInvoke's obsolete SJLJ EH support, but the target used in this test (ARM Darwin) no longer uses the LowerInvoke pass, so the option has no effect here. This target currently uses the newer SjLjEHPrepare pass instead. This cleanup will help with removing "-enable-correct-eh-support". Differential Revision: http://llvm-reviews.chandlerc.com/D3064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203810 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 16:23:00 +00:00
Hans Wennborg	c8ed0db5aa	[ARM] Use symbolic register names in .cfi directives only with IAS (PR19110) This is a follow-up to r203635. Saleem pointed out that since symbolic register names are much easier to read, it would be good if we could turn them off only when we really need to because we're using an external assembler. Differential Revision: http://llvm-reviews.chandlerc.com/D3056 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203806 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 15:56:41 +00:00
Manuel Jacob	f8909fa140	CodeGenPrep: sink extends of illegal types into use block. Summary: This helps the instruction selector to lower an i64 * i64 -> i128 multiplication into a single instruction on targets which support it. This is an update of D2973 which was reverted because of a bug reported as PR19084. Reviewers: t.p.northover, chapuni Reviewed By: t.p.northover CC: llvm-commits, alex, chapuni Differential Revision: http://llvm-reviews.chandlerc.com/D3021 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203797 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 13:36:25 +00:00
Elena Demikhovsky	3d1ae71813	AVX-512: masked load/store + intrinsics for them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203790 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 12:05:52 +00:00
Hal Finkel	ab849adec4	[PowerPC] Initial support for the VSX instruction set VSX is an ISA extension supported on the POWER7 and later cores that enhances floating-point vector and scalar capabilities. Among other things, this adds <2 x double> support and generally helps to reduce register pressure. The interesting part of this ISA feature is the register configuration: there are 64 new 128-bit vector registers, the first 32 of which are super-registers of the existing 32 scalar floating-point registers, and the second 32 of which overlap with the 32 Altivec vector registers. This makes things like vector insertion and extraction tricky: this can be free but only if we force a restriction to the right register subclass when needed. A new "minipass" PPCVSXCopy takes care of this (although it could do a more-optimal job of it; see the comment about unnecessary copies below). Please note that, currently, VSX is not enabled by default when targeting anything because it is not yet ready for that. The assembler and disassembler are fully implemented and tested. However: - CodeGen support causes miscompiles; test-suite runtime failures: MultiSource/Benchmarks/FreeBench/distray/distray MultiSource/Benchmarks/McCat/08-main/main MultiSource/Benchmarks/Olden/voronoi/voronoi MultiSource/Benchmarks/mafft/pairlocalalign MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 SingleSource/Benchmarks/CoyoteBench/almabench SingleSource/Benchmarks/Misc/matmul_f64_4x4 - The lowering currently falls back to using Altivec instructions far more than it should. Worse, there are some things that are scalarized through the stack that shouldn't be. - A lot of unnecessary copies make it past the optimizers, and this needs to be fixed. - Many more regression tests are needed.
Normally, I'd fix these things prior to committing, but there are some students and other contributors who would like to work on this, and so it makes sense to move this development process upstream where it can be subject to the regular code-review procedures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203768 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 07:58:58 +00:00
Adam Nemet	a65ca9dcf0	[X86] Add peephole for masked rotate amount Extend what's currently done for shift because the HW performs this masking implicitly: (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y) I use the newly factored out multiclass that was only supporting shifts so far. For testing I extended my testcase for the new rotation idiom. <rdar://problem/15295856> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203718 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 21:20:55 +00:00
Rafael Espindola	38048cdb1c	Reject alias to undefined symbols in the verifier. On ELF and COFF an alias is just another name for a position in the file. There is no way to refer to a position in another file, so an alias to undefined is meaningless. MachO currently doesn't support aliases. The spec has a N_INDR, which when implemented will have a different set of restrictions. Adding support for it shouldn't be harder than any other IR extension. For now, having the IR represent what is actually possible with current tools makes it easier to fix the design of GlobalAlias. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:15:49 +00:00
Matt Arsenault	054f4eccd2	R600: Fix trunc store from i64 to i1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203695 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:45:52 +00:00
Daniel Sanders	fe6bd52bf2	[mips] BSEL's and BINS[RL] operands are reversed compared to the vselect node used in the pattern. Summary: Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes. The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite. During review, we also found that some of the existing CodeGen tests were incorrect and fixed them: * bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'. * vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order. * compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match. The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case. Reviewers: matheusalmeida, jacksprat Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3028 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203657 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 11:54:00 +00:00
Tim Northover	d4517fa24d	ARM: correct Dwarf output for non-contiguous VFP saves. When the list of VFP registers to be saved was non-contiguous (so multiple vpush/vpop instructions were needed) these were being ordered oddly, as in: vpush {d8, d9} vpush {d11} This led to the layout in memory being [d11, d8, d9] which is ugly and doesn't match the CFI_INSTRUCTIONs we're generating either (so Dwarf info would be broken). This switches the order of vpush/vpop (in both prologue and epilogue, obviously) so that the Dwarf locations are correct again. rdar://problem/16264856 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203655 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 11:29:23 +00:00
Hans Wennborg	e03daa01f6	[ARM] Use DWARF register numbers for CFI directives in ELF assembly It seems gas can't handle CFI directives with VFP register names ("d12", etc.). This broke us trying to build Chromium for Android after 201423. A gas bug has been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=16694 compnerd suggested making this conditional on whether we're using the integrated assembler or not. I'll look into that in a follow-up patch. Differential Revision: http://llvm-reviews.chandlerc.com/D3049 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203635 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 03:52:34 +00:00
Hans Wennborg	1332459dbb	X86: Don't generate 64-bit movd after cmpneqsd in 32-bit mode (PR19059) This fixes the bug where we would bitcast the 64-bit floating point result of cmpneqsd to a 64-bit integer even on 32-bit targets. Differential Revision: http://llvm-reviews.chandlerc.com/D3009 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203581 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:49:24 +00:00
Saleem Abdulrasool	90d0ed297f	ARM: honour -f{no-,}optimize-sibling-calls Use the options in the ARMISelLowering to control whether tail calls are optimised or not. Previously, this option was entirely ignored on the ARM target and only honoured on x86. This option is mostly useful in profiling scenarios. The default remains that tail call optimisations will be applied. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203577 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:09:54 +00:00
Saleem Abdulrasool	2b42ff6fdb	ARM: remove ancient -arm-tail-calls option This option is from 2010, designed to work around a linker issue on Darwin for ARM. According to grosbach this is no longer an issue and this option can safely be removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203576 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:09:49 +00:00
Saleem Abdulrasool	cde1f2eae2	ARM: enable tail call optimisation on Thumb 2 Tail call optimisation was previously disabled on all targets other than iOS5.0+. This enables the tail call optimisation on all Thumb 2 capable platforms. The test adjustments are to remove the IR hint "tail" to function invocation. The tests were designed assuming that tail call optimisations would not kick in which no longer holds true. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:09:44 +00:00
Tim Northover	ca396e391e	IR: add a second ordering operand to cmpxchg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 10:48:52 +00:00
Jim Grosbach	7a37166a7a	X86: Enable ISel of 16-bit MOVBE instructions. When the MOVBE instructions are available, use them for 16-bit endian swapping as well as for 32 and 64 bit. The patterns were already present on the instructions, but weren't being matched because the operation was unconditionally marked to 'Expand.' Change that to be conditional on whether the MOVBE instructions are available. Use 'rolw' to implement the in-register version (32 and 64 bit have the dedicated 'bswap' instruction for that). Patch by Louis Gerbarg <lgg@apple.com>. rdar://15479984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203524 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 00:44:14 +00:00
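
The DAGCombiner commit 6077ca9abb above lists six rewrite rules that pull a bitwise operation through a pair of shuffles sharing an operand and a mask. The following is a minimal Python sketch of why rules 1 and 3 hold. It is not LLVM code: the helper names `shuffle` and `lanewise`, the lane values, and the mask are assumptions made up for this illustration. The shuffle model assumes that mask indices below the vector length select from the first operand and the remaining indices select from the second, as with LLVM's shufflevector.

```python
import operator


def shuffle(a, b, mask):
    """Hypothetical model of a two-input shuffle: index into a followed by b."""
    joined = list(a) + list(b)
    return [joined[i] for i in mask]


def lanewise(op, x, y):
    """Apply a binary operator lane by lane to two equal-length vectors."""
    return [op(p, q) for p, q in zip(x, y)]


# Arbitrary example vectors (assumed values, four lanes each).
A = [0b1100, 0b1010, 0b1111, 0b0001]
B = [0b0110, 0b0011, 0b1001, 0b0101]
C = [0b1000, 0b0100, 0b0010, 0b0111]
ZERO = [0, 0, 0, 0]   # plays the role of V_0 in the commit message
MASK = [0, 5, 2, 7]   # lanes 0 and 2 come from the first operand, 1 and 3 from C

# Rule 1: AND(shuf(A, C, Mask), shuf(B, C, Mask)) -> shuf(AND(A, B), C, Mask).
# Lanes taken from C compute C & C, which is C again.
and_before = lanewise(operator.and_, shuffle(A, C, MASK), shuffle(B, C, MASK))
and_after = shuffle(lanewise(operator.and_, A, B), C, MASK)

# Rule 3: XOR(shuf(A, C, Mask), shuf(B, C, Mask)) -> shuf(XOR(A, B), V_0, Mask).
# Lanes taken from C compute C ^ C, which is zero, hence the zero vector.
xor_before = lanewise(operator.xor, shuffle(A, C, MASK), shuffle(B, C, MASK))
xor_after = shuffle(lanewise(operator.xor, A, B), ZERO, MASK)

print(and_before == and_after, xor_before == xor_after)
```

The OR case follows the same reasoning as AND, since C | C is also C. In each rewrite two shuffles are replaced by one, which is presumably the benefit the combine is after.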

10123 Commits