RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2024-12-23 04:28:30 +00:00

Author	SHA1	Message	Date
Peter Zotov	686e157176	[OCaml] PR19859: Add functions to query and modify branches. Patch by Gabriel Radanne <drupyog@zoho.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220818 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 19:47:02 +00:00
Peter Zotov	4c4f5ece75	[OCaml] PR19859: Add tests for reading the values of numeric constants. Patch by Gabriel Radanne <drupyog@zoho.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220816 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 19:46:52 +00:00
Saleem Abdulrasool	586cf3d88d	Transforms: reapply SVN r219899 This restores the commit from SVN r219899 with an additional change to ensure that the CodeGen is correct for the case that was identified as being incorrect (originally PR7272). In the case that during inlining we need to synthesize a value on the stack (i.e. for passing a value byval), then any function involving that alloca must be stripped of its tailness as the restriction that it does not access the parent's stack no longer holds. Unfortunately, a single alloca can cause a rippling effect through out the inlining as the value may be aliased or may be mutated through an escaped external call. As such, we simply track if an alloca has been introduced in the frame during inlining, and strip any tail calls. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 18:27:37 +00:00
Robert Khasanov	9371efbcdb	[AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset. Refactored through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220806 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 18:15:20 +00:00
Robert Khasanov	59cb03d329	[AVX-512] Expanded rsqrt/rcp instructions to VL subset. Refactored multiclass through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220783 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 16:37:13 +00:00
Robert Khasanov	edf556ec1f	[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros. This transformation worked if selector is produced by SETCC, however SETCC is needed only if we consider to swap operands. So I replaced SETCC check for this case. Added tests for vselect of <X x i1> values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220777 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 15:59:40 +00:00
Robert Khasanov	4a52493457	[AVX512] Bring back vector-shuffle lowering support through broadcasts Ffter commit at rev219046 512-bit broadcasts lowering become non-optimal. Most of tests on broadcasting and embedded broadcasting were changed and they doesn’t produce efficient code. Example below is from commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll): define <16 x i32> @_inreg16xi32(i32 %a) { ; CHECK-LABEL: _inreg16xi32: ; CHECK: ## BB#0: -; CHECK-NEXT: vpbroadcastd %edi, %zmm0 +; CHECK-NEXT: vmovd %edi, %xmm0 +; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0 +; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0 ; CHECK-NEXT: retq %b = insertelement <16 x i32> undef, i32 %a, i32 0 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer ret <16 x i32> %c } Here, 256-bit broadcast was generated instead of 512-bit one. In this patch 1) I added vector-shuffle lowering through broadcasts 2) Removed asserts and branches likes because this is incorrect - assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI"); 3) Fixed lowering tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220774 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 12:28:51 +00:00
Reid Kleckner	d5de327da0	X86: Implement the vectorcall calling convention This is a Microsoft calling convention that supports both x86 and x86_64 subtargets. It passes vector and floating point arguments in XMM0-XMM5, and passes them indirectly once they are consumed. Homogenous vector aggregates of up to four elements can be passed in sequential vector registers, but this part is not implemented in LLVM and will be handled in Clang. On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as integer register parameters and is callee cleanup. On x86_64, it delegates to the normal win64 calling convention. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D5943 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220745 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 01:29:26 +00:00
Tim Northover	dd778c6c9f	AArch64: enable Cortex-A57 FP balancing on Cortex-A53. Benchmarks have shown that it's harmless to the performance there, and having a unified set of passes between the two cores where possible helps big.LITTLE deployment. Patch by Z. Zheng. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220744 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 01:24:32 +00:00
Adam Nemet	6bc8d95153	[AVX512] Add vpermil variable version This is implemented via a multiclass that derives from the vperm imm multiclass. Fixes <rdar://problem/18426089> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220737 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 23:08:40 +00:00
Pete Cooper	68aeef61f4	Fix a stackmap bug introduced in r220710. For a call to not return in to the stackmap shadow, the shadow must end with the call. To do this, we must insert any required nops before the call, and not after it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220728 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 22:38:45 +00:00
Juergen Ributzka	52a6f59d41	[FastISel][AArch64] Emit immediate version of icmp (subs) for null pointer check. This is a minor change to use the immediate version when the operand is a null value. This should get rid of an unnecessary 'mov' instruction in debug builds and align the code more with the one generated by SelectionDAG. This fixes rdar://problem/18785125. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220713 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 19:58:36 +00:00
Juergen Ributzka	e2995ff88f	[FastISel][AArch64] Optimize compare-and-branch for i1 to use 'tbz'. Minor enhancement to use 'tbz' for i1 compare-and-branch to get rid of an 'and' instruction. This fixes rdar://problem/18784953. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220712 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 19:46:23 +00:00
Pete Cooper	7476f9c513	Stackmap shadows should consider call returns a branch target. To avoid emitting too many nops, a stackmap shadow can include emitted instructions in the shadow, but these must not include branch targets. A return from a call should count as a branch target as patching over the instructions after the call would lead to incorrect behaviour for threads currently making that call, when they return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220710 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 19:40:35 +00:00
Juergen Ributzka	b11c5b1078	[FastISel][AArch64] Use 'cbz' also for null values (pointers). The pattern matching for a 'ConstantInt' value was too restrictive. Checking for a 'Constant' with a bull value is sufficient for using an 'cbz/cbnz' instruction. This fixes rdar://problem/18784732. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220709 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 19:38:05 +00:00
Juergen Ributzka	5745cad861	[FastISel][AArch64] Don't fold the 'and' instruction into the 'tbz/tbnz' instruction if it is in a different basic block. This fixes a bug where the input register was not defined for the 'tbz/tbnz' instruction. This happened, because we folded the 'and' instruction from a different basic block. This fixes rdar://problem/18784013. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220704 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 19:16:48 +00:00
Juergen Ributzka	d3a04223e8	[FastISel][AArch64] Fix load/store with frame indices. At higher optimization levels the LLVM IR may contain more complex patterns for loads/stores from/to frame indices. The 'computeAddress' function wasn't able to handle this and triggered an assertion. This fix extends the possible addressing modes for frame indices. This fixes rdar://problem/18783298. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220700 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 18:21:58 +00:00
Kostya Serebryany	866ee52df3	[asan] experimental tracing for indirect calls, llvm part. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220699 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 18:13:56 +00:00
Oliver Stannard	b1d8e7e77c	[ARM] Select VMAXNM and VMINNM regardless of operand order Currently, the ARM backend will select the VMAXNM and VMINNM for these C expressions: (a < b) ? a : b (a > b) ? a : b but not these expressions: (a > b) ? b : a (a < b) ? b : a This patch allows all of these expressions to be matched. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220671 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 09:23:02 +00:00
David Majnemer	fe58be3733	InstCombine: Fix a combine assuming that icmp operands were integers An icmp may have pointer arguments, it isn't limited to integers or vectors of integers. This fixes PR21388. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220664 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 05:47:49 +00:00
Elena Demikhovsky	9e19cf1ffd	AVX-512: Fixed encoding of VPBROADCASTM and added SKX forms of this instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220638 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-26 09:52:24 +00:00
Peter Zotov	2e643bbeae	[OCaml] hexagon can't run MCJIT tests, XFAIL it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 19:01:14 +00:00
Peter Zotov	0ce2ef8c2d	[OCaml] Unbreak Llvm_executionengine.initialize_native_target. First, return true on success, as it is the OCaml convention. Second, also initialize the native assembly printer, which is, despite the name, required for MCJIT operation. Since this function did not initialize the assembly printer earlier and no function to initialize native assembly printer was available elsewhere, it is safe to break its interface: it means that it simply could not be used successfully before. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220620 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 18:50:02 +00:00
Peter Zotov	60d3f5918d	[OCaml] Expose Llvm_executionengine.ExecutionEngine.create_mcjit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220619 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 18:49:56 +00:00
Jingyue Wu	71fe4f0197	[SeparateConstOffsetFromGEP] Fixed a bug related to unsigned modulo The dividend in "signed % unsigned" is treated as unsigned instead of signed, causing unexpected behavior such as -64 % (uint64_t)24 == 0. Added a regression test in split-gep.ll Patched by Hao Liu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220618 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 18:34:03 +00:00
Jingyue Wu	5c50ab84b3	[SeparateConstOffsetFromGEP] Fixed a bug in rebuilding OR expressions The two operands of the new OR expression should be NextInChain and TheOther instead of the two original operands. Added a regression test in split-gep.ll. Hao Liu reported this bug, and provded the test case and an initial patch. Thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220615 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 17:36:21 +00:00
Jingyue Wu	1d1d705a95	[NVPTX] aligned byte-buffers for vector return types Summary: Fixes PR21100 which is caused by inconsistency between the declared return type and the expected return type at the call site. The new behavior is consistent with nvcc and the NVPTXTargetLowering::getPrototype function. Test Plan: test/Codegen/NVPTX/vector-return.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5612 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220607 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 03:46:16 +00:00
Rafael Espindola	b31d53a60f	Add a test for the -suppress-warnings option. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220603 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 01:14:15 +00:00
Evgeniy Stepanov	e376441786	[msan] Make -msan-check-constant-shadow a bit stronger. Allow (under the experimental flag) non-Instructions to participate in MSan checks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220601 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 23:34:15 +00:00
Kevin Enderby	44ccedc273	Fix a Mach-O assembler segfault for a subtraction expression with an undefined symbol. In a Mach-O object file a relocatable expression of the form SymbolA - SymbolB + constant is allowed when both symbols are defined in a section. But when either symbol is undefined it is an error. The code was crashing when it had an undefined symbol in this case. And should have printed a error message using the location information in the relocation entry. rdar://18678402 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220599 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 22:39:40 +00:00
Simon Pilgrim	44efa200e2	[X86][SSE] Bitcast assertion in XFormVExtractWithShuffleIntoLoad Minor patch to fix an issue in XFormVExtractWithShuffleIntoLoad where a load is unary shuffled, then bitcast (to a type with the same number of elements) before extracting an element. An undef was created for the second shuffle operand using the original (post-bitcasted) vector type instead of the pre-bitcasted type like the rest of the shuffle node - this was then causing an assertion on the different types later on inside SelectionDAG::getVectorShuffle. Differential Revision: http://reviews.llvm.org/D5917 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220592 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 21:04:41 +00:00
Colin LeMahieu	8699f5390b	[Hexagon] Resubmission of 220427 Modified library structure to deal with circular dependency between HexagonInstPrinter and HexagonMCInst. Adding encoding bits for add opcode. Adding llvm-mc tests. Removing unit tests. http://reviews.llvm.org/D5624 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220584 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 19:00:32 +00:00
Sanjay Patel	51fa1bcef3	Allow AVX vrsqrtps generation. This is a follow-on to r220570 that allows a 256-bit (v8f32) version of vrsqrtps to be generated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220579 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 17:59:18 +00:00
Sanjay Patel	a46f06efe2	Use rsqrt (X86) to speed up reciprocal square root calcs This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate().. See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220570 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 17:02:16 +00:00
Daniel Sanders	de90aa00ac	[mips] For N32/N64, structs must be passed in the upper bits of a register. Summary: Most structs were fixed by r218451 but those of between >32-bits and <64-bits remained broken since they were not marked with [ASZ]ExtUpper. This patch fixes the remaining cases by using CCPromoteToUpperBitsInType<i64> on i64's in addition to i32 and smaller. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5963 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220556 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 13:09:19 +00:00
Oliver Stannard	9bb3f37aa4	[AArch64] Fix fast-isel of cbz of i1, i8, i16 This fixes a miscompilation in the AArch64 fast-isel which was triggered when a branch is based on an icmp with condition eq or ne, and type i1, i8 or i16. The cbz instruction compares the whole 32-bit register, so values with the bottom 1, 8 or 16 bits clear would cause the wrong branch to be taken. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220553 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 09:54:41 +00:00
Timur Iskhodzhanov	806ccfff11	Update test/MC/ARM/coff-debugging-secrel.ll expectations to fix breakage caused by r220544 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220548 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 06:24:07 +00:00
Timur Iskhodzhanov	92e132a36f	Fix PR21189 -- Emit symbol subsection required to debug LLVM-built binaries with VS2012+ Reviewed at http://reviews.llvm.org/D5772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220544 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 01:27:45 +00:00
Ahmed Bougacha	636864d8e7	Make test for r220533 more robust by using GPR pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220541 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 00:03:46 +00:00
Adam Nemet	7f7bf0da6a	[AVX512] FMA support for the 231 variants This is asm/diasm-only support, similar to AVX. For ISeling the register variant, they are no different from 213 other than whether the multiplication or the addition operand is destructed. For ISeling the memory variant, i.e. to fold a load, they are no different than the 132 variant. The addition operand (op3) in both cases can come from memory. Again the ony difference is which operand is destructed. There could be a post-RA pass that would convert a 213 or 132 into a 231. Part of <rdar://problem/17082571> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220540 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 00:03:00 +00:00
Ahmed Bougacha	17c1e34c12	[SelectionDAG] Teach the vector scalarizer about FP conversions. This adds support for legalization of instructions of the form: [fp_conv] <1 x i1> %op to <1 x double> where fp_conv is one of fpto[us]i, [us]itofp. This used to assert because they were simply missing from the vector operand scalarizer. A similar problem arose in r190830, with trunc instead. Fixes PR20778. Differential Revision: http://reviews.llvm.org/D5810 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220533 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 22:49:25 +00:00
Tim Northover	c0ecebb32b	ScheduleDAG: record PhysReg dependencies represented by CopyFromReg nodes x86's CMPXCHG -> EFLAGS consumer wasn't being recorded as a real EFLAGS dependency because it was represented by a pair of CopyFromReg(EFLAGS) -> CopyToReg(EFLAGS) nodes. ScheduleDAG was expecting the source to be an implicit-def on the instruction, where the result numbers in the DAG and the Uses list in TableGen matched up precisely. The Copy notation seems much more robust, so this patch extends ScheduleDAG rather than refactoring x86. Should fix PR20376. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220529 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 22:31:48 +00:00
David Blaikie	991327143f	DebugInfo: Remove DwarfDebug::CurrentFnArguments since we have to handle argument ordering of other arguments (abstract arguments) in the same way and already have code for that too. While refactoring this code I was confused by both the name I had introduced (addNonArgumentVariable... but it has all this logic to handle argument numbering and keep things in order?) and by the redundancy. Seems when I fixed the misordered inlined argument handling, I didn't realize it was mostly redundant with the argument ordering code (which I may've also written, I'm not sure). So let's just rely on the more general case. The only oddity in output this produces is that it means when we emit all the variables for the current function, we don't track when we've finished the argument variables and are about to start the local variables and insert DW_AT_unspecified_parameters (for varargs functions) there. Instead it ends up after the local variables, scopes, etc. But this isn't invalid and doesn't cause DWARF consumers problems that I know of... so we'll just go with that because it makes the code nice & simple. (though, let's see what the buildbots have to say about this - crosses fingers) There will be some cleanup commits to follow to remove the now trivial wrappers, etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220527 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 22:27:50 +00:00
Timur Iskhodzhanov	13535f412a	PR21189: Teach llvm-readobj to dump bits of COFF symbol subsections required to debug using VS2012+ Reviewed at http://reviews.llvm.org/D5755 Thanks to Andrey Guskov for his help investigating this! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220526 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 22:25:31 +00:00
Ahmed Bougacha	4d962b05ca	[X86] Improve mul w/ overflow codegen, to MUL8+SETO. Currently, @llvm.smul.with.overflow.i8 expands to 9 instructions, where 3 are really needed. This adds X86ISD::UMUL8/SMUL8 SD nodes, and custom lowers them to MUL8/IMUL8 + SETO. i8 is a special case because there is no two/three operand variants of (I)MUL8, so the first operand and return value need to go in AL/AX. Also, we can't write patterns for these instructions: TableGen refuses patterns where output operands don't match SDNode results. In this case, instructions where the output operand is an implicitly defined register. A related special case (and FIXME) exists for MUL8 (X86InstrArith.td): // FIXME: Used for 8-bit mul, ignore result upper 8 bits. // This probably ought to be moved to a def : Pat<> if the // syntax can be accepted. [(set AL, (mul AL, GR8:$src)), (implicit EFLAGS)] Ideally, these go away with UMUL8, but we still need to improve TableGen support of implicit operands in patterns. Before this change: movsbl %sil, %eax movsbl %dil, %ecx imull %eax, %ecx movb %cl, %al sarb $7, %al movzbl %al, %eax movzbl %ch, %esi cmpl %eax, %esi setne %al After: movb %dil, %al imulb %sil seto %al Also, remove a made-redundant testcase for PR19858, and enable more FastISel ALU-overflow tests for SelectionDAG too. Differential Revision: http://reviews.llvm.org/D5809 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220516 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 21:55:31 +00:00
Sanjay Patel	d2153694e0	Handle sqrt() shrinking in SimplifyLibCalls like any other call This patch removes a chunk of special case logic for folding (float)sqrt((double)x) -> sqrtf(x) in InstCombineCasts and handles it in the mainstream path of SimplifyLibCalls. No functional change intended, but I loosened the restriction on the existing sqrt testcases to allow for this optimization even without unsafe-fp-math because that's the existing behavior. I also added a missing test case for not shrinking the llvm.sqrt.f64 intrinsic in case the result is used as a double. Differential Revision: http://reviews.llvm.org/D5919 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220514 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 21:52:45 +00:00
Peter Collingbourne	1ea06a0ead	Make llvm-go test dependency optional. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220503 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 19:51:40 +00:00
Kevin Enderby	a1afbd6421	Update llvm-objdump’s Mach-O symbolizer code for Objective-C references. This prints disassembly comments for Objective-C references to CFStrings, Selectors, Classes and method calls. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220500 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 19:37:31 +00:00
Rafael Espindola	a023f50940	Cleanup this test a bit. Use simpler names and remove unnecessary fields. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220499 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 19:36:21 +00:00
Rafael Espindola	c32d7489b0	Cleanup this test a bit. Use simpler names and remove unnecessary fields. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220498 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 19:23:42 +00:00
David Blaikie	129ab6c6c5	DebugInfo: Simplify/tidy/correct global variable decl/def emission handling. This fixes a bug (introduced by fixing the IR emitted from Clang where the definition of a static member would be scoped within the class, rather than within its lexical decl context) where the definition of a static variable would be placed inside a class. It also improves source fidelity by scoping static class member definitions inside the lexical decl context in which tehy are written (eg: namespace n { class foo { static int i; } int foo::i; } - the definition of 'i' will be within the namespace 'n' in the DWARF output now). Lastly, and the original goal, this reduces debug info size slightly (and makes debug info easier to read, etc) by placing the definitions of non-member global variables within their namespace, rather than using a separate namespace-scoped declaration along with a definition at global scope. Based on patches and discussion with Frédéric. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220497 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 19:12:43 +00:00
Rafael Espindola	b11d9944d9	Make this test a bit stricter. This now: * Forces the linker to include the internal definition. * Checks the full output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220495 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 18:52:46 +00:00
Rafael Espindola	74ff8164d7	Make this test a bit stricter. This now: * Forces the linker to include the internal definition. * Checks the full output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220494 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 18:44:07 +00:00
Reid Kleckner	e852ac7772	Revert "Don't count inreg params when mangling fastcall functions" This reverts commit r214981. I'm not sure what I was thinking when I wrote this. Testing with MSVC shows that this function is mangled to '@f@8': int __fastcall f(int a, int b); git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220492 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 17:50:42 +00:00
Renato Golin	06b11e36e5	Do not emit intermediate register for zero FP immediate This updates check for double precision zero floating point constant to allow use of instruction with immediate value rather than temporary register. Currently "a == 0.0", where "a" is of "double" type generates: vmov.i32 d16, #0x0 vcmpe.f64 d0, d16 With this change it becomes: vcmpe.f64 d0, #0 Patch by Sergey Dmitrouk. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220486 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 15:31:50 +00:00
NAKAMURA Takumi	effe629b3d	Revert r220427, "[Hexagon] Adding encoding bits for add opcode." It brought cyclic dependecy between HexagonAsmPrinter and HexagonDesc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220478 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 11:31:22 +00:00
Zoran Jovanovic	71832e7ed9	[mips][microMIPS] Implement ADDIUR1SP instruction Differential Revision: http://reviews.llvm.org/D5153 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220477 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 11:13:59 +00:00
Zoran Jovanovic	fd515137bc	ps][microMIPS] Implement ADDIUR2 instruction Differential Revision: http://reviews.llvm.org/D5151 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220476 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 11:06:34 +00:00
Zoran Jovanovic	f58c95aac0	ps][microMIPS] Implement LI16 instruction Differential Revision: http://reviews.llvm.org/D5149 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220475 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 10:59:24 +00:00
Zoran Jovanovic	558236adf0	[mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions Differential Revision: http://reviews.llvm.org/D5774 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220474 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 10:42:01 +00:00
Oliver Stannard	9982879c4e	[Thumb2] Improve disassembly of memory hints Currently, the ARM disassembler will disassemble the Thumb2 memory hint instructions (PLD, PLDW and PLI), even for targets which do not have these instructions. This patch adds the required checks to the disassmebler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220472 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 08:52:58 +00:00
Akira Hatanaka	be9668675a	[ARM, stack protector] If supported, use armv7 instructions. This commit enables using movt/movw to load the stack guard address: movw r0, :lower16:(L_g3$non_lazy_ptr-(LPC0_0+8)) movt r0, :upper16:(L_g3$non_lazy_ptr-(LPC0_0+8)) ldr r0, [pc, r0] Previously a pc-relative load was emitted: ldr r0, LCPI0_0 ldr r0, [pc, r0] rdar://problem/18740489 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220470 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 04:17:05 +00:00
Frederic Riss	9970b0fcaa	[dwarfdump] Dump DW_AT_ranges values inline in the debug_info dump. The output looks like that: DW_AT_ranges [FORM_data4] (0x00000000 [0x00000001000024a0 - 0x00000001000024c2) [0x0000000100002505 - 0x000000010000268b)) Differential Revision: http://reviews.llvm.org/D5712 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220466 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 04:08:34 +00:00
Peter Collingbourne	bdc3a5b99a	Add llvm-go tool. This tool lets us build LLVM components within the tree by setting up a $GOPATH that resembles a tree fetched in the normal way with "go get". It is intended that components such as the Go frontend will be built in-tree using this tool. Differential Revision: http://reviews.llvm.org/D5902 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220462 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 02:33:23 +00:00
Derek Schuff	5385df6bba	Fix Mips nacl-mask test for new bundle-aligned label behavior After r220439 the behavior of labels in bundle-align mode changed, and I neglected to update this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220447 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 23:32:00 +00:00
Derek Schuff	cdb105b62f	[MC] Attach labels to existing fragments instead of using a separate fragment Summary: Currently when emitting a label, a new data fragment is created for it if the current fragment isn't a data fragment. This change instead enqueues the label and attaches it to the next fragment (e.g. created for the next instruction) if possible. When bundle alignment is not enabled, this has no functionality change (it just results in fewer extra fragments being created). For bundle alignment, previously labels would point to the beginning of the bundle padding instead of the beginning of the emitted instruction. This was not only less efficient (e.g. jumping to the nops instead of past them) but also led to miscalculation of the address of the GOT (since MC uses a label difference rather than emitting a "." symbol). Fixes https://code.google.com/p/nativeclient/issues/detail?id=3982 Test Plan: regression test attached Reviewers: jvoung, eliben Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D5915 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220439 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 22:38:06 +00:00
Colin LeMahieu	545127f54d	[Hexagon] Adding encoding bits for add opcode. Adding llvm-mc tests. Removing unit tests. http://reviews.llvm.org/D5624 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220427 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 20:58:35 +00:00
Chad Rosier	fa16693864	[AArch64] Add support for the .inst directive. This has been implement using the MCTargetStreamer interface as is done in the ARM, Mips and PPC backends. Phabricator: http://reviews.llvm.org/D5891 PR20964 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220422 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 20:35:57 +00:00
Justin Bogner	6a369dbdd6	test: Make this test runnable in directories with @ in their names Jenkins likes to use directories with names involving the '@' character, which breaks the sed expression in this test. Switch to use '\|' on the assumption that it's less likely to show up in a path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220401 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 18:18:54 +00:00
Bill Schmidt	55321cd8a7	[PATCH] Support select-cc for VSFRC when VSX is enabled A previous patch enabled SELECT_VSRC and SELECT_CC_VSRC for VSX to handle <2 x double> cases. This patch adds SELECT_VSFRC and SELECT_CC_VSFRC to allow use of all 64 vector-scalar registers for the f64 type when VSX is enabled. The changes are analogous to those in the previous patch. I've added a new variant to vsx.ll to test the code generation. (I also cleaned up a little formatting in PPCInstrVSX.td from the previous patch.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220395 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 16:58:20 +00:00
Sanjay Patel	dc18ebc4b1	Shrinkify libcalls: use float versions of double libm functions with fast-math (bug 17850) When a call to a double-precision libm function has fast-math semantics (via function attribute for now because there is no IR-level FMF on calls), we can avoid fpext/fptrunc operations and use the float version of the call if the input and output are both float. We already do this optimization using a command-line option; this patch just adds the ability for fast-math to use the existing functionality. I moved the cl::opt from InstructionCombining into SimplifyLibCalls because it's only ever used internally to that class. Modified the existing test cases to use the unsafe-fp-math attribute rather than repeating all tests. This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850 Differential Revision: http://reviews.llvm.org/D5893 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220390 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 15:29:23 +00:00
Diego Novillo	79ca2c234d	Change error to warning when a profile cannot be found. When the profile for a function cannot be applied, we use to emit an error. This seems extreme. The compiler can continue, it's just that the optimization opportunities won't include profile information. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220386 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 13:36:35 +00:00
Diego Novillo	418d0cf1c1	Support using sample profiles with partial debug info. Summary: When using a profile, we used to require the use -gmlt so that we could get access to the line locations. This is used to match line numbers in the input profile to the line numbers in the function's IR. But this is actually not necessary. The driver can provide source location tracking without the emission of debug information. In these cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the actual line location annotations are still present. This patch adds a new way of looking for the start of the current function. Instead of looking through the compile units in llvm.dbg.cu, we can walk up the scope for the first instruction in the function with a debug loc. If that describes the function, we use it. Otherwise, we keep looking until we find one. If no such instruction is found, we then give up and produce an error. Reviewers: echristo, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5887 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220382 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 12:59:00 +00:00
Arnaud A. de Grandmaison	c9ada07cca	[AArch64] Cleanup A57PBQPConstraints And add a long awaited testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220381 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 12:40:20 +00:00
Bruno Cardoso Lopes	5165ff9878	[InstSimplify] Support constant folding to vector of pointers ConstantFolding crashes when trying to InstSimplify the following load: @a = private unnamed_addr constant %mst { i8* inttoptr (i64 -1 to i8), i8 inttoptr (i64 -1 to i8) }, align 8 %x = load <2 x i8>* bitcast (%mst* @a to <2 x i8>), align 8 This patch fix this by adding support to this type of folding: %x = load <2 x i8> bitcast (%mst* @a to <2 x i8>), align 8 ==> gets folded to: %x = <2 x i8> <i8 inttoptr (i64 -1 to i8), i8 inttoptr (i64 -1 to i8*)> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220380 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 12:18:48 +00:00
Jyoti Allur	8546076401	[Thumb/Thumb2] Implement restrictions on SP in register list on LDM, STM variants in thumb mode git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220379 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 10:41:14 +00:00
Matt Arsenault	9b0ace6364	R600/SI: Add another failing testcase for i1 copies It's not handling phis. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220371 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 05:30:42 +00:00
Matt Arsenault	69643f47ad	R600/SI: Add failing testcase reduced from OpenCV This fails the verifier with: "Expected a VCSrc_32 register, but got a VReg_1 register" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220368 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 04:26:10 +00:00
Rafael Espindola	230b53fd1b	Handle spaces and quotes in file names in MRI scripts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220364 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-22 03:10:56 +00:00
Hans Wennborg	ec4e924836	Revert "Teach the load analysis to allow finding available values which require" (r220277) This seems to have caused PR21330. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220349 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 23:49:52 +00:00
Lang Hames	8c4980b307	[MCJIT] Defer application of AArch64 MachO GOT relocations until resolve time. On AArch64, GOT references are page relative (ADRP + LDR), so they can't be applied until we know exactly where, within a page, the GOT entry will be in the target address space. Fixes <rdar://problem/18693976>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220347 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 23:41:15 +00:00
Rafael Espindola	f6b403a753	MRI scripts: Add addlib support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220346 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 23:18:51 +00:00
Matt Arsenault	015776f38c	Add minnum / maxnum codegen git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220342 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 23:01:01 +00:00
Matt Arsenault	252134602f	Add minnum / maxnum intrinsics These are named following the IEEE-754 names for these functions, rather than the libm fmin / fmax to avoid possible ambiguities. Some languages may implement something resembling fmin / fmax which return NaN if either operand is to propagate errors. These implement the IEEE-754 semantics of returning the other operand if either is a NaN representing missing data. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220341 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 23:00:20 +00:00
Matt Arsenault	c68710c02d	R600/SI: Add missing parameter to div_fmas intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220338 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 22:20:55 +00:00
Rafael Espindola	e539932aad	Overwrite instead of adding to archives when creating them in mri scripts. This matches the behavior of GNU ar and also makes it easier to implemnt support for the addlib command. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220336 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 21:56:47 +00:00
Matt Arsenault	3605d74d23	R600: Use default GlobalDirective The overridden one wasn't inserting a space, so you would end up with .globalfoo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220329 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 21:08:36 +00:00
Arnaud A. de Grandmaison	de246de958	[PBQP] Teach PassConfig to tell if the default register allocator is used. This enables targets to adapt their pass pipeline to the register allocator in use. For example, with the AArch64 backend, using PBQP with the cortex-a57, the FPLoadBalancing pass is no longer necessary. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220321 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 20:47:22 +00:00
Arnaud A. de Grandmaison	5eb02a6c6a	[PBQP] Add a testcase for r220302: Fix coalescing benefits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220316 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 20:10:21 +00:00
David Majnemer	dea8105323	InstCombine: Simplify FoldICmpCstShrCst This function was complicated by the fact that it tried to perform canonicalizations that were already preformed by InstSimplify. Remove this extra code and move the tests over to InstSimplify. Add asserts to make sure our preconditions hold before we make any assumptions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220314 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 19:51:55 +00:00
Rafael Espindola	4af7ead7bd	Drop support for an old version of ld64 (from darwin 9). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220310 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 18:31:09 +00:00
Rafael Espindola	33f014f298	Convert two tests to use llvm-readobj. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220308 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 18:24:31 +00:00
Matt Arsenault	8287a4b564	R600/SI: Add pattern for bswap git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220304 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 16:25:08 +00:00
Rafael Espindola	2afb0e4d2c	Add support for addmod to mri scripts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220294 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 14:46:17 +00:00
Bill Schmidt	41454cc88b	[PowerPC] Avoid VSX FMA mutate when killed product reg = addend reg With VSX enabled, test/CodeGen/PowerPC/recipest.ll exposes a bug in the FMA mutation pass. If we have a situation where a killed product register is the same register as the FMA target, such as: %vreg5<def,tied1> = XSNMSUBADP %vreg5<tied0>, %vreg11, %vreg5, %RM<imp-use>; VSFRC:%vreg5 F8RC:%vreg11 then the substitution makes no sense. We end up getting a crash when we try to extend the interval associated with the killed product register, as there is already a live range for %vreg5 there. This patch just disables the mutation under those circumstances. Since recipest.ll generates different code with VMX enabled, I've modified that test to use -mattr=-vsx. I've borrowed the code from that test that exposed the bug and placed it in fma-mutate.ll, where it tests several mutation opportunities including the "bad" one. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220290 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 13:02:37 +00:00
Oliver Stannard	00e0b8a016	[ARM] NEON 32-bit scalar moves are also available in VFPv2 The 32-bit variants of the NEON scalar<->GPR move instructions are also available in VFPv2. The 8- and 16-bit variants do require NEON. Note that the checks in the test file are all -DAG because they are checking a mixture of stdout and stderr, and the ordering is not guaranteed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220288 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 11:49:14 +00:00
Yuri Gorshenin	7ffc5bb51a	[asan-asm-instrumentation] Fixed memory accesses with rbp as a base or an index register. Summary: Fixed memory accesses with rbp as a base or an index register. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5819 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220283 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 10:22:27 +00:00
Oliver Stannard	d75e7ad0c8	[Thumb2] LDRS?[BH] cannot load to the PC The Thumb2 LDRS?[BH] instructions are not valid when the destination register is the PC (these encodings are used for preload hints). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220278 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 09:14:15 +00:00
Chandler Carruth	9156c5e3ba	Teach the load analysis to allow finding available values which require inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220277 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 09:00:40 +00:00
Zoran Jovanovic	59e16813d2	[mips][microMIPS] Implement ADDU16 and SUBU16 instructions Differential Revision: http://reviews.llvm.org/D5118 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220276 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 08:44:58 +00:00
Zoran Jovanovic	a245b68293	[mips][microMIPS] Implement AND16, NOT16, OR16 and XOR16 instructions Differential Revision: http://reviews.llvm.org/D5117 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220275 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 08:32:40 +00:00
Rafael Espindola	45968c54e9	Fix a bit of confusion about .set and produce more readable assembly. Every target we support has support for assembly that looks like a = b - c .long a What is special about MachO is that the above combination suppresses the production of a relocation. With this change we avoid producing the intermediary labels when they don't add any value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220256 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 01:17:30 +00:00
Paul Robinson	f8c9d3d3c2	Do not attribute static allocas to the call site's DebugLoc. When functions are inlined, instructions without debug information are attributed to the call site's DebugLoc. After inlining, inlined static allocas are moved to the caller's entry block, adjacent to the caller's original static alloca instructions. By retaining the call site's DebugLoc, these instructions could cause instructions that were subsequently inserted at the entry block to pick up the same DebugLoc. Patch by Wolfgang Pieb! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220255 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 01:00:55 +00:00
Rafael Espindola	f46dd92ba4	Make this test a bit more strict. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220253 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 00:47:49 +00:00
Chandler Carruth	14e55b16df	Teach lit to filter the host LDFLAGS down from the build system and into the CGO build environment. This lets things like -rpath propagate down to the C++ code that is built along side the Go bindings when testing them. Patch by Peter Collingbourne, and verified that it works by me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220252 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 00:36:28 +00:00
Lang Hames	acaf8f5618	[MCJIT] Temporarily revert r220245 - it broke several bots. (See e.g. http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/17653) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220249 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 00:24:02 +00:00
Philip Reames	c0c4f1b78e	Extend the verifier to validate range metadata on calls and invokes. Range metadata applies to loads, call, and invokes. We were validating that metadata applied to loads was correct according to the LangRef, but we were not validating metadata applied to calls or invokes. This change extracts the checking functionality to a common location, reuses it for all valid locations, and adds a simple test to ensure a misused range on a call gets reported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220246 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 23:52:07 +00:00
Lang Hames	32aaaeaa05	[MCJIT] Make MCJIT honor symbol visibility settings when populating the global symbol table. Patch by Anthony Pesch. Thanks Anthony! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220245 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 23:39:54 +00:00
Quentin Colombet	a37862e2de	[X86] Fix a bug in the lowering of the mask of VSELECT. X86 code to lower VSELECT messed a bit with the bits set in the mask of VSELECT when it knows it can be lowered into BLEND. Indeed, only the high bits need to be set for those and it optimizes those accordingly. However, when the mask is a compile time constant, the lowering will be handled by the generic optimizer and those modifications will generate bad code in the generic optimizer. This patch fixes that by preventing the optimization if the VSELECT will be handled by the generic optimizer. <rdar://problem/18675020> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220242 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 23:13:30 +00:00
Philip Reames	90f3f15da5	Introduce a 'nonnull' metadata on Load instructions. The newly introduced 'nonnull' metadata is analogous to existing 'nonnull' attributes, but applies to load instructions rather than call arguments or returns. Long term, it would be nice to combine these into a single construct. The value of the load is allowed to vary between successive loads, but null is not a valid value to be loaded by any load marked nonnull. Reviewed by: Hal Finkel Differential Revision: http://reviews.llvm.org/D5220 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220240 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 22:40:55 +00:00
Simon Pilgrim	0d1978b813	[X86] Memory folding for commutative instructions (updated) This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning. Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt. Added additional regression test case provided by Joerg Sonnenberger. Differential Revision: http://reviews.llvm.org/D5818 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220239 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 22:14:22 +00:00
Tim Northover	4e399f4500	ARM: rework Thumb1 frame index rewriting The previous code had a few problems, motivating the choices here. 1. It could create instructions clobbering CPSR, but the incoming MachineInstr didn't reflect this. A potential source of corruption. This is why the patch has a new PseudoInst for before lowering. 2. Similarly, there was some code to handle the incoming instruction not being ARMCC::AL, but this would have caused massive problems if it was actually invoked when a complex offset needing more than one instruction was requested. 3. It wasn't designed to handle unaligned pointers (or offsets). These should probably be minimised anyway, but the code needs to deal with them properly regardless. 4. It had some rather dubious ad-hoc code to avoid calling emitThumbRegPlusImmediate, a function which should be designed to do precisely this job. We seem to cover the common cases correctly now, and hopefully can enhance emitThumbRegPlusImmediate to handle any extra optimisations we need to add in future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220236 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 21:28:41 +00:00
Gerolf Hoflehner	1591cf0cef	[AArch64] test case for compfail fixed by r219748 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220206 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 16:08:33 +00:00
Oliver Stannard	e7c9c44387	[Thumb2] RFE, SRS and "SUBS pc, lr" are undefined on v7M These instructions are related to the v7[AR] exception model, and are not defined on v7M. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220204 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 15:37:35 +00:00
Oliver Stannard	19d010b851	[ARM] Do not select SMULW[BT] or SMLAW[BT] The current instruction selection patterns for SMULW[BT] and SMLAW[BT] are incorrect. These instructions multiply a 32-bit and a 16-bit value (both signed) and return the top 32 bits of the 48-bit result. This preserves the 16 bits of overflow, whereas the patterns they currently match truncate the result to 16 bits then sign extend. To select these instructions, we would need to match an ISD::SMUL_LOHI, a sign extend, two shifts and an or. There is no way to match SMUL_LOHI in an instruction pattern as it defines multiple values, so this would have to be done in C++. I have raised http://llvm.org/bugs/show_bug.cgi?id=21297 to cover allowing correct selection of these instructions. This fixes http://llvm.org/bugs/show_bug.cgi?id=19396 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220196 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 11:30:35 +00:00
Oliver Stannard	508c39393a	[Thumb] Fix crash in Thumb1RegisterInfo::rewriteFrameIndex This function can, for some offsets from the SP, split one instruction into two. Since it re-uses the original instruction as the first instruction of the result, we need ensure its result register is not marked as dead before we use it in the second instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220194 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 11:00:18 +00:00
Chandler Carruth	3ac929c473	Fix a miscompile introduced in r220178. The original code had an implicit assumption that if the test for allocas or globals was reached, the two pointers were not equal. With my changes to make the pointer analysis more powerful here, I also had to guard against circumstances where the results weren't useful. That in turn violated the assumption and gave rise to a circumstance in which we could have a store with both the queried pointer and stored pointer rooted at the same alloca. Clearly, we cannot ignore such a store. There are other things we might do in this code to better handle the case of both pointers ending up at the same alloca or global, but it seems best to at least make the test explicit in what it intends to check. I've added tests for both the alloca and global case here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220190 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 10:03:01 +00:00
Chandler Carruth	080dfb5bda	Fix a somewhat subtle pair of issues with JumpThreading I introduced in r220178. First, the creation routine doesn't insert prior to the terminator of the basic block provided, but really at the end of the basic block. Instead, get the terminator and insert before that. The next issue was that we need to ensure multiple PHI node entries for a single predecessor re-use the same cast instruction rather than creating new ones. All of the logic here was without tests previously. I've reduced and added a test case from the test suite that crashed without both of these fixes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220186 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 05:34:36 +00:00
Chandler Carruth	35c4e071be	Teach the load analysis driving core instcombine logic and other bits of logic to look through pointer casts, making them trivially stronger in the face of loads and stores with intervening pointer casts. I've included a few test cases that demonstrate the kind of folding instcombine can do without pointer casts and then variations which obfuscate the logic through bitcasts. Without this patch, the variations all fail to optimize fully. This is more important now than it has been in the past as I've started moving the load canonicialization to more closely follow the value type requirements rather than the pointer type requirements and thus this needs to be prepared for more pointer casts. When I made the same change to stores several test cases regressed without logic along these lines so I wanted to systematically improve matters first. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220178 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 00:24:14 +00:00
Chandler Carruth	fc1c1ec435	Add a datalayout string to this test so that it exercises the full gamut of InstCombine rather than just the bits enabled when datalayout is optional. The primary fixes here are because now things are little endian. In good news, silliness like this seems like it will be going away as we've got pretty stong consensus on dropping optional datalayout entirely. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220176 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 00:11:31 +00:00
Bill Schmidt	551a3d7b56	[PowerPC] Clean up -mattr=+vsx tests to always specify -mcpu We recently discovered an issue that reinforces what a good idea it is to always specify -mcpu in our code generation tests, particularly for -mattr=+vsx. This patch ensures that all tests that specify -mattr=+vsx also specify -mcpu=pwr7 or -mcpu=pwr8, as appropriate. Some of the uses of -mattr=+vsx added recently don't make much sense (when specified for -mtriple=powerpc-apple-darwin8 or -march=ppc32, for example). For cases like this I've just removed the extra VSX test commands; there's enough coverage without them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220173 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 21:29:21 +00:00
Bill Schmidt	1d142d4d47	[PowerPC] Temporarily disable VSX for PowerPC fast-isel tests Patch by Bill Seurer; some comment formatting changes by me. There are a few PowerPC test cases for FastISel support that currently fail with VSX support enabled. The temporary workaround under discussion in http://reviews.llvm.org/D5362 helps, but the tests still fail because they specify -fast-isel-abort, and the VSX workaround punts back to SelectionDAG. We have plans to fix FastISel permanently for VSX, but until that's in place these tests are preventing us from enabling VSX by default. Therefore we are adding -mattr=-vsx to these tests until the full support is ready. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220172 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 20:48:47 +00:00
Bill Schmidt	ff8acb69e5	[PowerPC] Re-enable VSX test line for fma.ll with -mcpu=pwr7 The VSX testing variant in test/CodeGen/PowerPC/fma.ll had to be disabled because of unexpected behavior on many of the builders. I tracked this down to a situation that occurs when the VSX attribute is enabled for a target that disables the MI early scheduling pass. This patch adds -mcpu=pwr7 to make this predictable. The other issue will be addressed separately. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220171 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 20:27:56 +00:00
Chandler Carruth	63276ccdbd	Do a better and more complete job of preserving metadata when combining loads. This handles many more cases than just the AA metadata, some of them suggested by Hal in his review of the AA metadata handling patch. I've tried to test this behavior where tractable to do so. I'll point out that I have specifically not included a test for debuginfo because it was going to require 2 or 3 times as much work to craft some input which would survive the "helpful" stripping of debug info metadata that doesn't match the desired schema. This is another good example of why the current state of write-ability for our debug info metadata is unacceptable. I spent over 30 minutes trying to conjure some test case that would survive, even copying from other debug info tests, but it always failed to survive with no explanation of why or how I might fix it. =[ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220165 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 10:46:46 +00:00
Chandler Carruth	4d2a706176	Move previously dead code to handle computing the known bits of an alias up to where it actually works as intended. The problem is that a GlobalAlias isa GlobalValue and so the prior block handled all of the cases. This allows us to constant fold based on the actual constant expression in the global alias. As an example, see the last function in the newly added test case which explicitly aligns an unaligned pointer using constant expression math. Without this change, we fail to see that and fold an alignment test to zero. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220164 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 09:06:56 +00:00
David Majnemer	0fd4e2e5a1	InstCombine: (sub (or A B) (xor A B)) --> (and A B) The following implements the transformation: (sub (or A B) (xor A B)) --> (and A B). Patch by Ankur Garg! Differential Revision: http://reviews.llvm.org/D5719 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220163 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 08:32:32 +00:00
David Majnemer	242aeb9d84	InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1 The following implements the optimization for sequences of the form: icmp eq/ne (shl Const2, A), Const1 Such sequences can be transformed to: icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2)) This handles only the equality operators for now. Other operators need to be handled. Patch by Ankur Garg! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220162 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 08:23:08 +00:00
Chandler Carruth	908d4514f6	Fix a long-standing miscompile in the load analysis that was uncovered by my refactoring of this code. The method isSafeToLoadUnconditionally assumes that the load will proceed with the preferred type alignment. Given that, it has to ensure that the alloca or global is at least that aligned. It has always done this historically when a datalayout is present, but has never checked it when the datalayout is absent. When I refactored the code in r220156, I exposed this path when datalayout was present and that turned the latent bug into a patent bug. This fixes the issue by just removing the special case which allows folding things without datalayout. This isn't worth the complexity of trying to tease apart when it is or isn't safe without actually knowing the preferred alignment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220161 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-19 08:17:50 +00:00
Chandler Carruth	797e9b812e	Preserve AA metadata when combining (cast (load (...))) -> (load (cast (...))). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220141 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 11:00:12 +00:00
Chandler Carruth	9b2d091a9c	[InstCombine] Do an about-face on how LLVM canonicalizes (cast (load ...)) and (load (cast ...)): canonicalize toward the former. Historically, we've tried to load using the type of the pointer, and tried to match that type as closely as possible removing as many pointer casts as we could and trading them for bitcasts of the loaded value. This is deeply and fundamentally wrong. Repeat after me: memory does not have a type! This was a hard lesson for me to learn working on SROA. There is only one thing that should actually drive the type used for a pointer, and that is the type which we need to use to load from that pointer. Matching up pointer types to the loaded value types is very useful because it minimizes the physical size of the IR required for no-op casts. Similarly, the only thing that should drive the type used for a loaded value is how that value is used! Again, this minimizes casts. And in fact, the only thing motivating types in any part of LLVM's IR are the types used by the operations in the IR. We should match them as closely as possible. I've ended up removing some tests here as they were testing bugs or behavior that is no longer present. Mostly though, this is just cleanup to let the tests continue to function as intended. The only fallout I've found so far from this change was SROA and I have fixed it to not be impeded by the different type of load. If you find more places where this change causes optimizations not to fire, those too are likely bugs where we are assuming that the type of pointers is "significant" for optimization purposes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220138 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 06:36:22 +00:00
Chandler Carruth	a45d0c8bd1	Remove a test that was ported from the old llvm-gcc frontend test suite. This test is pretty awesome. It is claiming to test devirtualization. However, the code in question is not in fact devirtualized by LLVM. If you take the original C++ test case and run it through Clang at -O3 we fail to devirtualize it completely. It also isn't a sufficiently focused test case. The reason we fail to devirtualize it isn't because of any missing instcombine though. Instead, it is because we fail to emit an available externally vtable and thus the vtable is just an external and completely opaque. If I cause the vtable to be emitted, we successfully devirtualize things. Anyways, I'm just removing it because it is providing negative value at this point: it isn't representative of the output of Clang really, LLVM isn't doing the transform it claims to be testing, LLVM's failure to do the transform isn't actually an LLVM bug at all and we shouldn't be testing for it here, and finally the test is written in such a way that it will trivially pass even when the point of the test is failing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220137 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 06:36:18 +00:00
Nick Kledzik	5357e3ae1b	[llvm-objdump] don't test timestamp dump as that is time zone dependent git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220123 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 02:28:01 +00:00
Nick Kledzik	e0b3f29da9	[llvm-objdump] enhance test case for mach-o -private-headers git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220120 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 01:50:55 +00:00
Nick Kledzik	50ede1623d	[llvm-objdump] Fix mach-o binding decompression error git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220119 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 01:21:02 +00:00
Chandler Carruth	2402e6315d	[SROA] Change how SROA does vector-based promotion of allocas to handle cases where the alloca type, the load types, and the store types used all disagree. Previously, the only way that vector-based promotion occured was if the alloca type was a vector type. This was one of the very few remaining uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns out it was a bad idea. The alloca type can change very easily based on the mixture of types loaded and stored to that alloca. We shouldn't be relying on it as a signal for very much. Instead, the source of truth should be loads and stores. We should canonicalize the loads and stores as much as possible and then rely on them exclusively in SROA. When looking and loads and stores, we may find many different candidate vector types. This change will let SROA try all of them to find a vector type which is a viable way to promote the entire alloca to a vector register. With this change, it becomes possible to do better canonicalization and optimization of loads and stores without breaking SROA in random ways, and that should allow fixing a core source of performance loss in hot numerical loops such as those in Eigen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220116 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-18 00:44:02 +00:00
Aaron Watry	4e00650f58	R600/SI: Add global atomicrmw xchg v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220110 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:33:03 +00:00
Aaron Watry	2107be5bc7	R600/SI: Add global atomicrmw xor v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220109 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:33:01 +00:00
Aaron Watry	e81b68b86c	R600/SI: Add global atomicrmw or v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220108 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:59 +00:00
Aaron Watry	1883b51d2e	R600/SI: Add global atomicrmw min/umin v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220107 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:57 +00:00
Aaron Watry	387e397ecd	R600/SI: Add global atomicrmw max/umax v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220106 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:56 +00:00
Aaron Watry	beac0c1403	R600/SI: Add global atomicrmw and v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220105 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:54 +00:00
Aaron Watry	892bb7df98	R600/SI: Add global atomicrmw sub v2: Add separate offset/no-offset tests Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220104 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:52 +00:00
Aaron Watry	d9f9b51223	R600/SI: Fix/add tests for atomicrmw add The previous tests claimed to test constant offsets in the function name, but the tests weren't actually testing them. Clone the tests, and do testing of all combinations of the following: 1) with/without constant pointer offset 2) 32/64-bit addressing modes 3) Usage and non-usage of the return value from the atomicrmw Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220103 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:50 +00:00
Aaron Watry	802463d861	R600: Rename atomic_load global tests to atomic_add The function name now matches what it's actually testing. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220102 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:32:49 +00:00
Evgeniy Stepanov	c83c81a62e	[msan] Fix handling of byval arguments with large alignment. MSan param-tls slots are 8-byte aligned. This change clips alignment of memcpy into param-tls to 8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220101 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 23:29:44 +00:00
Pete Cooper	185a992dcf	Check for dynamic alloca's when selecting lifetime intrinsics. TL;DR: Indexing maps with [] creates missing entries. The long version: When selecting lifetime intrinsics, we index the static alloca map with the AllocaInst we find for that lifetime. Trouble is, we don't first check to see if this is a dynamic alloca. On the attached example, this causes a dynamic alloca to create an entry in the static map, and returns 0 (the default) as the frame index for that lifetime. 0 was used for the frame index of the stack protector, which given that it now has a lifetime, is coloured, and merged with other stack slots. PEI would later trigger an assert because it expects the stack protector to not be dead. This fix ensures that we only get frame indices for static allocas, ie, those in the map. Dynamic ones are effectively dropped, which is suboptimal, but at least isn't completely broken. rdar://problem/18672951 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220099 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 22:59:33 +00:00
Bill Schmidt	3b362b3568	[PowerPC] Disable +vsx RUN line for fma.ll due to inconsistency on other builders git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220094 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 21:32:22 +00:00
Rafael Espindola	ec51f45338	Revert "TRE: make TRE a bit more aggressive" This reverts commit r219899. This also updates byval-tail-call.ll to make it clear what was breaking. Adding r219899 again will cause the load/store to disappear. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220093 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 21:25:48 +00:00
Bill Schmidt	d2dcbd00f7	[PowerPC] Change liveness testing in VSX FMA mutation pass With VSX enabled, LLVM crashes when compiling test/CodeGen/PowerPC/fma.ll. I traced this to the liveness test that's revised in this patch. The interval test is designed to only work for virtual registers, but in this case the AddendSrcReg is physical. Since there is already a walk of the MIs between the AddendMI and the FMA, I added a check for def/kill of the AddendSrcReg in that loop. At Hal Finkel's request, I converted the liveness test to an assert restricted to virtual registers. I've changed the fma.ll test to have VSX and non-VSX variants so we can test both kinds of multiply-adds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220090 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 21:02:44 +00:00
Peter Collingbourne	560e2700e2	Disable ccache for go tests. Should fix llvm-clang-lld-x86_64-debian-fast bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220071 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 18:32:36 +00:00
Matt Arsenault	7d8f1710a3	R600/SI: Allow commuting with source modifiers git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220066 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 18:00:48 +00:00
Matt Arsenault	84895bd2e6	R600/SI: Allow comuting fp immediates git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220062 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 18:00:39 +00:00
Peter Collingbourne	e7b03ee85c	We also need to catch OSError here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:46:46 +00:00
Matt Arsenault	46b53c9e4b	R600/SI: Remove SI_BUFFER_RSRC pseudo Just use REG_SEQUENCE directly, so there are fewer instructions to need to deal with later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220056 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:42:56 +00:00
Juergen Ributzka	32ef68718d	[Stackmaps] Enable invoking the patchpoint intrinsic. Patch by Kevin Modzelewski Reviewers: atrick, ributzka Reviewed By: ributzka Subscribers: llvm-commits, reames Differential Revision: http://reviews.llvm.org/D5634 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220055 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:39:00 +00:00
Andrea Di Biagio	5512b50db5	[X86] Fix missed selection of non-temporal store of zero vector. When the input to a store instruction was a zero vector, the backend always selected a normal vector store regardless of the non-temporal hint. This is fixed by this patch. This fixes PR19370. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220054 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:27:06 +00:00
James Molloy	7023b85187	[AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering. We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same. While we're at it, avoid writing to std::vector::end()... Bug found with random testing and a lot of coffee. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220051 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:06:31 +00:00
Rafael Espindola	ad8eef5a90	Don't crash if find_executable return None. This was crashing when trying to run the tests on Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220048 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 16:07:43 +00:00
Bill Schmidt	b76f5ba103	[PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64 types, but does not yet use lxvw4x and stxvw4x for 4x32 types. This patch adds that support. As with lxvd2x/stxvd2x, this involves straightforward overriding of the patterns normally recognized for lvx/stvx, with preference given to the VSX patterns when VSX is enabled. In addition, the logic for permitting misaligned memory accesses is modified so that v4r32 and v4i32 are treated the same as v2f64 and v2i64 when VSX is enabled. Finally, the DAG generation for unaligned loads is changed to just use a normal LOAD (which will become lxvw4x) on P8 and later hardware, where unaligned loads are preferred over lvsl/lvx/lvx/vperm. A number of tests now generate the VSX loads/stores instead of lvx/stvx, so this patch adds VSX variants to those tests. I've also added <4 x float> tests to the vsx.ll test case, and created a vsx-p8.ll test case to be used for testing code generation for the P8Vector feature. For now, that simply tests the unaligned load/store behavior. This has been tested along with a temporary patch to enable the VSX and P8Vector features, with no new regressions encountered with or without the temporary patch applied. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220047 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 15:13:38 +00:00
Jan Vesely	9c1d1fa266	R600: Add EG to FMA test Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220045 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 14:45:27 +00:00
Jan Vesely	cef793e8c7	SelectionDAG: Add sext_inreg optimizations v2: use dyn_cast fixup comments v3: use cast Reviewed-by: Matt Arsenault <arsenm2@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220044 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 14:45:25 +00:00
Vasileios Kalintiris	eaf8f5efe9	[mips] Add support for COP1's Branch-On-Cond-Likely instructions Summary: Depends on D5782 Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5802 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220042 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 14:08:28 +00:00
Vasileios Kalintiris	0f22fe9b56	[mips] Add support for COP0's Branch-On-Cond-Likely instructions Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5782 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220036 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 12:38:35 +00:00
Hal Finkel	9d85eff56a	[DSE] Remove no-data-layout-only type-based overlap checking DSE's overlap checking contained special logic, used only when no DataLayout was available, which inferred a complete overwrite when the pointee types were equal. This logic seems fine for regular loads/stores, but does not work for memcpy and friends. Instead of fixing this, I'm just removing it. Philosophically, transformations should not contain enhanced behavior used only when data layout is lacking (data layout should be strictly additive), and maintaining these rarely-tested code paths seems not worthwhile at this stage. Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case (slightly reduced from that provided by Aliaksei) replaces the original contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few other tests have been updated to have a data layout. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220035 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 11:56:00 +00:00
Rafael Espindola	d8ee23f34c	Add back commits r219835 and a fixed version of r219829. The only difference from r219829 is using getOrCreateSectionSymbol(*ELFSec) instead of GetOrCreateSymbol(ELFSec->getSectionName()) in ELFObjectWriter which causes us to use the correct section symbol even if we have multiple sections with the same name. Original messages: r219829: Correctly handle references to section symbols. When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. r219835: Allow forward references to section symbols. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220021 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:48:58 +00:00
Bill Schmidt	101025c33d	[PPC] Adjust some PowerPC tests to account for presence/absence of VSX Patch by Bill Seurer; committed on his behalf. These test cases generate slightly different code sequences when VSX is activated and thus fail. The update turns off VSX explicitly for the existing checks and then adds a second set of checks for most of them that test the VSX instruction output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220019 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:41:22 +00:00
Rafael Espindola	410bde5171	Add a test that would have found the bug in r219829. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220016 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:34:23 +00:00
Akira Hatanaka	1cdebe50c1	ARM: Fix a bug which was causing convergence failure in constant-island pass. The bug is in ARMConstantIslands::createNewWater where the upper bound of the new water split point is computed: // This could point off the end of the block if we've already got constant // pool entries following this block; only the last one is in the water list. // Back past any possible branches (allow for a conditional and a maximally // long unconditional). if (BaseInsertOffset + 8 >= UserBBI.postOffset()) { BaseInsertOffset = UserBBI.postOffset() - UPad - 8; DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset)); } The split point is supposed to be somewhere between the machine instruction that loads from the constant pool entry and the end of the basic block, before branch instructions. The code above is fine if the basic block is large enough and there are a sufficient number of instructions following the machine instruction. However, if the machine instruction is near the end of the basic block, BaseInsertOffset can point to the machine instruction or another instruction that precedes it, and this can lead to convergence failure. This commit fixes this bug by ensuring BaseInsertOffset is larger than the offset of the instruction following the constant-loading instruction. rdar://problem/18581150 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220015 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:31:47 +00:00
Rafael Espindola	70a1be3f76	Revert commit r219835 and r219829. Revert "Correctly handle references to section symbols." Revert "Allow forward references to section symbols." Rui found a regression I am debugging. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220010 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:06:02 +00:00
Peter Zotov	46b94aa80e	[OCaml] Add Llvm.instr_clone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220008 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:02:40 +00:00
Alexander Potapenko	0fea775e5c	[llvm-symbolizer] Introduce the -dsym-hint option. llvm-symbolizer will consult one of the .dSYM paths passed via -dsym-hint if it fails to find the .dSYM bundle at the default location. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220004 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 00:50:19 +00:00
Peter Collingbourne	770f3af232	Add our own copy of the find_executable function to cope with installations that do not have the distutils.spawn package. Should hopefully fix the aarch64 buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219991 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 23:43:20 +00:00
Peter Collingbourne	798ace2e58	Initial version of Go bindings. This code is based on the existing LLVM Go bindings project hosted at: https://github.com/go-llvm/llvm Note that all contributors to the gollvm project have agreed to relicense their changes under the LLVM license and submit them to the LLVM project. Differential Revision: http://reviews.llvm.org/D5684 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219976 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 22:48:02 +00:00
Robin Morisset	d310963833	Erase fence insertion from SelectionDAGBuilder.cpp (NFC) Summary: Backends can use setInsertFencesForAtomic to signal to the middle-end that montonic is the only memory ordering they can accept for stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger ordering to fences + monotonic accesses is currently living in SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it for several reasons: - There is lots of redundancy to avoid: extremely similar logic already exists in AtomicExpand. - The current code in SelectionDAGBuilder does not use any target-hooks, it does the same transformation for every backend that requires it - As a result it is plain unsound, as it was apparently designed for ARM. It happens to mostly work for the other targets because they are extremely conservative, but Power for example had to switch to AtomicExpand to be able to use lwsync safely (see r218331). - Because it produces IR-level fences, it cannot be made sound ! This is noted in the C++11 standard (section 29.3, page 1140): ``` Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering semantics. ``` It can also be seen by the following example (called IRIW in the litterature): ``` atomic<int> x = y = 0; int r1, r2, r3, r4; Thread 0: x.store(1); Thread 1: y.store(1); Thread 2: r1 = x.load(); r2 = y.load(); Thread 3: r3 = y.load(); r4 = x.load(); ``` r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst. But if they are lowered to monotonic accesses, no amount of fences can prevent it.. This patch does three things (I could cut it into parts, but then some of them would not be tested/testable, please tell me if you would prefer that): - it provides a default implementation for emitLeadingFence/emitTrailingFence in terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder. As we saw above, this is unsound, but the best that can be done without knowing the targets well (and there is a comment warning about this risk). - it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default implementation (that exactly replicates the logic of SelectionDAGBuilder, so no functional change) - it finally erase this logic from SelectionDAGBuilder as it is dead-code. Ideally, each target would define its own override for emitLeading/TrailingFence using target-specific fences, but I do not know the Sparc/Mips/XCore memory model well enough to do this, and they appear to be dealing fine with the ARM-inspired default expansion for now (probably because they are overly conservative, as Power was). If anyone wants to compile fences more agressively on these platforms, the long comment should make it clear why he should first override emitLeading/TrailingFence. Test Plan: make check-all, no functional change Reviewers: jfb, t.p.northover Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5474 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219957 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 20:34:57 +00:00
Matt Arsenault	0134a9bed3	R600: Fix nonsensical implementation of computeKnownBits for BFE This was resulting in invalid simplifications of sdiv git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219953 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 20:07:40 +00:00
Rafael Espindola	2f8f1d34e3	Delete -std-compile-opts. These days -std-compile-opts was just a silly alias for -O3. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219951 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 20:00:02 +00:00
Bjorn Steinbrink	6eaa62af77	Allow call-slop optzn for destinations with a suitable dereferenceable attribute Summary: Currently, call slot optimization requires that if the destination is an argument, the argument has the sret attribute. This is to ensure that the memory access won't trap. In addition to sret, we can also allow the optimization to happen for arguments that have the new dereferenceable attribute, which gives the same guarantee. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5832 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219950 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 19:43:08 +00:00
Sanjay Patel	d8214db086	fold: sqrt(x * x * y) -> fabs(x) * sqrt(y) If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 18:48:17 +00:00
Juergen Ributzka	c40dab2069	[AArch64] Fix miscompile of sdiv-by-power-of-2. When the constant divisor was larger than 32bits, then the optimized code generated for the AArch64 backend would emit the wrong code, because the shift was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would loose the upper 32bits. This fixes rdar://problem/18678801. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219934 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 16:41:15 +00:00
Vasileios Kalintiris	3b72ec5083	[mips] Account for endianess when expanding BuildPairF64/ExtractElementF64 nodes. Summary: In order to support big endian targets for the BuildPairF64 nodes we just need to swap the low/high pair registers. Additionally, for the ExtractElementF64 nodes we have to calculate the correct stack offset with respect to the node's register/operand that we want to extract. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5753 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219931 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 15:41:51 +00:00
Vasileios Kalintiris	02065a65cd	[mips] Marked the DI/EI instruction aliases as MIPS32r2 Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5751 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219927 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 15:23:52 +00:00
Akira Hatanaka	4eb03123df	Reapply r219832 - InstCombine: Narrow switch instructions using known bits. The code committed in r219832 asserted when it attempted to shrink a switch statement whose type was larger than 64-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219902 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 06:00:46 +00:00
Saleem Abdulrasool	ebe6584c32	TRE: make TRE a bit more aggressive Make tail recursion elimination a bit more aggressive. This allows us to get tail recursion on functions that are just branches to a different function. The fact that the function takes a byval argument does not restrict it from being optimised into just a tail call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219899 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 03:27:30 +00:00
Akira Hatanaka	608d59f535	Revert r219832. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219884 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 01:17:02 +00:00
Sanjoy Das	a0b0184b33	Revert "r219834 - Teach ScalarEvolution to sharpen range information" This change breaks the asan buildbots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219878 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:46:04 +00:00
Hal Finkel	43141a0764	Preserve non-byval pointer alignment attributes using @llvm.assume when inlining For pointer-typed function arguments, enhanced alignment can be asserted using the 'align' attribute. When inlining, if this enhanced alignment information is not otherwise available, preserve it using @llvm.assume-based alignment assumptions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219876 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:44:41 +00:00
Adam Nemet	fb9d61a8d6	[AVX512] Add DQ subvector inserts In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4 respectively. These are matched by "Alt" Pat<>'s (Alt stands for alternative VTs). Since DQ has native support for these intructions, I peeled off the non-"Alt" part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are derived from this multiclass. The "Alt" Pat<>'s are disabled with DQ. Fixes <rdar://problem/18426089> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219874 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:17 +00:00
Adam Nemet	ec7f30662e	[AVX512] Add SKX testing to avx512-insert-extract.ll This is in preparation to adding DQ subvector inserts to this testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219873 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:14 +00:00
Adam Nemet	f9e3a3afa1	[AVX512] Fix test to produce a defined value We're inserting into a 8 wide vector, so the index should be < 8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219872 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:11 +00:00
Tom Stellard	d3fc10a525	R600/SI: Fix bug where immediates were being used in DS addr operands The SelectDS1Addr1Offset complex pattern always tries to store constant lds pointers in the offset operand and store a zero value in the addr operand. Since the addr operand does not accept immediates, the zero value needs to first be copied to a register. This newly created zero value will not go through normal instruction selection, so we need to manually insert a V_MOV_B32_e32 in the complex pattern. This bug was hidden by the fact that if there was another zero value in the DAG that had not been selected yet, then the CSE done by the DAG would use the unselected node for the addr operand rather than the one that was just created. This would lead to the zero value being selected and the DAG automatically inserting a V_MOV_B32_e32 instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219848 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 21:08:59 +00:00
Rafael Espindola	fc6e0f6f87	Allow forward references to section symbols. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219835 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 19:30:18 +00:00
Sanjoy Das	40edbf130e	Teach ScalarEvolution to sharpen range information. If x is known to have the range [a, b) in a loop predicated by (icmp ne x, a), its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. phabricator: http://reviews.llvm.org/D5639 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219834 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 19:25:28 +00:00
Akira Hatanaka	38537634e2	InstCombine: Narrow switch instructions using known bits. Truncate the operands of a switch instruction to a narrower type if the upper bits are known to be all ones or zeros. rdar://problem/17720004 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219832 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 19:05:50 +00:00
Juergen Ributzka	7440a83e60	Reapply "[FastISel][AArch64] Add custom lowering for GEPs." This is mostly a copy of the existing FastISel GEP code, but we have to duplicate it for AArch64, because otherwise we would bail out even for simple cases. This is because the standard fastEmit functions don't cover MUL at all and ADD is lowered very inefficientily. The original commit had a bug in the add emit logic, which has been fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219831 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 18:58:07 +00:00
Rafael Espindola	ad04f5db82	Correctly handle references to section symbols. When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219829 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 18:55:30 +00:00
Matt Arsenault	8b3a9205b7	R600/SI: Also try to use 0 base for misaligned 8-byte DS loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219823 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 18:06:43 +00:00
Matt Arsenault	7fdd553b66	R600: Fix miscompiles when BFE has multiple uses SimplifyDemandedBits would break the other uses of the operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219819 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 17:58:34 +00:00
Hal Finkel	6c15862fd3	[SLPVectorize] Basic ephemeral-value awareness The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions). Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219816 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 17:35:01 +00:00
Derek Schuff	279b5504a3	[MC] Make bundle alignment mode setting idempotent and support nested bundles Summary: Currently an error is thrown if bundle alignment mode is set more than once per module (either via the API or the .bundle_align_mode directive). This change allows setting it multiple times as long as the alignment doesn't change. Also nested bundle_lock groups are currently not allowed. This change allows them, with the effect that the group stays open until all nests are exited, and if any of the bundle_lock directives has the align_to_end flag, the group becomes align_to_end. These changes make the bundle aligment simpler to use in the compiler, and also better match the corresponding support in GNU as. Reviewers: jvoung, eliben Differential Revision: http://reviews.llvm.org/D5801 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 17:10:04 +00:00
Juergen Ributzka	0081070cfd	Revert "[FastISel][AArch64] Add custom lowering for GEPs." This breaks our internal build bots. Reverting it to get the bots green again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219776 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 04:55:48 +00:00

... 2 3 4 5 6 ...

26906 Commits