When outgoing function arguments are passed using push instructions, and EH
is enabled, we may need to indicate to the stack unwinder that the stack
pointer was adjusted before the call.
This should fix the exception handling issues in PR24792.
Differential Revision: http://reviews.llvm.org/D13132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249522 91177308-0d34-0410-b5e6-96231b3b80d8
Most importantly, this keeps constant hoisting from preventing instruction selection's ability to turn an AND with 0xffffffff into a move into a 32-bit subregister.
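As an illustration of the pattern in question (a sketch, not taken from the commit):

#include <cstdint>

// Masking an i64 with 0xffffffff should select to a single 32-bit register
// move (e.g. mov %edi, %eax), which implicitly zeroes the upper half, rather
// than materializing the 64-bit constant and emitting an AND.
uint64_t low32(uint64_t x) {
  return x & 0xffffffffULL;
}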
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249370 91177308-0d34-0410-b5e6-96231b3b80d8
The CATCHRET operand did not match the MachineFunction's CFG. This
mismatch happened because FrameLowering created a new MachineBasicBlock
and updated the CFG but forgot to update the CATCHRET operand.
Let's make sure this doesn't happen again by strengthening the funclet
membership analysis: it can now reason about the membership of all basic
blocks, not just those inside of funclets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249344 91177308-0d34-0410-b5e6-96231b3b80d8
The custom lowering in LowerExtendedLoad is doing the equivalent shuffle, so make use of existing lowering code to reduce duplication.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249243 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.
Example:
%1 = bitcast <4 x i32> %V to <2 x i64>
Before (-fast-isel -fast-isel-abort=1):
FastIsel miss: %1 = bitcast <4 x i32> %V to <2 x i64>
Now we don't fall back to SelectionDAG, and we correctly fold that computation
by propagating the register associated with %V.
Originally reviewed here: http://reviews.llvm.org/D13347
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249147 91177308-0d34-0410-b5e6-96231b3b80d8
r249121 caused a Clang test failure (avx2-builtins.c).
Revert r249121 while I investigate why that test failed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249124 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.
Example:
%1 = bitcast <4 x i32> %V to <2 x i64>
Before (-fast-isel -fast-isel-abort=1):
FastIsel miss: %1 = bitcast <4 x i32> %V to <2 x i64>
Now we don't fall back to SelectionDAG, and we correctly fold that computation
by propagating the register associated with %V.
Differential Revision: http://reviews.llvm.org/D13347
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249121 91177308-0d34-0410-b5e6-96231b3b80d8
We emit denormalized tables, where every range of invokes in the same
state gets a complete list of EH action entries. This is significantly
simpler than trying to infer the correct nested scoping structure from
the MI. Fortunately, for SEH, the nesting structure is really just a
size optimization.
With this, some basic __try / __except examples work.
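For reference, a minimal example of the kind that now works (purely illustrative; MSVC-style SEH compiled for Win64):

#include <windows.h>
#include <stdio.h>

int main() {
  __try {
    volatile int *p = 0;
    *p = 1;                                   // raises an access violation
  } __except (EXCEPTION_EXECUTE_HANDLER) {
    printf("caught access violation\n");      // reached via the emitted tables
  }
  return 0;
}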
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249078 91177308-0d34-0410-b5e6-96231b3b80d8
Catchret transfers control from a catch funclet to an earlier funclet.
However, it is not completely clear which funclet the catchret target is
part of. Make this clear by stapling the catchret target's funclet
membership onto the CATCHRET SDAG node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249052 91177308-0d34-0410-b5e6-96231b3b80d8
The custom code produces incorrect results if later reassociated.
Since r221657, on x86, vNi32 uitofp is lowered using an optimized
sequence:
movdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [65535, ...]
pand %xmm0, %xmm1
por LCPI0_1(%rip), %xmm1 ## [0x4b000000, ...]
psrld $16, %xmm0
por LCPI0_2(%rip), %xmm0 ## [0x53000000, ...]
addps LCPI0_3(%rip), %xmm0 ## [float -5.497642e+11, ...]
addps %xmm1, %xmm0
Since r240361, the machine combiner opportunistically reassociates
2-instruction sequences (with -ffast-math). In the new code sequence,
the ADDPS instructions are eligible. In isolation, for simple examples (without
reassociable users), this makes no performance difference (the goal
being to enable reassociation of longer chains).
In the trivial example (just one uitofp), the reassociation doesn't
happen, because (I think) it would require the emission of a separate
movaps for a constantpool load (instead of folding it into addps).
However, when we have multiple uitofp sequences, and the constantpool
loads are CSE'd earlier, the machine combiner can do the reassociation.
When the ADDPS instructions are reassociated, the resulting sequence is no
longer correct, as we'd be adding large (2**39) constants to comparatively
smaller values (~2**23). Given that two of the three inputs are powers
of 2 larger than 2**16, and that ulp(2**39) == 2**(39-24) == 2**15,
the reassociated chain will produce 0 for any input in [0, 2**14).
In my testing, it also produces wrong results for 99.5% of [0, 2**32).
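To make the rounding argument concrete, here is a scalar stand-in for one vector lane (a sketch of the arithmetic above, not the actual codegen):

#include <cstdint>
#include <cstdio>

int main() {
  uint32_t x = 1000;                                  // any input below 2**14
  float lo = 8388608.0f + (x & 0xFFFF);               // (x & 0xffff) | 0x4b000000: 2**23 + low half
  float hi = 549755813888.0f + (x >> 16) * 65536.0f;  // (x >> 16) | 0x53000000: 2**39 + high half * 2**16
  float magic = -549764202496.0f;                     // LCPI0_3: -(2**39 + 2**23)
  float correct      = (hi + magic) + lo;             // association produced by the lowering
  float reassociated = (hi + lo) + magic;             // association the combiner may pick
  printf("%f vs %f\n", correct, reassociated);        // prints 1000.000000 vs 0.000000:
  return 0;                                           // lo is absorbed because ulp(hi) >> lo
}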
Avoid this by disabling the new lowering when -ffast-math is enabled. It does
mean that we'll get slower code than without it, but at least we
won't get egregiously incorrect code.
One might argue that, considering -ffast-math is all but meaningless,
uitofp producing wrong results isn't a compiler bug. But it really is.
Fixes PR24512.
...though this is really more of a workaround.
Ideally, we'd have some sort of Machine FMF, but that's a problem
that's not worth tackling until we do more with machine IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248965 91177308-0d34-0410-b5e6-96231b3b80d8
The Win64 unwinder disassembles forwards from each PC to try to
determine if this PC is in an epilogue. If so, it skips calling the EH
personality function for that frame. Typically, this means you cannot
catch an exception in the same frame in which you threw it, because 'throw'
calls a noreturn runtime function.
Previously we avoided this problem with the TrapUnreachable
TargetOption, but that's a much bigger hammer than we need. All we need
is a 1-byte non-epilogue instruction right after the call. Instead,
what we got was an unconditional branch to a shared block containing the
ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which
added TrapUnreachable, and replaces it with something better.
The new code pattern matches for invoke/call followed by unreachable and
inserts an int3 into the DAG. To be 100% watertight, we would need to
insert SEH_Epilogue instructions into all basic blocks ending in a call
with no terminators or successors, but in practice this is unlikely to
come up.
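A hypothetical source-level illustration of the scenario (not from the commit):

// 'throw' lowers to a noreturn runtime call followed by 'unreachable'; the
// int3 emitted after the call keeps its return address inside the function
// body, so the unwinder does not mistake it for an epilogue.
int main() {
  try {
    throw 42;
  } catch (int) {
    return 0;  // without the padding byte this handler could be skipped on Win64
  }
}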
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248959 91177308-0d34-0410-b5e6-96231b3b80d8
The XOP shifts only have logical/arithmetic versions, and whether an element is shifted left or right is controlled by the sign of the per-element shift count. Because of this I've added new X86ISD nodes instead of trying to force them to use the existing shift nodes.
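A minimal scalar sketch of the per-element semantics (illustrative names, not the actual ISD lowering):

#include <cstdint>

// One logical-shift instruction covers both directions, selected by the sign
// of the per-element count; counts are assumed to stay within the element
// width in this sketch.
uint32_t xop_shl_lane(uint32_t value, int8_t count) {
  return count >= 0 ? value << count : value >> -count;
}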
Additionally, Excavator cores (bdver4) support both XOP and AVX2, meaning that they should use the AVX2 shifts when possible and fall back to XOP in other cases.
Differential Revision: http://reviews.llvm.org/D8690
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248878 91177308-0d34-0410-b5e6-96231b3b80d8
Previously local variable captures just didn't work in 64-bit. Now we
can access local variables more or less correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248857 91177308-0d34-0410-b5e6-96231b3b80d8
The x64 ABI requires that epilogues do not contain code other than stack
adjustments and some limited control flow. However, we'd insert the code
that materializes the return address after the stack adjustments. Instead,
load EAX/RAX with that value before we create the stack adjustments in the
epilogue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248839 91177308-0d34-0410-b5e6-96231b3b80d8
The HHVM calling convention, hhvmcc, is used by the HHVM JIT for
functions in the translation cache. We currently use the LLVM back end to
generate code for X86-64 and may support other architectures in the
future.
In the HHVM calling convention any GP register can be used to pass and
return values, with the exception of R12, which is reserved for the
thread-local area and is callee-saved. Aside from R12, we always pass RBX
and RBP as arguments, which are our virtual machine's stack pointer and
frame pointer, respectively.
When we enter the translation cache via an hhvmcc function, we expect the
stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to the
standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.
One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to the standard C calling
convention, with the exception of the first argument, which is passed in
RBP (before RDI, RSI, etc. are used).
Differential Revision: http://reviews.llvm.org/D12681
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248832 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Funclets have been turned into functions by the time they hit the object
file. Make sure that they have decent names for the symbol table and
CFI directives explaining how to reason about their prologues.
Differential Revision: http://reviews.llvm.org/D13261
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248824 91177308-0d34-0410-b5e6-96231b3b80d8
Arguments spilled on the stack before a function call may have
alignment requirements, for example in the case of vectors.
These requirements are exploited by the code generator by using
move instructions that have similar alignment requirements, e.g.,
movaps on x86.
Although the code generator properly aligns the arguments with
respect to the displacement of the stack pointer it computes,
the displacement itself may cause misalignment. For example if
we have
%3 = load <16 x float>, <16 x float>* %1, align 64
call void @bar(<16 x float> %3, i32 0)
the x86 back-end emits:
movaps 32(%ecx), %xmm2
movaps (%ecx), %xmm0
movaps 16(%ecx), %xmm1
movaps 48(%ecx), %xmm3
subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards
movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such.
movl $0, 16(%esp)
calll __bar
To solve this, we need to make sure that the computed value with which
the stack pointer is changed is a multiple of the maximal alignment seen
during its computation. With this change we get proper alignment:
subl $32, %esp
movaps %xmm3, (%esp)
Differential Revision: http://reviews.llvm.org/D12337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248786 91177308-0d34-0410-b5e6-96231b3b80d8
Fix for D12561 - we weren't correctly ensuring that the base element for extension was moved to start on a boundary suitable for UNPCKL/H
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248536 91177308-0d34-0410-b5e6-96231b3b80d8
Add two new ways of accessing the unsafe stack pointer:
* At a fixed offset from the thread TLS base. This is very similar to
StackProtector cookies, but we plan to extend it to other backends
(ARM in particular) soon. Bionic-side implementation here:
https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
not emutls).
This is a re-commit of a change in r248357 that was reverted in
r248358.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248405 91177308-0d34-0410-b5e6-96231b3b80d8
The BEXTR comments didn't make sense before, we may want to extend the
FP logic transform to work on vectors, and this way is more beautiful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248404 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path, and updates the relevant tests to sign extend a subvector instead.
LLVM counterpart to D12835
Differential Revision: http://reviews.llvm.org/D13002
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248368 91177308-0d34-0410-b5e6-96231b3b80d8
Add two new ways of accessing the unsafe stack pointer:
* At a fixed offset from the thread TLS base. This is very similar to
StackProtector cookies, but we plan to extend it to other backends
(ARM in particular) soon. Bionic-side implementation here:
https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
not emutls).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248357 91177308-0d34-0410-b5e6-96231b3b80d8
The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions *did* raise inexact, and the LLVM lowerings followed that convention. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior. This also lets us fold some logic into TD patterns, which is nice.
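Assuming the functions in question are the ceil/floor/trunc family of rounding operations, the x86 side presumably relies on the exception-suppression bit of ROUNDSS/ROUNDSD; a sketch with intrinsics, not the exact lowering:

#include <smmintrin.h>

double trunc_no_inexact(double x) {
  __m128d v = _mm_set_sd(x);
  // _MM_FROUND_NO_EXC suppresses the precision (inexact) exception.
  v = _mm_round_sd(v, v, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
  return _mm_cvtsd_f64(v);
}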
Differential Revision: http://reviews.llvm.org/D12969
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248266 91177308-0d34-0410-b5e6-96231b3b80d8
This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128-bits or at the start of any higher 128-bit lane.
The motivation was to reduce the number of high-cost pshufb calls, but it also improves the SSE2 case.
Differential Revision: http://reviews.llvm.org/D12561
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248250 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1))
Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x))
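Both identities are easy to check in scalar form (a sketch of the patterns themselves, not the vector lowering):

#include <cstdint>

// cttz(x) = ctpop((x & -x) - 1): isolating the lowest set bit and subtracting
// one leaves exactly cttz(x) one-bits behind; x == 0 yields 32.
unsigned cttz_via_ctpop(uint32_t x) {
  return __builtin_popcount((x & -x) - 1);
}

// cttz_undef(x) = (width - 1) - ctlz(x & -x): only valid for x != 0.
unsigned cttz_undef_via_ctlz(uint32_t x) {
  return 31 - __builtin_clz(x & -x);
}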
Differential Revision: http://reviews.llvm.org/D12663
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248091 91177308-0d34-0410-b5e6-96231b3b80d8
This makes catchret look more like a branch, and less like a weird use
of BlockAddress. It also lets us get away from
llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label
arithmetic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247936 91177308-0d34-0410-b5e6-96231b3b80d8
Clang now passes the adjectives as an argument to catchpad.
Getting the CatchObj working is simply a matter of threading another
static alloca through codegen, first as an alloca, then as a frame
index, and finally as a frame offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247844 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise we'd try to emit the thunk that passes the LSDA to
__CxxFrameHandler3. We don't emit the LSDA if there were no landingpads,
so we'd end up with an assembler error when trying to write the COFF
object.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247820 91177308-0d34-0410-b5e6-96231b3b80d8
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing,
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests:
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.
This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.
This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.
Differential Revision: http://reviews.llvm.org/D12095
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247815 91177308-0d34-0410-b5e6-96231b3b80d8
This is the mirror image of r242395.
When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that
deallocates stack space used for local allocations, it assumes that any sequence of pop
instructions from function exit backwards consists purely of restoring callee-save registers.
This may be false, since from some point backward, the pops may be clean-up of stack space
allocated for arguments to a call.
Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D12688
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247784 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is the first patch in the series to migrate Triples (which are ambiguous)
to TargetTuples (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ APIs have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247692 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is the first patch in the series to migrate Triples (which are ambiguous)
to TargetTuples (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ APIs have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247683 91177308-0d34-0410-b5e6-96231b3b80d8
This is to reduce noise in a following commit.
Also fixes a couple missing spaces before the reference operator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247679 91177308-0d34-0410-b5e6-96231b3b80d8
The changes in:
test/CodeGen/X86/machine-cp.ll
are just due to scheduling differences after some logic instructions were reassociated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247516 91177308-0d34-0410-b5e6-96231b3b80d8
Renamed to lowerVectorShuffleAsPermuteAndUnpack to make it clear that it lowers to more than just a UNPCK instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247513 91177308-0d34-0410-b5e6-96231b3b80d8
Use a function attribute to indicate whether stack realignment should be forced.
With this commit, we can now force stack realignment when doing LTO and
do so on a per-function basis. Also, add a new cl::opt option
"stackrealign" to CommandFlags.h which is used to force stack
realignment via llc's command line.
Out-of-tree projects currently using -force-align-stack to force stack
realignment should make changes to attach the attribute to the functions
in the IR.
Differential Revision: http://reviews.llvm.org/D11814
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247450 91177308-0d34-0410-b5e6-96231b3b80d8
The Win32 EH runtime caller does not preserve EBP, even though it does
preserve the CSRs (EBX, ESI, EDI) for us. The result was that each
finally funclet call would leave the frame pointer off by 12 bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247348 91177308-0d34-0410-b5e6-96231b3b80d8
Except the changes that defined virtual destructors as =default, because that
ran into problems with GCC 4.7 and overriding methods that weren't noexcept.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247298 91177308-0d34-0410-b5e6-96231b3b80d8
All of the complexity is in cleanupret, and it mostly follows the same
codepaths as catchret, except it doesn't take a return value in RAX.
This small example now compiles and executes successfully on win32:
extern "C" int printf(const char *, ...) noexcept;
struct Dtor {
~Dtor() { printf("~Dtor\n"); }
};
void has_cleanup() {
Dtor o;
throw 42;
}
int main() {
try {
has_cleanup();
} catch (int) {
printf("caught it\n");
}
}
Don't try to put the cleanup in the same function as the catch, or Bad
Things will happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247219 91177308-0d34-0410-b5e6-96231b3b80d8
The 32-bit tables don't actually contain PC range data, so emitting them
is incredibly simple.
The 64-bit tables, on the other hand, use the same table for state
numbering as well as label ranges. This makes things more difficult, so
it will be implemented later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247192 91177308-0d34-0410-b5e6-96231b3b80d8
With subregister liveness enabled we can detect the case where only
parts of a register are live in; this is expressed as a 32-bit lanemask.
The live-in list used to hold only plain registers, so we enumerated all
subregisters affected by the lanemask. This turned out to be too
conservative, as a subregister may also cover additional parts of the
lanemask which are not live. Expressing a given lanemask by enumerating a
minimum set of subregisters is computationally expensive, so the best
solution is to simply store the lanemasks in the live-in list as well.
This will reduce memory usage for targets using subregister liveness and
slightly increase it for other targets.
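A rough sketch of the data-structure change (field and type names here are illustrative, not necessarily the exact ones used in the patch):

#include <cstdint>
#include <vector>

typedef uint32_t LaneBitmask;   // the 32-bit lanemask described above

struct LiveInReg {
  unsigned PhysReg;             // live-in physical register
  LaneBitmask LaneMask;         // which lanes of PhysReg are actually live
};

// Per-basic-block live-in list: registers plus lanemasks, instead of an
// expanded list of subregisters.
std::vector<LiveInReg> LiveIns;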
Differential Revision: http://reviews.llvm.org/D12442
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247171 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
32-bit funclets have short prologues that allocate enough stack for the
largest call in the whole function. The runtime saves CSRs for the
funclet. It doesn't restore CSRs after we finally transfer control back
to the parent function via a CATCHRET, but that's a separate issue.
32-bit funclets also have to adjust the incoming EBP value, which is
what llvm.x86.seh.recoverframe does in the old model.
64-bit funclets need to spill CSRs as normal. For simplicity, this just
spills the same set of CSRs as the parent function, rather than trying
to compute different CSR sets for the parent function and each funclet.
64-bit funclets also allocate enough stack space for the largest
outgoing call frame, like 32-bit.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247092 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch modifies X86TargetLowering::LowerVASTART so that
struct va_list is initialized with 32-bit pointers in x32. It also
includes tests that call @llvm.va_start() for x32.
Patch by João Porto
Subscribers: llvm-commits, hjl.tools
Differential Revision: http://reviews.llvm.org/D12346
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247069 91177308-0d34-0410-b5e6-96231b3b80d8
The old implementation assumed LP64, which is broken for x32. Specifically,
MOV8rm_NOREX and MOV8mr_NOREX, when selected, would cause a 'Cannot emit
physreg copy instruction' error to be reported.
This patch also enables the h-register*.ll tests for x32.
Differential Revision: http://reviews.llvm.org/D12336
Patch by João Porto
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247058 91177308-0d34-0410-b5e6-96231b3b80d8
This prevents MC clients from getting COFF.h, which conflicts with
winnt.h macros. Also a minor IWYU cleanup. Now the only public headers
including COFF.h are in Object, and they actually need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246784 91177308-0d34-0410-b5e6-96231b3b80d8
We used to accept (and even test, and generate) 16-byte alignment
for 32-byte nontemporal stores, but they require 32-byte alignment,
per SDM. Found by inspection.
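For reference, the user-facing requirement (an illustrative intrinsics example, not part of the patch): the pointer given to a 32-byte nontemporal store must itself be 32-byte aligned.

#include <immintrin.h>

void stream_store(float *dst, __m256 v) {
  // 'dst' must be 32-byte aligned; 16-byte alignment is not enough for
  // VMOVNTPS with a YMM operand.
  _mm256_stream_ps(dst, v);
}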
Instead of hardcoding 16 in the patfrag, check for natural alignment.
Also fix the autoupgrade and the various tests.
Also, use explicit -mattr instead of -mcpu: I stared at the output
for several minutes wondering why I got 2x movntps for the unaligned
case (which is the ideal output, but needs some work: see FIXME),
until I remembered corei7-avx implies +slow-unaligned-mem-32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246733 91177308-0d34-0410-b5e6-96231b3b80d8
We can chain other fragments to avoid repeating conditions.
This also fixes a potential bug (that realistically can't happen),
where we would match indexed nontemporal stores for i32/i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246719 91177308-0d34-0410-b5e6-96231b3b80d8
X86FastISel has been using the wrong register class for VBLENDVPS which
produces a VR128 and needs an extra copy to the target register. The
problem was already hit by the existing test cases when using
> llvm-lit -Dllc="llc -verify-machineinstrs"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246461 91177308-0d34-0410-b5e6-96231b3b80d8
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246376 91177308-0d34-0410-b5e6-96231b3b80d8
We can now run 32-bit programs with empty catch bodies. The next step
is to change PEI so that we get funclet prologues and epilogues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246235 91177308-0d34-0410-b5e6-96231b3b80d8
A corresponding clang change will make it so that clang can consume part
of an assembler token. The assembler treats '.' as an identifier
character while clang does not, so its view of the token stream is a
little different.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246089 91177308-0d34-0410-b5e6-96231b3b80d8
This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC.
Patch by: Kevin B. Smith
Differential Revision: http://reviews.llvm.org/D12371
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246073 91177308-0d34-0410-b5e6-96231b3b80d8
This is a fix for disassembling unusual instruction sequences in 64-bit
mode w.r.t. the CALL rel16 instruction. It might be desirable to move the
check somewhere else, but it essentially mimics the special case
handling with JCXZ in 16-bit mode.
The current behavior accepts the operand-size prefix and causes the
call's immediate to stop disassembling after 2 bytes.
sequences of instructions with this pattern, the disassembler output
becomes extremely unreliable and essentially useless (if you jump midway
into what lldb thinks is a unified instruction, you'll lose %rip). So we
ignore the prefix and consume all 4 bytes when disassembling a 64-bit
mode binary.
Note: in Vol. 2A 3-99 the Intel spec states that CALL rel16 is N.S. N.S.
is defined as:
Indicates an instruction syntax that requires an address override
prefix in 64-bit mode and is not supported. Using an address
override prefix in 64-bit mode may result in model-specific
execution behavior. (Vol. 2A 3-7)
Since 0x66 is an operand override prefix we should be OK (although we
may want to warn about 0x67 prefixes to 0xe8). On the CPUs I tested
with, they all ignore the 0x66 prefix in 64-bit mode.
Patch by Matthew Barney!
Differential Revision: http://reviews.llvm.org/D9573
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246038 91177308-0d34-0410-b5e6-96231b3b80d8
This should be no functional change but for the record: For three cases
in X86FastISel this will change the order in which the FalseMBB and
TrueMBB of a conditional branch are added to the successor/predecessor
lists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245997 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change makes the variable argument intrinsics, `llvm.va_start` and
`llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows
inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch
I have to add support for GCC's `__builtin_ms_va_list` constructs.
Reviewers: nadav, asl, eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1622
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245990 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow-on from the discussion in http://reviews.llvm.org/D12154.
This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.
A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx
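A hypothetical foo.c along those lines (contents illustrative):

#include <string.h>

void zero_block(char *p) {
  // With -mavx and generally fast unaligned memory ops, this expansion can
  // now use 32-byte vector stores instead of smaller SSE/scalar stores.
  memset(p, 0, 64);
}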
This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449
Before this patch, we used different store types depending on whether the example could be
lowered as a memset or not.
Differential Revision: http://reviews.llvm.org/D12288
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245950 91177308-0d34-0410-b5e6-96231b3b80d8
As of r245924, _ftol2 is no longer used for fptoui on MS platforms.
Remove the dead code associated with it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245925 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes two issues in x86 fptoui lowering.
1) Makes conversions from f80 go through the right path on AVX-512.
2) Implements an inline sequence for fptoui i64 instead of a library
call. This improves performance by 6X on SSE3+ and 3X otherwise.
Incidentally, it also removes the use of ftol2 for fptoui, which was
wrong to begin with, as ftol2 converts to a signed i64, producing
wrong results for values >= 2^63.
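A scalar sketch of the usual inline-expansion idea (an assumption about the shape of the sequence, not the exact DAG emitted by the patch):

#include <cstdint>

uint64_t fptoui64(double x) {
  const double two63 = 9223372036854775808.0;        // 2^63
  if (x < two63)                                     // in-range values use the signed path
    return (uint64_t)(int64_t)x;
  // Rebase by 2^63, convert on the signed path, then add 2^63 back. A helper
  // like _ftol2 that only produces a signed i64 gets this case wrong.
  return (uint64_t)(int64_t)(x - two63) + (1ULL << 63);
}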
Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D11316
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245924 91177308-0d34-0410-b5e6-96231b3b80d8
This is a 'no functional change intended' patch. It removes one FIXME, but adds several more.
Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any
sized misaligned memory access under 32 bytes is 'fast'. From the added FIXME comments, however,
you can see that we're not consistent about this. Changing the name of the attribute makes it
clearer to see the logic holes.
Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute
to new chips; fast unaligned accesses have been standard for several generations of CPUs now.
Differential Revision: http://reviews.llvm.org/D12154
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245729 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR23464: one way to use the broadcast intrinsics is:
_mm256_broadcastw_epi16(_mm_cvtsi32_si128(*(int*)src));
We don't currently fold this, but now that we use native IR for
the intrinsics (r245605), we can look through one bitcast to find
the broadcast scalar.
Differential Revision: http://reviews.llvm.org/D10557
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245613 91177308-0d34-0410-b5e6-96231b3b80d8
Since r245605, the clang headers don't use these anymore.
r245165 updated some of the tests already; update the others, add
an autoupgrade, remove the intrinsics, and cleanup the definitions.
Differential Revision: http://reviews.llvm.org/D10555
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245606 91177308-0d34-0410-b5e6-96231b3b80d8
We didn't check for the necessary preconditions before folding a
mask/shift into a single mask.
This fixes PR24516.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245544 91177308-0d34-0410-b5e6-96231b3b80d8
We don't do a great job with >= 0 comparisons against zero when the
result is used as an i8.
Given something like:
void f(long long LL, bool *B) {
*B = LL >= 0;
}
We used to generate:
shrq $63, %rdi
xorb $1, %dil
movb %dil, (%rsi)
Now we generate:
testq %rdi, %rdi
setns (%rsi)
Differential Revision: http://reviews.llvm.org/D12136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245498 91177308-0d34-0410-b5e6-96231b3b80d8
Reintroduce r245442. Remove an overly conservative assertion introduced
in r245442. We could replace the assertion with one that uses
`shareSameRegisterFile` instead, but at that point in `insertPHI` we have
already lost the original Def subreg to check against. So drop the
assertion completely.
Original commit message:
- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to look up multiple sources
returned by the ValueTracker and rewrite PHIs with new sources.
With these changes we can find more register sources and rewrite more
copies to allow coalescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target-specific instructions. The
x86 example follows:
A:
psllq %mm1, %mm0
movd %mm0, %r9
jmp C
B:
por %mm1, %mm0
movd %mm0, %r9
jmp C
C:
movd %r9, %mm0
pshufw $238, %mm0, %mm0
Becomes:
A:
psllq %mm1, %mm0
jmp C
B:
por %mm1, %mm0
jmp C
C:
pshufw $238, %mm0, %mm0
Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245479 91177308-0d34-0410-b5e6-96231b3b80d8
This patch updates the X86 lowering so that the Exception Pointer and Selector
are 64 bits wide only if Subtarget.isTarget64BitLP64.
Patch by João Porto
Reviewers: dschuff, rnk
Differential Revision: http://reviews.llvm.org/D12111
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245454 91177308-0d34-0410-b5e6-96231b3b80d8