Commit Graph

12265 Commits

Author SHA1 Message Date
Michael Kuperstein
6d30abd62e [X86] Emit .cfi_escape GNU_ARGS_SIZE when adjusting the stack before calls
When outgoing function arguments are passed using push instructions, and EH
is enabled, we may need to indicate to the stack unwinder that the stack
pointer was adjusted before the call.

This should fix the exception handling issues in PR24792.

Differential Revision: http://reviews.llvm.org/D13132

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 07:01:31 +00:00
Igor Breger
2fe1f6e4f3 AVX512: Change encoding of vpshuflw and vpshufhw instructions. Implement WIG as W0 and not W1, like all other instruction have been implemented.
Add encoding tests.

Differential Revision: http://reviews.llvm.org/D13471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249521 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 06:31:18 +00:00
Hans Wennborg
4d651e440b Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D13321

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249482 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 23:24:35 +00:00
Joseph Tremoulet
8136c73a75 [WinEH] Recognize CoreCLR personality function
Summary:
 - Add CoreCLR to if/else ladders and switches as appropriate.
 - Rename isMSVCEHPersonality to isFuncletEHPersonality to better
   reflect what it captures.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D13449

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 20:28:16 +00:00
Craig Topper
50bf4331f4 [X86] Teach constant hoisting that ANDs with 64-bit immediates in the range 0x80000000-0xffffffff can be handled cheaply and don't need to be hoisted.
Most importantly, this keeps constant hoisting from preventing instruction selections ability to turn an AND with 0xffffffff into a move into a 32-bit subregister.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249370 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 02:50:24 +00:00
Craig Topper
a1a1f2a090 [X86] Remove unnecessary AddComplexity directive. The instruction is already wrapped in the equivalent earlier. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249369 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 02:50:21 +00:00
David Majnemer
6212b4d0b8 [WinEH] Update CATCHRET's operand to match its successor
The CATCHRET operand did not match the MachineFunction's CFG.  This
mismatch happened because FrameLowering created a new MachineBasicBlock
and updated the CFG but forgot to update the CATCHRET operand.

Let's make sure this doesn't happen again by strengthing the funclet
membership analysis: it can now reason about the membership of all basic
blocks, not just those inside of funclets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249344 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-05 20:09:16 +00:00
Igor Breger
295b19789d AVX512: Implemented encoding and intrinsics for VPERMILPS/PD instructions.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12690

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-04 07:20:41 +00:00
Simon Pilgrim
3e07e8cf77 [X86] Lower SEXTLOAD using SIGN_EXTEND_VECTOR_INREG. NCI.
The custom lowering in LowerExtendedLoad is doing the equivalent shuffle, so make use of existing lowering code to reduce duplication.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249243 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-03 18:55:43 +00:00
Andrea Di Biagio
93ccc33e98 Reapply r249121 : "[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types."
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.

Example:

  %1 = bitcast <4 x i31> %V to <2 x i64>

Before (-fast-isel -fast-isel-abort=1):

  FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64>

Now we don't fall back to SelectionDAG and we correctly fold that computation
propagating the register associated to %V.

Originally reviewed here: http://reviews.llvm.org/D13347


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249147 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-02 16:08:05 +00:00
Andrea Di Biagio
053f844194 Revert: [FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types.
r249121 caused a Clang test failure (avx2-buitins.c).
Revert r249121 while I keep investigating on the reason why that test failed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249124 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-02 13:06:19 +00:00
Andrea Di Biagio
f008b99120 [FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types.
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.

Example:

  %1 = bitcast <4 x i31> %V to <2 x i64>

Before (-fast-isel -fast-isel-abort=1):

  FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64>

Now we don't fall back to SelectionDAG and we correctly fold that computation
propagating the register associated to %V.

Differential Revision: http://reviews.llvm.org/D13347


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249121 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-02 12:45:37 +00:00
Reid Kleckner
646073b30f [WinEH] Emit __C_specific_handler tables for the new IR
We emit denormalized tables, where every range of invokes in the same
state gets a complete list of EH action entries. This is significantly
simpler than trying to infer the correct nested scoping structure from
the MI. Fortunately, for SEH, the nesting structure is really just a
size optimization.

With this, some basic __try / __except examples work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249078 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 21:38:24 +00:00
David Majnemer
68754da26d [WinEH] Make FuncletLayout more robust against catchret
Catchret transfers control from a catch funclet to an earlier funclet.
However, it is not completely clear which funclet the catchret target is
part of.  Make this clear by stapling the catchret target's funclet
membership onto the CATCHRET SDAG node.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249052 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 18:44:59 +00:00
NAKAMURA Takumi
c3f6054a2c Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"
It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249032 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 17:00:56 +00:00
Ahmed Bougacha
c97ac42320 [X86] Don't custom-lower vNi32 uint_to_fp when unsafe-fp-math.
The custom code produces incorrect results if later reassociated.

Since r221657, on x86, vNi32 uitofp is lowered using an optimized
sequence:

  movdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [65535, ...]
  pand %xmm0, %xmm1
  por LCPI0_1(%rip), %xmm1 ## [0x4b000000, ...]
  psrld $16, %xmm0
  por LCPI0_2(%rip), %xmm0 ## [0x53000000, ...]
  addps LCPI0_3(%rip), %xmm0 ## [float -5.497642e+11, ...]
  addps %xmm1, %xmm0

Since r240361, the machine combiner opportunistically reassociates
2-instruction sequences (with -ffast-math). In the new code sequence,
the ADDPS' are eligible. In isolation, for simple examples (without
reassociable users), this makes no performance difference (the goal
being to enable reassociation of longer chains).

In the trivial example (just one uitofp), the reassociation doesn't
happen, because (I think) it would require the emission of a separate
movaps for a constantpool load (instead of folding it into addps).

However, when we have multiple uitofp sequences, and the constantpool
loads are CSE'd earlier, the machine combiner can do the reassociation.

When the ADDPS' are reassociated, the resulting sequence isn't correct
anymore, as we'd be adding large (2**39) constants with comparatively
smaller values (~2**23). Given that two of the three inputs are powers
of 2 larger than 2**16, and that ulp(2**39) == 2**(39-24) == 2**15,
the reassociated chain will produce 0 for any input in [0, 2**14[.
In my testing, it also produces wrong results for 99.5% of [0, 2**32[.

Avoid this by disabling the new lowering when -ffast-math. It does
mean that we'll get slower code than without it, but at least we
won't get egregiously incorrect code.

One might argue that, considering -ffast-math is all but meaningless,
uitofp producing wrong results isn't a compiler bug. But it really is.

Fixes PR24512.

...though this is really more of a workaround.
Ideally, we'd have some sort of Machine FMF, but that's a problem
that's not worth tackling until we do more with machine IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248965 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 00:11:07 +00:00
Reid Kleckner
39b3d70179 [WinEH] Emit int3 after noreturn calls on Win64
The Win64 unwinder disassembles forwards from each PC to try to
determine if this PC is in an epilogue. If so, it skips calling the EH
personality function for that frame. Typically, this means you cannot
catch an exception in the same frame that you threw it, because 'throw'
calls a noreturn runtime function.

Previously we avoided this problem with the TrapUnreachable
TargetOption, but that's a much bigger hammer than we need. All we need
is a 1 byte non-epilogue instruction right after the call.  Instead,
what we got was an unconditional branch to a shared block containing the
ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which
added TrapUnreachable, and replaces it with something better.

The new code pattern matches for invoke/call followed by unreachable and
inserts an int3 into the DAG. To be 100% watertight, we would need to
insert SEH_Epilogue instructions into all basic blocks ending in a call
with no terminators or successors, but in practice this is unlikely to
come up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248959 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 23:09:23 +00:00
Sanjay Patel
4f55287fa6 [x86] enable machine combiner reassociations for 256-bit vector logical integer insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248955 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 22:25:55 +00:00
David Blaikie
892946c873 Fix -Wsign-compare warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248942 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 20:37:48 +00:00
Simon Pilgrim
4e042482c8 [X86][XOP] Added support for the lowering of 128-bit vector shifts to XOP shift instructions
The XOP shifts just have logical/arithmetic versions and the left/right shifts are controlled by whether the value is positive/negative. Because of this I've added new X86ISD nodes instead of trying to force them to use the existing shift nodes.

Additionally Excavator cores (bdver4) support XOP and AVX2 - meaning that it should use the AVX2 shifts when it can and fall back to XOP in other cases.

Differential Revision: http://reviews.llvm.org/D8690

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248878 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 08:17:50 +00:00
Reid Kleckner
a4778f21f0 [WinEH] Setup RBP correctly in Win64 funclet prologues
Previously local variable captures just didn't work in 64-bit. Now we
can access local variables more or less correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248857 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 23:32:01 +00:00
David Majnemer
28c61f88cc [WinEH] Ensure that funclets obey the x64 ABI
The x64 ABI requires that epilogues do not contain code other than stack
adjustments and some limited control flow.  However, we'd insert code to
initialize the return address after stack adjustments.  Instead, insert
EAX/RAX with the current value before we create the stack adjustments in
the epilogue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248839 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 22:33:36 +00:00
Maksim Panchenko
3b3752c898 HHVM calling conventions.
HHVM calling convention, hhvmcc, is used by HHVM JIT for
functions in translated cache. We currently support LLVM back end to
generate code for X86-64 and may support other architectures in the
future.

In HHVM calling convention any GP register could be used to pass and
return values, with the exception of R12 which is reserved for
thread-local area and is callee-saved. Other than R12, we always
pass RBX and RBP as args, which are our virtual machine's stack pointer
and frame pointer respectively.

When we enter translation cache via hhvmcc function, we expect
the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed
to standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.

One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to standard C calling
convention with an exception of first argument which is passed in RBP
(before we use RDI, RSI, etc.)

Differential Revision: http://reviews.llvm.org/D12681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248832 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 22:09:16 +00:00
David Majnemer
0ba86faae2 [WinEH] Teach AsmPrinter about funclets
Summary:
Funclets have been turned into functions by the time they hit the object
file.  Make sure that they have decent names for the symbol table and
CFI directives explaining how to reason about their prologues.

Differential Revision: http://reviews.llvm.org/D13261

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248824 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 20:12:33 +00:00
Jeroen Ketema
f5f9e9a3bf Arguments spilled on the stack before a function call may have
alignment requirements, for example in the case of vectors.
These requirements are exploited by the code generator by using
move instructions that have similar alignment requirements, e.g.,
movaps on x86.

Although the code generator properly aligns the arguments with
respect to the displacement of the stack pointer it computes,
the displacement itself may cause misalignment. For example if
we have

%3 = load <16 x float>, <16 x float>* %1, align 64
call void @bar(<16 x float> %3, i32 0)

the x86 back-end emits:

movaps  32(%ecx), %xmm2
movaps  (%ecx), %xmm0
movaps  16(%ecx), %xmm1
movaps  48(%ecx), %xmm3
subl    $20, %esp       <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards 
movaps  %xmm3, (%esp)   <-- movaps requires 16-byte alignment, while %esp is not aligned as such.
movl    $0, 16(%esp)
calll   __bar

To solve this, we need to make sure that the computed value with which
the stack pointer is changed is a multiple af the maximal alignment seen
during its computation. With this change we get proper alignment:

subl    $32, %esp
movaps  %xmm3, (%esp)

Differential Revision: http://reviews.llvm.org/D12337


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 10:12:57 +00:00
NAKAMURA Takumi
763aee2082 [CMake] X86AsmParser: Prune redundant LINK_LIBS.
It is described in LLVMBuild.txt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248771 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 01:25:01 +00:00
Sanjay Patel
5a4293ab23 add a FIXME for a CPU model check that should have an attribute instead
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248746 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 22:00:24 +00:00
Andrew Kaylor
aac3c943f3 Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)

Differential Revision: http://reviews.llvm.org/D11370



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:33:22 +00:00
Yaron Keren
164cc1e735 Silence clang warning: variable ‘Status’ set but not used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248691 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-27 21:31:33 +00:00
Simon Pilgrim
5769b61791 [X86][SSE2] Fix zero/any extension shuffles that don't start from the first element
Fix for D12561 - we weren't correctly ensuring that the base element for extension was moved to start on a boundary suitable for UNPCKL/H

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248536 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 21:02:17 +00:00
Sanjay Patel
ffafbd3b2a [x86] replace integer 'xor' ops with packed SSE FP 'xor' ops when operating on FP scalars
Turn this:

movd %xmm0, %eax
movd %xmm1, %ecx
xorl %eax, %ecx
movd %ecx, %xmm0

into this:

xorps %xmm1, %xmm0

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

This is an extension of:
http://reviews.llvm.org/rL248395


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 18:33:42 +00:00
Sanjay Patel
dfa2ab03d4 [x86] replace integer 'or' ops with packed SSE FP 'or' ops when operating on FP scalars
Turn this:

movd %xmm0, %eax
movd %xmm1, %ecx
orl %eax, %ecx
movd %ecx, %xmm0

into this:

orps %xmm1, %xmm0

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

This is an extension of:
http://reviews.llvm.org/rL248395



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248409 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 18:19:07 +00:00
Evgeniy Stepanov
d4052cf84c Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

This is a re-commit of a change in r248357 that was reverted in
r248358.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248405 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 18:07:56 +00:00
Sanjay Patel
5e2a635f0e move call to convertIntLogicToFPLogic up; NFCI
The BEXTR comments didn't make sense before, we may want to extend the
FP logic transform to work on vectors, and this way is more beautiful.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248404 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 18:03:37 +00:00
Sanjay Patel
46a7838d96 [x86] move code for converting int logic to FP logic to a helper function; NFCI
This is a follow-on to:
http://reviews.llvm.org/rL248395

so we can add the call to the or/xor combines too.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 17:39:41 +00:00
Sanjay Patel
1813647111 [x86] replace integer 'and' ops with packed SSE FP 'and' ops when operating on FP scalars
Turn this:
   movd %xmm0, %eax
   movd %xmm1, %ecx
   andl %eax, %ecx
   movd %ecx, %xmm0

into this:
   andps %xmm1, %xmm0


This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

Differential Revision: http://reviews.llvm.org/D13065



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248395 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 17:00:06 +00:00
Simon Pilgrim
e0a23dddf0 [X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR
This patches removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead.

LLVM counterpart to D12835

Differential Revision: http://reviews.llvm.org/D13002

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248368 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 08:48:33 +00:00
Evgeniy Stepanov
1be7ea773a Revert "Android support for SafeStack."
test/Transforms/SafeStack/abi.ll breaks when target is not supported;
needs refactoring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248358 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 01:23:22 +00:00
Evgeniy Stepanov
c7b6dc0535 Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-23 01:03:51 +00:00
Stephen Canon
eb1a6c5669 Don't raise inexact when lowering ceil, floor, round, trunc.
The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions *did* raise inexact, and the llvm lowerings followed that conventions. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior.  This also lets us fold some logic into TD patterns, which is nice.

Differential Revision: http://reviews.llvm.org/D12969


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248266 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:43:17 +00:00
NAKAMURA Takumi
09c0ea51ca Untabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248264 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:15:07 +00:00
NAKAMURA Takumi
c36e746e98 Reformat blank lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248263 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:14:39 +00:00
NAKAMURA Takumi
6902c8db26 Reformat comment lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248262 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:14:12 +00:00
NAKAMURA Takumi
d4cdf1962b Reformat.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:13:55 +00:00
Simon Pilgrim
de8d7c41ca [X86][SSE] Match zero/any extension shuffles that don't start from the first element
This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128-bits or at the start of any higher 128-bit lane.

The motivation was to reduce the number of high cost pshufb calls, but it also improves the SSE2 case as well.

Differential Revision: http://reviews.llvm.org/D12561


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 08:16:08 +00:00
Chad Rosier
c5d4530d42 [Machine Combiner] Refactor machine reassociation code to be target-independent.
No functional change intended.
Patch by Haicheng Wu <haicheng@codeaurora.org>!

http://reviews.llvm.org/D12887
PR24522

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 15:09:11 +00:00
Asaf Badouh
648a027c82 [X86][AVX512] add masked version for RSQRT14 & RCP14 Scalar FP
Differential Revision: http://reviews.llvm.org/D12524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248147 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 10:23:53 +00:00
Igor Breger
b0eb8fb69c AVX512: Implemented encoding and intrinsics for vcmpss/sd.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248121 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 15:15:10 +00:00
Asaf Badouh
fe11ff1d50 [X86][AVX512] extend support in Scalar conversion
add scalar FP to Int conversion with truncation intrinsics
add scalar conversion FP32 from/to FP64 intrinsics
add rounding mode and SAE mode encoding for these intrinsics

Differential Revision: http://reviews.llvm.org/D12665

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 14:31:19 +00:00
Igor Breger
0d48c46954 AVX512: vsqrtss/sd encoding and intrinsics implementation.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248116 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 09:13:41 +00:00
Asaf Badouh
d3db2cf572 [X86][AVX512DQ] Add fpclass instruction
Differential Revision: http://reviews.llvm.org/D12931

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248115 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 08:46:07 +00:00
Michael Kuperstein
5610c28853 [X86] Fix sitofp and uitofp instruction matching failures with long double and avx512
The operation action for i32 and i64 cannot be set to legal, as long double 
needs custom lowering.

Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D12372


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248114 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 08:12:17 +00:00
Igor Breger
d425988995 AVX512: Implemented intrinsics for vshuff32x4, vshuff64x2, vshufi64x2, vshufi32x4
Added tests for intrinsics.

Differential Revision: http://reviews.llvm.org/D12525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248113 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 07:18:53 +00:00
Igor Breger
28cce2ad25 AVX512: Implement instructions encoding, lowering and intrinsics
vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4
Added tests for encoding, lowering and intrinsics.

Differential Revision: http://reviews.llvm.org/D11893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248111 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-20 06:52:42 +00:00
Simon Pilgrim
afa71f40bf [X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF
Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1))

Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x))

Differential Revision: http://reviews.llvm.org/D12663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248091 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-19 13:22:57 +00:00
Reid Kleckner
f946dd0412 [WinEH] Make funclet return instrs pseudo instrs
This makes catchret look more like a branch, and less like a weird use
of BlockAddress. It also lets us get away from
llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label
arithmetic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247936 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-17 20:43:47 +00:00
Elena Demikhovsky
2e5bf5535c AVX-512: shufflevector for i1 vectors <2 x i1> .. <64 x i1>
AVX-512 does not provide an instruction that shuffles mask register. So I do the following way:

mask-2-simd , shuffle simd , simd-2-mask

Differential Revision: http://reviews.llvm.org/D12727



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247876 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-17 06:53:12 +00:00
Eric Christopher
973f7aa32a constify the Function parameter to the TTI creation callback and
propagate to all callers/users/etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247864 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 23:38:13 +00:00
Reid Kleckner
436444d569 [WinEH] Rip out the landingpad-based C++ EH state numbering code
It never really worked, and the new code is working better every day.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247860 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 22:14:46 +00:00
Reid Kleckner
66ef931b43 [WinEH] Pull Adjectives and CatchObj out of the catchpad arg list
Clang now passes the adjectives as an argument to catchpad.

Getting the CatchObj working is simply a matter of threading another
static alloca through codegen, first as an alloca, then as a frame
index, and finally as a frame offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247844 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 20:16:27 +00:00
Reid Kleckner
1b86a3446f [WinEH] Skip state numbering when no EH pads are present
Otherwise we'd try to emit the thunk that passes the LSDA to
__CxxFrameHandler3. We don't emit the LSDA if there were no landingpads,
so we'd end up with an assembler error when trying to write the COFF
object.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247820 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 17:19:44 +00:00
Sanjay Patel
39490133e4 propagate fast-math-flags on DAG nodes
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, 
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: 
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.

This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I 
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.

This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.

Differential Revision: http://reviews.llvm.org/D12095



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247815 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 16:31:21 +00:00
Michael Kuperstein
196a1036c4 [X86] Do not generate 64-bit pops of 32-bit GPRs.
When trying emit a stack adjustments using pops, frame lowering selects an
arbitrary free GPR. It should always select one from an appropriate class...
This fixes PR24649.

Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D12609


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247785 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 11:27:20 +00:00
Michael Kuperstein
b82c32333b [X86] Fix emitEpilogue() to make less assumptions about pops
This is the mirror image of r242395.
When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that
deallocates stack space used for local allocations, it assumes that any sequence of pop
instructions from function exit backwards consists purely of restoring callee-save registers.

This may be false, since from some point backward, the pops may be clean-up of stack space
allocated for arguments to a call.

Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D12688


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247784 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 11:18:25 +00:00
Daniel Sanders
47b167dd84 Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Eric has replied and has demanded the patch be reverted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247702 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 16:17:27 +00:00
Daniel Sanders
9781f90c7e Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247692 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 14:08:28 +00:00
Daniel Sanders
a6aa0c3bcc Revert r247684 - Replace Triple with a new TargetTuple ...
LLDB needs to be updated in the same commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247686 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 13:46:21 +00:00
Daniel Sanders
7b82808e13 Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247683 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 13:17:40 +00:00
Daniel Sanders
c413998d28 Fix namespace indentation and missing blank lines before 'public:' in *MCAsmInfo.h. NFC.
This is to reduce noise in a following commit.

Also fixes a couple missing spaces before the reference operator.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247679 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 12:27:06 +00:00
Simon Pilgrim
29f50e9783 [X86][MMX] Added shuffle decodes for MMX/3DNow! shuffles.
Added shuffle decodes for MMX PUNPCK + PSHUFW shuffles.
Added shuffle decodes for 3DNow! PSWAPD shuffles.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247526 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-13 11:28:45 +00:00
Elena Demikhovsky
b21635658f AVX-512: Fixed a bug in OR/XOR operations for 512-bit FP values on KNL.
KNL does not have VXORPS, VORPS for 512-bit values.
I use integer VPXOR, VPOR that actually do the same.

X86ISD::FXOR/FOR are generated as a result of FSUB combining.

Differential Revision: http://reviews.llvm.org/D12753



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247523 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-13 08:15:15 +00:00
Sanjay Patel
d8fd22b7ec [x86] enable machine combiner reassociations for 128-bit vector logical integer insts (2nd try)
The changes in:
test/CodeGen/X86/machine-cp.ll
are just due to scheduling differences after some logic instructions were reassociated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247516 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 19:47:50 +00:00
Simon Pilgrim
b43332c8d1 [X86] Renamed lowerVectorShuffleAsUnpack NFCI.
Renamed to lowerVectorShuffleAsPermuteAndUnpack to make it clear that it lowers to more than just a UNPCK instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 18:26:47 +00:00
Simon Pilgrim
9114b92030 [X86] Moved lowerVectorShuffleWithUNPCK earlier to make reuse easier. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 16:03:06 +00:00
Sanjay Patel
69f08e598c revert r247506; need to verify changes in existing tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 15:27:31 +00:00
Sanjay Patel
e6c453bf2d [x86] enable machine combiner reassociations for 128-bit vector logical integer insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 14:58:04 +00:00
Akira Hatanaka
8e2b613ef0 Use function attribute "stackrealign" to decide whether stack
realignment should be forced.

With this commit, we can now force stack realignment when doing LTO and
do so on a per-function basis. Also, add a new cl::opt option
"stackrealign" to CommandFlags.h which is used to force stack
realignment via llc's command line.

Out-of-tree projects currently using -force-align-stack to force stack
realignment should make changes to attach the attribute to the functions
in the IR.

Differential Revision: http://reviews.llvm.org/D11814


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247450 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-11 18:54:38 +00:00
Ahmed Bougacha
74869be273 [CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit.
We used to have this magic "hasLoadLinkedStoreConditional()" callback,
which really meant two things:
- expand cmpxchg (to ll/sc).
- expand atomic loads using ll/sc (rather than cmpxchg).

Remove it, and, instead, introduce explicit callbacks:
- bool shouldExpandAtomicCmpXchgInIR(inst)
- AtomicExpansionKind shouldExpandAtomicLoadInIR(inst)

Differential Revision: http://reviews.llvm.org/D12557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247429 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-11 17:08:28 +00:00
Ahmed Bougacha
f3d2de3832 [CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind.
This lets us generalize its usage to the other atomic instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247428 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-11 17:08:17 +00:00
Reid Kleckner
3f3f976995 [WinEH] Push and pop EBP for 32-bit funclets
The Win32 EH runtime caller does not preserve EBP, even though it does
preserve the CSRs (EBX, ESI, EDI) for us. The result was that each
finally funclet call would leave the frame pointer off by 12 bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247348 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 22:00:02 +00:00
Hans Wennborg
07a3b97f20 Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
Except the changes that defined virtual destructors as =default, because that
ran into problems with GCC 4.7 and overriding methods that weren't noexcept.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247298 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 16:49:58 +00:00
Igor Breger
7744f3ca3f AVX512: Implemented encoding and intrinsics for
vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247276 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 12:54:54 +00:00
Hans Wennborg
2515069180 Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
This caused build breakges, e.g.
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247226 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 00:57:26 +00:00
Ahmed Bougacha
de3869437d [CodeGen] Make x86 nontemporal store patfrags generic. NFC.
To be used by other targets.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247225 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 00:53:15 +00:00
Reid Kleckner
75771885f8 [WinEH] Add codegen support for cleanuppad and cleanupret
All of the complexity is in cleanupret, and it mostly follows the same
codepaths as catchret, except it doesn't take a return value in RAX.

This small example now compiles and executes successfully on win32:
  extern "C" int printf(const char *, ...) noexcept;
  struct Dtor {
    ~Dtor() { printf("~Dtor\n"); }
  };
  void has_cleanup() {
    Dtor o;
    throw 42;
  }
  int main() {
    try {
      has_cleanup();
    } catch (int) {
      printf("caught it\n");
    }
  }

Don't try to put the cleanup in the same function as the catch, or Bad
Things will happen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247219 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 00:25:23 +00:00
Hans Wennborg
bfd007fd70 Fix Clang-tidy misc-use-override warnings, other minor fixes
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D12740

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247216 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 00:12:56 +00:00
Reid Kleckner
69d051c87d [SEH] Emit 32-bit SEH tables for the new EH IR
The 32-bit tables don't actually contain PC range data, so emitting them
is incredibly simple.

The 64-bit tables, on the other hand, use the same table for state
numbering as well as label ranges. This makes things more difficult, so
it will be implemented later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247192 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 21:10:03 +00:00
Renato Golin
792b67e240 Revert "AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding."
This reverts commit r247149, as it was breaking numerous buildbots of varied architectures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247177 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 19:44:40 +00:00
Matthias Braun
af5ff60200 Save LaneMask with livein registers
With subregister liveness enabled we can detect the case where only
parts of a register are live in, this is expressed as a 32bit lanemask.
The current code only keeps registers in the live-in list and therefore
enumerated all subregisters affected by the lanemask. This turned out to
be too conservative as the subregister may also cover additional parts
of the lanemask which are not live. Expressing a given lanemask by
enumerating a minimum set of subregisters is computationally expensive
so the best solution is to simply change the live-in list to store the
lanemasks as well. This will reduce memory usage for targets using
subregister liveness and slightly increase it for other targets

Differential Revision: http://reviews.llvm.org/D12442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247171 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 18:08:03 +00:00
Igor Breger
1676f595b1 AVX512: Implemented encoding and intrinsics for
vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247149 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 14:35:09 +00:00
Reid Kleckner
dff75493e8 [WinEH] Emit prologues and epilogues for funclets
Summary:
32-bit funclets have short prologues that allocate enough stack for the
largest call in the whole function. The runtime saves CSRs for the
funclet. It doesn't restore CSRs after we finally transfer control back
to the parent funciton via a CATCHRET, but that's a separate issue.
32-bit funclets also have to adjust the incoming EBP value, which is
what llvm.x86.seh.recoverframe does in the old model.

64-bit funclets need to spill CSRs as normal. For simplicity, this just
spills the same set of CSRs as the parent function, rather than trying
to compute different CSR sets for the parent function and each funclet.
64-bit funclets also allocate enough stack space for the largest
outgoing call frame, like 32-bit.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247092 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 22:44:41 +00:00
Derek Schuff
0d89d696f4 x32. Fixes a bug in how struct va_list is initialized in x32
Summary: This patch modifies X86TargetLowering::LowerVASTART so that
struct va_list is initialized with 32 bit pointers in x32. It also
includes tests that call @llvm.va_start() for x32.

Patch by João Porto

Subscribers: llvm-commits, hjl.tools
Differential Revision: http://reviews.llvm.org/D12346

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247069 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 20:51:31 +00:00
Derek Schuff
c76507d03c x32. Fixes a bug in i8mem_NOREX declaration.
The old implementation assumed LP64 which is broken for x32.  Specifically, the
MOVE8rm_NOREX and MOVE8mr_NOREX, when selected, would cause a 'Cannot emit
physreg copy instruction' error message to be reported.

This patch also enable the h-register*ll tests for x32.

Differential Revision: http://reviews.llvm.org/D12336

Patch by João Porto

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247058 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 19:47:15 +00:00
Andrew Kaylor
b25ffb37c6 Fix for bz24500: Avoid non-deterministic code generation triggered by the x86 call frame optimization
Patch by Dave Kreitzer

Differential Revision: http://reviews.llvm.org/D12620



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 18:18:46 +00:00
Igor Breger
b23094366e AVX512: kunpck encoding implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D12061

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247010 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 13:10:00 +00:00
Elena Demikhovsky
1c82e5f791 Removed an old comment, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247006 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 12:22:22 +00:00
Elena Demikhovsky
27828d7a5e compilation issue, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246983 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 07:34:06 +00:00
Elena Demikhovsky
758b9df87a fixed compilation issue, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246982 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 07:10:08 +00:00
Elena Demikhovsky
1e00496f88 AVX-512: Lowering for 512-bit vector shuffles.
Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer.

Differential Revision: http://reviews.llvm.org/D10683



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246981 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 06:38:21 +00:00
Reid Kleckner
412db355b3 Sink COFF.h MC include into .cpp files
This prevents MC clients from getting COFF.h, which conflicts with
winnt.h macros. Also a minor IWYU cleanup. Now the only public headers
including COFF.h are in Object, and they actually need it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246784 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-03 16:41:50 +00:00
Sanjay Patel
eb8298cfe1 [x86] enable machine combiner reassociations for scalar 'xor' insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246781 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-03 16:36:16 +00:00
Igor Breger
d951d3c8df AVX512: Implemented encoding and intrinsics for vplzcntq, vplzcntd, vpconflictq, vpconflictd
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11931

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246750 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-03 09:05:31 +00:00
Ahmed Bougacha
1522e8d8f8 [X86] Require 32-byte alignment for 32-byte VMOVNTs.
We used to accept (and even test, and generate) 16-byte alignment
for 32-byte nontemporal stores, but they require 32-byte alignment,
per SDM. Found by inspection.

Instead of hardcoding 16 in the patfrag, check for natural alignment.
Also fix the autoupgrade and the various tests.

Also, use explicit -mattr instead of -mcpu: I stared at the output
several minutes wondering why I get 2x movntps for the unaligned
case (which is the ideal output, but needs some work: see FIXME),
until I remembered corei7-avx implies +slow-unaligned-mem-32.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246733 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 23:25:39 +00:00
Ahmed Bougacha
074165218d [X86] Cleanup nontemporal fragments. NFCI.
We can chain other fragments to avoid repeating conditions.
This also fixes a potential bug (that realistically can't happen),
where we would match indexed nontemporal stores for i32/i64.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246719 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 22:27:38 +00:00
Sanjay Patel
ec44710063 [x86] fix allowsMisalignedMemoryAccesses() for 8-byte and smaller accesses
This is a continuation of the fix from:
http://reviews.llvm.org/D10662

and discussion in:
http://reviews.llvm.org/D12154

Here, we distinguish slow unaligned SSE (128-bit) accesses from slow unaligned
scalar (64-bit and under) accesses. Other lowering (eg, getOptimalMemOpType) 
assumes that unaligned scalar accesses are always ok, so this changes 
allowsMisalignedMemoryAccesses() to match that behavior.

Differential Revision: http://reviews.llvm.org/D12543


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246658 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 15:42:49 +00:00
Asaf Badouh
05859c7cbb [X86][AVX512VLBW] add support in byte shift and SAD
add byte shift left/right
add SAD - compute sum of absolute differences

Differential Revision: http://reviews.llvm.org/D12479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246654 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 14:21:54 +00:00
Igor Breger
1b50f7132b AVX512: Implemented encoding and intrinsics for VGETMANTPD/S , VGETMANTSD/S instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246642 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 11:18:55 +00:00
Igor Breger
191108c6b8 AVX512: Implemented encoding and intrinsics for vshufps/d.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11709

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 10:50:58 +00:00
Elena Demikhovsky
e1bb461f27 AVX-512: store <4 x i1> and <2 x i1> values in memory
Enabled DAG pattern lowering for SKX with DQI predicate.

Differential Revision: http://reviews.llvm.org/D12550



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246625 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 09:20:58 +00:00
Vedant Kumar
ec0cd29de8 [CodeGen] Fix FREM on 32-bit MSVC on x86
Patch by Dylan McKay!

Differential Revision: http://reviews.llvm.org/D12099

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246615 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-02 01:31:58 +00:00
Sanjay Patel
ac515c4087 rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI)
This is a follow-on suggested by:
http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 )
http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 )

This makes the attribute name match most of the existing lowering logic
and regression test expectations.

But the current use of this attribute is inconsistent; see the FIXME
comment for "allowsMisalignedMemoryAccesses()". That change will
result in functional changes and should be coming soon.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246585 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-01 20:51:51 +00:00
Igor Breger
c02bfc6060 AVX512: Implemented intrinsics for valign.
Differential Revision: http://reviews.llvm.org/D12526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246551 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-01 15:27:18 +00:00
Sanjay Patel
63384be23d [x86] enable machine combiner reassociations for scalar 'or' insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246481 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31 20:27:03 +00:00
Matthias Braun
023a6e3548 X86: Fix FastISel SSESelect register class
X86FastISel has been using the wrong register class for VBLENDVPS which
produces a VR128 and needs an extra copy to the target register. The
problem was already hit by the existing test cases when using
> llvm-lit -Dllc="llc -verify-machineinstr"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246461 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31 18:25:11 +00:00
Igor Breger
046f79fbb0 AVX512: ktest implemantation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246439 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31 13:30:19 +00:00
Igor Breger
c7aaf020ab AVX512: Implemented encoding and intrinsics for vdbpsadbw
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12491

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246436 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31 13:09:30 +00:00
Igor Breger
c21a0f3132 AVX512: kadd implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11973

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31 11:50:23 +00:00
Igor Breger
66973634a5 AVX512: Implemented encoding and intrinsics for vpalignr
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12270

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246428 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31 11:14:02 +00:00
Hal Finkel
16c92083ab [MIR Serialization] static -> static const in getSerializable*MachineOperandTargetFlags
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246376 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-30 08:07:29 +00:00
Vedant Kumar
21f084aa72 [X86] NFC: Clean up and clang-format a few lines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246340 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-28 21:59:00 +00:00
Sanjay Patel
4b1821fa36 [x86] enable machine combiner reassociations for scalar 'and' insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-28 14:09:48 +00:00
Reid Kleckner
c0e64ada5c [WinEH] Add some support for code generating catchpad
We can now run 32-bit programs with empty catch bodies.  The next step
is to change PEI so that we get funclet prologues and epilogues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246235 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-27 23:27:47 +00:00
Reid Kleckner
9c5b8a0117 [ms-inline-asm] Relax assertion around funky identifiers slightly
A corresponding clang change will make it so that clang can consume part
of an assembler token. The assembler treats '.' as an identifier
character while clang does not, so it's view of the token stream is a
little different.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246089 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 21:57:25 +00:00
Andrew Kaylor
ab3081c118 Expose hasLiveCondCodeDef as a member function of the X86InstrInfo class. NFC
This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC.

Patch by: Kevin B. Smith
Differential Revision: http://reviews.llvm.org/D12371




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246073 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 20:36:52 +00:00
Vedant Kumar
0c3c0acf23 [llvm-mc] Ignore opcode size prefix in 64-bit CALL disassembly
This is a fix for disassembling unusual instruction sequences in 64-bit
mode w.r.t the CALL rel16 instruction. It might be desirable to move the
check somewhere else, but it essentially mimics the special case
handling with JCXZ in 16-bit mode.

The current behavior accepts the opcode size prefix and causes the
call's immediate to stop disassembling after 2 bytes. When debugging
sequences of instructions with this pattern, the disassembler output
becomes extremely unreliable and essentially useless (if you jump midway
into what lldb thinks is a unified instruction, you'll lose %rip). So we
ignore the prefix and consume all 4 bytes when disassembling a 64-bit
mode binary.

Note: in Vol. 2A 3-99 the Intel spec states that CALL rel16 is N.S. N.S.
is defined as:

    Indicates an instruction syntax that requires an address override
    prefix in 64-bit mode and is not supported. Using an address
    override prefix in 64-bit mode may result in model-specific
    execution behavior. (Vol. 2A 3-7)

Since 0x66 is an operand override prefix we should be OK (although we
may want to warn about 0x67 prefixes to 0xe8). On the CPUs I tested
with, they all ignore the 0x66 prefix in 64-bit mode.

Patch by Matthew Barney!

Differential Revision: http://reviews.llvm.org/D9573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246038 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 16:20:29 +00:00
Matthias Braun
ec01af9135 FastISel: Factor out common code; NFC intended
This should be no functional change but for the record: For three cases
in X86FastISel this will change the order in which the FalseMBB and
TrueMBB of a conditional branch is addedd to the successor/predecessor
lists.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245997 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 01:38:00 +00:00
Charles Davis
7e96f0f6ff Make variable argument intrinsics behave correctly in a Win64 CC function.
Summary:
This change makes the variable argument intrinsics, `llvm.va_start` and
`llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows
inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch
I have to add support for GCC's `__builtin_ms_va_list` constructs.

Reviewers: nadav, asl, eugenis

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245990 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-25 23:27:41 +00:00
Sanjay Patel
af7c00f213 make fast unaligned memory accesses implicit with SSE4.2 or SSE4a
This is a follow-on from the discussion in http://reviews.llvm.org/D12154.

This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.

A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx

This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449

Before this patch, we used different store types depending on whether the example can be
lowered as a memset or not.

Differential Revision: http://reviews.llvm.org/D12288



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245950 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-25 16:29:21 +00:00
Michael Kuperstein
89024e86ad [X86] Remove references to _ftol2
As of r245924, _ftol2 is no longer used for fptoui on MS platforms.
Remove the dead code associated with it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245925 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-25 07:58:33 +00:00
Michael Kuperstein
f48b1beeec [X86] Fix fptoui conversions
This fixes two issues in x86 fptoui lowering.
1) Makes conversions from f80 go through the right path on AVX-512.
2) Implements an inline sequence for fptoui i64 instead of a library
call. This improves performance by 6X on SSE3+ and 3X otherwise.
Incidentally, it also removes the use of ftol2 for fptoui, which was
wrong to begin with, as ftol2 converts to a signed i64, producing
wrong results for values >= 2^63.

Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D11316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245924 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-25 07:42:09 +00:00
Steve King
46ff6da860 Pass function attributes instead of boolean in isIntDivCheap().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245921 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-25 02:31:21 +00:00
Matthias Braun
56dd2d0886 MachineBasicBlock: Add liveins() method returning an iterator_range
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245895 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-24 22:59:52 +00:00
Michael Zuckerman
59dfeede45 [X86] Add support for mmword memory operand size for Intel-syntax x86 assembly
Differential Revision: http://reviews.llvm.org/D12151


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245835 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-24 10:26:54 +00:00
Michael Zuckerman
7b854fda4a first commit to llvm
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245825 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-24 07:48:50 +00:00
Sanjay Patel
bf83737adb [x86] enable machine combiner reassociations for 256-bit vector min/max
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21 21:04:21 +00:00
Sanjay Patel
6113df3d73 remove 'FeatureSlowUAMem' from AMD CPUs based on 10H micro-arch or later
See discussion in D12154 ( http://reviews.llvm.org/D12154 ), AMD Software
Optimization Guides for 10H/12H/15H/16H, and Agner Fog's experimental data.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245733 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21 20:39:17 +00:00
Sanjay Patel
2071d7abd9 [x86] invert logic for attribute 'FeatureFastUAMem'
This is a 'no functional change intended' patch. It removes one FIXME, but adds several more.

Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any
sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however,
you can see that we're not consistent about this. Changing the name of the attribute makes it
clearer to see the logic holes.

Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute
to new chips; fast unaligned accesses have been standard for several generations of CPUs now.

Differential Revision: http://reviews.llvm.org/D12154



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245729 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21 20:17:26 +00:00
Sanjay Patel
b730bdf4e9 [x86] enable machine combiner reassociations for 128-bit vector min/max
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245715 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21 18:06:49 +00:00
Eric Christopher
32ce343fe6 Fix typo - symetric -> symmetric.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245705 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21 16:23:39 +00:00
Ahmed Bougacha
14fea5bf0d [X86] Look for scalar through one bitcast when lowering to VBROADCAST.
Fixes PR23464: one way to use the broadcast intrinsics is:

  _mm256_broadcastw_epi16(_mm_cvtsi32_si128(*(int*)src));

We don't currently fold this, but now that we use native IR for
the intrinsics (r245605), we can look through one bitcast to find
the broadcast scalar.

Differential Revision: http://reviews.llvm.org/D10557


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245613 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 21:02:39 +00:00
Ahmed Bougacha
ad0ddd8e01 [X86] Replace avx2 broadcast intrinsics with native IR.
Since r245605, the clang headers don't use these anymore.
r245165 updated some of the tests already; update the others, add
an autoupgrade, remove the intrinsics, and cleanup the definitions.

Differential Revision: http://reviews.llvm.org/D10555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245606 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 20:36:19 +00:00
Marina Yatsina
4ca59ab261 [X86] Fix FBLD and FBSTP
FBLD and FBSTP should receive TBYTE because it is defined as
FBLD m80
FBSTP m80

Differential Revision: http://reviews.llvm.org/D11748



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 11:51:24 +00:00
Marina Yatsina
8d1c57faee [X86] Fix bug in COMISD and COMISS definition in td files
COMISD should receive QWORD because it is defined as
 (V)COMISD xmm1, xmm2/m64

COMISS should receive DWORD because it is defined as
 (V)COMISS xmm1, xmm2/m32

Differential Revision: http://reviews.llvm.org/D11712



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245551 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 11:21:36 +00:00
David Majnemer
fa1aef3608 [X86] Fix the (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2)) fold
We didn't check for the necessary preconditions before folding a
mask/shift into a single mask.

This fixes PR24516.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 09:00:56 +00:00
Sanjay Patel
3b7c3d3fe9 [x86] enable machine combiner reassociations for scalar double-precision min/max
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 21:27:27 +00:00
Sanjay Patel
d81980d640 [x86] enable machine combiner reassociations for scalar single-precision maximums
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245504 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 21:18:46 +00:00
David Majnemer
26047ead4b [X86] Emit more efficient >= comparisons against 0
We don't do a great job with >= 0 comparisons against zero when the
result is used as an i8.

Given something like:
  void f(long long LL, bool *B) {
    *B = LL >= 0;
  }

We used to generate:
  shrq    $63, %rdi
  xorb    $1, %dil
  movb    %dil, (%rsi)

Now we generate:
  testq   %rdi, %rdi
  setns   (%rsi)

Differential Revision: http://reviews.llvm.org/D12136

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245498 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 20:51:40 +00:00
Bruno Cardoso Lopes
71a40e6fef [PeepholeOptimizer] Look through PHIs to find additional register sources
Reintroduce r245442. Remove an overly conservative assertion introduced
in r245442. We could replace the assertion to use `shareSameRegisterFile`
instead, but in that point in `insertPHI` we already lost the original
Def subreg to check against. So drop the assertion completely.

Original commit message:

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245479 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 18:53:36 +00:00
Derek Schuff
946eb8b5c6 x32. Fixes a bug in x32 exception handling.
This patch updates the X86 lowering so that the Exception Pointer and Selector
are 64-bit wide only if Subtarget.isTarget64BitLP64.

Patch by João Porto

Reviewers: dschuff, rnk
Differential Revision: http://reviews.llvm.org/D12111

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245454 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 16:28:21 +00:00
JF Bastien
36fdcfb93a x32. Fixes jmp %reg in x32
x32 has 32-bit pointers; x86-64 can't jmp %r32. This patch addresses this issue by explicitly zero-extending brind's target to 64-bits.

Author: jpp

Reviewers: jfb, dschuff, pavel.v.chupin

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D12112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245452 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 16:17:08 +00:00