Commit Graph

86689 Commits

Author SHA1 Message Date
Benjamin Kramer
75511fc092 Reflect the MC/MCDisassembler split on the include/ level.
No functional change, just moving code around.

llvm-svn: 258818
2016-01-26 16:44:37 +00:00
Sanjay Patel
906306d436 [LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x)
This is a step towards solving PR25892:
https://llvm.org/bugs/show_bug.cgi?id=25892

It won't handle the reported case. As noted by the 'TODO' comments in the patch, 
we need to relax the hasOneUse() constraint and also match patterns that include
memset_chk() and the llvm.memset() intrinsic in addition to memset().

Differential Revision: http://reviews.llvm.org/D16337

llvm-svn: 258816
2016-01-26 16:17:24 +00:00
Matthew Simpson
973e079b66 Revert "Reapply commit r258404 with fix"
This commit exposes a crash in computeKnownBits on the Chromium buildbots.
Reverting to investigate.

Reference: https://llvm.org/bugs/show_bug.cgi?id=26307
llvm-svn: 258812
2016-01-26 15:45:49 +00:00
Igor Laevsky
7660fa06ad Re-submit r256008 "Improve DWARFDebugFrame::parse to also handle __eh_frame."
Originally this change was causing failures on windows buildbots.
But those problems were fixed in r258806.

llvm-svn: 258811
2016-01-26 15:09:42 +00:00
Dan Gohman
5af9a1aad6 [WebAssembly] Fix a typo in a comment.
llvm-svn: 258810
2016-01-26 14:55:17 +00:00
Igor Laevsky
3044550590 [DebugInfo] Fix DWARFDebugFrame instruction operand ordering
We can't rely on the evalution order of function arguments.

Differential Revision: http://reviews.llvm.org/D16509

llvm-svn: 258806
2016-01-26 13:31:11 +00:00
Simon Pilgrim
0800d72a1a [X86][SSE] Add zero element and general 64-bit VZEXT_LOAD support to EltsFromConsecutiveLoads
This patch adds support for trailing zero elements to VZEXT_LOAD loads (and checks that no zero elts occur within the consecutive load).

It also generalizes the 64-bit VZEXT_LOAD load matching to work for loads other than 2x32-bit loads.

After this patch it will also be easier to add support for other basic load patterns like 32-bit VZEXT_LOAD loads, PMOVZX and subvector load insertion.

Differential Revision: http://reviews.llvm.org/D16217

llvm-svn: 258798
2016-01-26 09:30:08 +00:00
Craig Topper
6dd1abcce8 [X86] Mark LDS/LES as not being allowed in 64-bit mode.
Their opcodes are used as part of the VEX prefix in 64-bit mode. Clearly the disassembler implicitly decoded them as AVX instructions in 64-bit mode, but I think the AsmParser would have encoded them.

llvm-svn: 258793
2016-01-26 06:10:15 +00:00
Matt Arsenault
9094a66425 AMDGPU: Move AMDGPU intrinsics only used by R600
llvm-svn: 258790
2016-01-26 04:49:24 +00:00
Matt Arsenault
f5c703425b AMDGPU: Tidy minor td file issues
Make comments and indentation more consistent.

Rearrange a few things to be in a more consistent order,
such as organizing subtarget features from those describing
an actual device property, and those used as options.

llvm-svn: 258789
2016-01-26 04:49:22 +00:00
Matt Arsenault
51a14cbbc7 AMDGPU: Make v32i8/v64i8 illegal types
Old intrinsics were forcing these, but they have now all
been removed. This fixes large i8 vector operations generally
being broken.

llvm-svn: 258788
2016-01-26 04:43:48 +00:00
Matt Arsenault
b7742acaf4 AMDGPU: Remove old sample intrinsics
I did my best to try to update all the uses in tests that
just happened to use the old ones to the newer intrinsics.

I'm not sure I got all of the immediate operand conversions
correct, since the value seems to have been ignored by the
old pattern but I don't think it really matters.

llvm-svn: 258787
2016-01-26 04:38:08 +00:00
Matt Arsenault
581518df24 AMDGPU: Add new amdgcn intrinsics for cube instructions
More cleanup to try to get all intrinsics using the correct
amdgcn prefix that are as close to the instruction as possible.

llvm-svn: 258786
2016-01-26 04:29:56 +00:00
Matt Arsenault
667cd15c1c AMDGPU: Implement read_register and write_register intrinsics
Some of the special intrinsics now that now correspond to a instruction
also have special setting of some registers, e.g. llvm.SI.sendmsg sets
m0 as well as use s_sendmsg. Using these explicit register intrinsics
may be a better option.

Reading the exec mask and others may be useful for debugging. For this
I'm not sure this is entirely correct because we would want this to
be convergent, although it's possible this is already treated
sufficently conservatively.

llvm-svn: 258785
2016-01-26 04:29:24 +00:00
Matt Arsenault
97a3b39dcb AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now
Also move into backend intrinsics to discourage use of the old name.

llvm-svn: 258783
2016-01-26 04:14:16 +00:00
Dan Gohman
bf8e0c60a6 [WebAssembly] Optimize memcpy/memmove/memcpy calls.
These calls return their first argument, but because LLVM uses an intrinsic
with a void return type, they can't use the returned attribute. Generalize
the store results pass to optimize these calls too.

llvm-svn: 258781
2016-01-26 04:01:11 +00:00
Dan Gohman
82741c1205 [WebAssembly] Remove a completed entry from the README.txt.
llvm-svn: 258780
2016-01-26 03:43:48 +00:00
Dan Gohman
e694253126 [WebAssembly] Implement unaligned loads and stores.
Differential Revision: http://reviews.llvm.org/D16534

llvm-svn: 258779
2016-01-26 03:39:31 +00:00
Haicheng Wu
5302d65f58 [LIR] Add support for structs and hand unrolled loops
This is a recommit of r258620 which causes PR26293.

The original message:

Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258777
2016-01-26 02:27:47 +00:00
Reid Kleckner
0081990bfb Use binary search for intrinsic ID lookups
This improves compile time of Function.cpp from 57s to 37s for me
locally.  Intrinsic IDs are cached on the Function object, so this
shouldn't regress performance.

llvm-svn: 258774
2016-01-26 02:06:41 +00:00
Matthias Braun
036b32dcef LiveIntervalAnalysis: Improve some comments
As recommended by Justin.

llvm-svn: 258771
2016-01-26 01:40:48 +00:00
Reid Kleckner
b834dd994c Sort intrinsics by LLVM intrinsic name, rather than tablegen def name
Step one towards using a simple binary search to lookup intrinsic IDs
instead of our crazy table generated switch+memcmp+startswith code that
makes Function.cpp take about a minute to compile.  See PR24785 and
PR11951 for why we should do this.

The X86 backend contains tables that need to be sorted on intrinsic ID,
so reorder those.

llvm-svn: 258757
2016-01-26 00:55:00 +00:00
Matthias Braun
7d3caf5306 LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC
These two functions are hard to reason about. This commit makes the code
more comprehensible:

- Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut)
  with a fixed value instead of a changing iterator I that points to
  different things during the function.
- Remove the early explanation before the function in favor of more
  detailed comments inside the function. Should have more/clearer comments now
  stating which conditions are tested and which invariants hold at
  different points in the functions.

The behaviour of the code was not changed.

I hope that this will make it easier to review the changes in
http://reviews.llvm.org/D9067 which I will adapt next.

Differential Revision: http://reviews.llvm.org/D16379

llvm-svn: 258756
2016-01-26 00:43:50 +00:00
Dan Gohman
a72e83c26e [MC] Use .p2align instead of .align
For historic reasons, the behavior of .align differs between targets.
Fortunately, there are alternatives, .p2align and .balign, which make the
interpretation of the parameter explicit, and which behave consistently across
targets.

This patch teaches MC to use .p2align instead of .align, so that people reading
code for multiple architectures don't have to remember which way each platform
does its .align directive.

Differential Revision: http://reviews.llvm.org/D16549

llvm-svn: 258750
2016-01-26 00:03:25 +00:00
Philip Reames
75dc59c9a3 [GVN] Rearrange code to make local vs non-local cases more obvious [NFCI]
llvm-svn: 258747
2016-01-25 23:37:53 +00:00
Evgeniy Stepanov
258db6665b [cfi] Cross-DSO CFI diagnostic mode (LLVM part).
* __cfi_check gets a 3rd argument: ubsan handler data
* Instead of trapping on failure, call __cfi_check_fail which must be
  present in the module (generated in the frontend).

llvm-svn: 258746
2016-01-25 23:35:03 +00:00
Philip Reames
9298d3408c [GVN] Factor out common code [NFCI]
We had the same code duplicated for each type of Def.  We also have the entire block duplicated between the local and non-local case, but let's start with local cleanup.

llvm-svn: 258740
2016-01-25 23:19:12 +00:00
Matthias Braun
69950e68ba X86ISelLowering: Fix cmov(cmov) special lowering bug
There's a special case in EmitLoweredSelect() that produces an improved
lowering for cmov(cmov) patterns. However this special lowering is
currently broken if the inner cmov has multiple users so this patch
stops using it in this case.

If you wonder why this wasn't fixed by continuing to use the special
lowering and inserting a 2nd PHI for the inner cmov: I believe this
would incur additional copies/register pressure so the special lowering
does not improve upon the normal one anymore in this case.

This fixes http://llvm.org/PR26256 (= rdar://24329747)

llvm-svn: 258729
2016-01-25 22:08:25 +00:00
Teresa Johnson
a52cfbb4e3 [ThinLTO] Find all needed metadata when linking metadata as postpass
For metadata postpass linking, after importing all functions, we need
to recursively walk through any nodes reached via imported functions to
locate needed subprogram metadata. Some might only be reached indirectly
via the variable list for an inlined function.

llvm-svn: 258728
2016-01-25 22:04:56 +00:00
Simon Pilgrim
a63717e4c0 [X86][AVX] Add commutation support for VPERM2X128 instructions
Its main use is to allow memory folding of the 1st operand

Differential Revision: http://reviews.llvm.org/D16521

llvm-svn: 258726
2016-01-25 21:51:34 +00:00
Teresa Johnson
8b2ed4fc5e [ThinLTO] Handle DISubprogram reached indirectly from DIImportedEntity
Extend fix for PR26037 to identify DISubprogram reached from a
DIImportedEntity via a DILexicalBlock.

llvm-svn: 258722
2016-01-25 21:29:55 +00:00
Lawrence Hu
baafd4c214 Enable loopreroll to rerool loop with pointer induction variable.
Example:

while (buf !=end ) {
   S += buf[0];
   S += buf[1];
   buf +=2;
};

Differential Revision: http://reviews.llvm.org/D13151

llvm-svn: 258709
2016-01-25 19:43:45 +00:00
Lawrence Hu
0572a631ee Undo commit 258700 due to missing commit message
llvm-svn: 258708
2016-01-25 19:36:30 +00:00
Matthew Simpson
d9e4b63bf8 Reapply commit r25804 with fix
We were hitting an assertion because we were computing smaller type sizes for
instructions that cannot be demoted. The fix first determines the instructions
that will be demoted, and then applies the smaller type size to only those
instructions.

This should fix PR26239.

llvm-svn: 258705
2016-01-25 19:24:29 +00:00
Quentin Colombet
06230e1d45 Speculatively revert r258620 as it is the likely culprid of PR26293.
llvm-svn: 258703
2016-01-25 19:12:49 +00:00
Ivan Krasin
09873095d3 Temporary disable broken fuzzer/timeout tests.
Reviewers: kcc

Differential Revision: http://reviews.llvm.org/D16543

llvm-svn: 258702
2016-01-25 19:05:45 +00:00
Lawrence Hu
1cf7c9fba6 Differential Revision: http://reviews.llvm.org/D13151
llvm-svn: 258700
2016-01-25 18:53:39 +00:00
Dan Gohman
d5075eb344 [WebAssembly] Fix unbalanced register stack code in the case of late DCE.
Instructions can be DCE'd after the RegStackify pass. If the instruction which
would be the pop for what would be a push is removed, don't use a push.

llvm-svn: 258694
2016-01-25 16:48:44 +00:00
Dan Gohman
6aa3a18666 [WebAssembly] Minor code formatting cleanups. NFC.
llvm-svn: 258692
2016-01-25 15:12:05 +00:00
Dan Gohman
105451ecf0 [SelectionDAG] Use the correct return type for memcpy, memmove, and memset.
When generating calls to memcpy, memmove, and memset, use void* as the return
type rather than void, to match the standard signatures for these functions.

This has no practical effect for most targets, since the return values of
these calls aren't being used anyway, and most calling conventions tolerate
this kind of mismatch. However, this change will help support future
optimizations to utilize the return value to avoid holding the argument
value live across a call.

llvm-svn: 258691
2016-01-25 15:05:56 +00:00
James Molloy
fe66a6200d [DemandedBits] Fix computation of demanded bits for ICmps
The computation of ICmp demanded bits is independent of the individual operand being evaluated. We simply return a mask consisting of the minimum leading zeroes of both operands.

We were incorrectly passing "I" to ComputeKnownBits - this should be "UserI->getOperand(0)". In cases where we were evaluating the 1th operand, we were taking the minimum leading zeroes of it and itself.

This should fix PR26266.

llvm-svn: 258690
2016-01-25 14:49:36 +00:00
Michael Zuckerman
847379aa25 [AVX512] Adding PTESTNMB/D/W/Q instruction
Differential Revision: http://reviews.llvm.org/D16520

llvm-svn: 258688
2016-01-25 14:43:23 +00:00
Michael Zuckerman
5131ab0907 [AVX512] Adding PTESTMB/W/D/Q instruction
Differential Revision: http://reviews.llvm.org/D16519

llvm-svn: 258686
2016-01-25 13:27:32 +00:00
Bradley Smith
849b958836 [ARM] Add DSP build attribute and extension targeting
This patch was originally committed as r257885, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258683
2016-01-25 11:26:11 +00:00
Bradley Smith
28db0fcf02 [ARM] Add new system registers to ARMv8-M Baseline/Mainline
This patch was originally committed as r257884, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258682
2016-01-25 11:25:36 +00:00
Bradley Smith
bb33c3478a [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
This patch was originally committed as r257883, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258681
2016-01-25 11:24:47 +00:00
Asaf Badouh
ec3729528a [X86][IFMA] adding intrinsics and encoding for multiply and add of unsigned 52bit integer
VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators
 VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators

Differential Revision: http://reviews.llvm.org/D16407

llvm-svn: 258680
2016-01-25 11:14:24 +00:00
Oliver Stannard
43a6ec7452 [ARM] Add ARMv8.2-A FP16 scalar instructions
This was originally committed as r255762, but reverted as it broke windows
bots. Re-commitiing the exact same patch, as the underlying cause was fixed by
r258677.

ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.

These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.

Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.

New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.

Differential Revision: http://reviews.llvm.org/D15038

llvm-svn: 258678
2016-01-25 10:26:26 +00:00
Oliver Stannard
4ff6a9f924 [TableGen] Fix sort order of asm operand classes
This is a fix for https://llvm.org/bugs/show_bug.cgi?id=22796.

The previous implementation of ClassInfo::operator< allowed cycles of classes
such that x < y < z < x, meaning that a list of them cannot be correctly
sorted, and the sort order could differ with different standard libraries.

The original implementation sorted classes by ValueName if they were otherwise
equal. This isn't strictly necessary, but some backends seem to accidentally
rely on it. If I reverse this comparison I get 8 test failures spread across
the AArch64, Mips and X86 backends, so I have left it in until those backends
can be fixed.

There was one case in the X86 backend where the observable behaviour of the
assembler is changed by this patch. This was because some of the memory asm
operands were not marked as children of X86MemAsmOperand.

Differential Revision: http://reviews.llvm.org/D16141

llvm-svn: 258677
2016-01-25 10:20:19 +00:00
Junmo Park
2adffa1342 Silence a -Wparentheses warning; NFC.
llvm-svn: 258676
2016-01-25 10:17:17 +00:00
Igor Breger
66fa90c341 AVX1 : Enable vector masked_load/store to AVX1.
Use AVX1 FP instructions (vmaskmovps/pd) in place of the AVX2 int instructions (vpmaskmovd/q).

Differential Revision: http://reviews.llvm.org/D16528

llvm-svn: 258675
2016-01-25 10:17:11 +00:00
Michael Zuckerman
e13e338f45 [AVX512] [CMPPS ][ CMPPD ] Adding full Comparison Predicate names
X86AsmParser.cpp is missing full comparison predicate names for CMPPD and CMPPS Instructions.
X86AsmParser.cpp defines only the short names of the Comparison predicate that you can find in the following pdf:
https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
Page 5-61 table 5-3

Differential Revision: http://reviews.llvm.org/D16518

llvm-svn: 258671
2016-01-25 08:43:26 +00:00
Lang Hames
19769657b9 [Object][COFF] Revert r258665 - It doesn't do what I had intended.
I'm discussing the right approach for tracking visibility for COFF symbols on
the llvm-dev list.

llvm-svn: 258666
2016-01-25 01:21:45 +00:00
Lang Hames
84361d0d8a [Object][COFF] Set the generic SF_Exported flag on COFF exported symbols.
The ORC ObjectLinkingLayer uses this flag during symbol lookup. Failure to set
it causes all symbols to behave as if they were non-exported, which has caused
failures in the kaleidoscope tutorials on Windows. Raising the flag should
un-break the tutorials.

No test case yet - none of the existing command line tools for printing symbol
tables (llvm-nm, llvm-objdump) show the status of this flag, and I don't want to
change the format from these tools without consulting their owners. I'll send an
email to the dev-list to figure out the right way forward.

llvm-svn: 258665
2016-01-24 21:56:40 +00:00
David Majnemer
ea5702b618 [COFF] Simplify SetSectionName
Consolidate the code which handles string table offsets less than 999999
with the code for offsets less than 9999999.  While we are here,
simplify the code by not using sprintf to generate the string.

llvm-svn: 258664
2016-01-24 20:46:11 +00:00
David Majnemer
51c9237bd6 [LoopSimplify] Reuse changeToUnreachable
Use existing functionality provided in changeToUnreachable instead of
reinventing it in LoopSimplify.

No functionality change is intended.

llvm-svn: 258663
2016-01-24 19:32:52 +00:00
David Majnemer
58f71414f2 Fix build bot breakage
llvm-svn: 258661
2016-01-24 16:46:53 +00:00
Elena Demikhovsky
832e2d5858 Added Skylake client to X86 targets and features
Changes in X86.td:

I set features of Intel processors in incremental form: IVB = SNB + X HSW = IVB + X ..
I added Skylake client processor and defined it's features
FeatureADX was missing on KNL
Added some new features to appropriate processors SMAP, IFMA, PREFETCHWT1, VMFUNC and others

Differential Revision: http://reviews.llvm.org/D16357

llvm-svn: 258659
2016-01-24 10:41:28 +00:00
Amjad Aboud
01175f32ea Fixed few comments.
llvm-svn: 258658
2016-01-24 08:18:55 +00:00
Igor Breger
f91b2666bb AVX512: VMOVDQU8/16/32/64 (load) intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16137

llvm-svn: 258657
2016-01-24 08:04:33 +00:00
David Majnemer
5ab451ac92 Fix buildbot failures
llvm-svn: 258655
2016-01-24 06:40:37 +00:00
David Majnemer
bfc3671cd7 [SCCP] Remove duplicate code
SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.

No functionality change intended.

llvm-svn: 258654
2016-01-24 06:26:47 +00:00
David Majnemer
0fce247968 [InstCombine, SCCP] Consolidate code used to remove instructions
InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so.  Share the code between the two.

No functionality change is intended.

llvm-svn: 258653
2016-01-24 05:26:18 +00:00
David Majnemer
1fad37bd8d [WinEH] Don't miscompile cleanups which conditionally unwind to caller
A cleanup can have paths which unwind or end up in unreachable.
If there is an unreachable path *and* a path which unwinds to caller,
we would mistakenly inject an unwind path to a catchswitch on the
unreachable path.  This results in a verifier assertion firing because
the cleanup unwinds to two different places: to the caller and to the
catchswitch.

This occured because we used getCleanupRetUnwindDest to determine if the
cleanuppad had no cleanuprets.
This is incorrect, getCleanupRetUnwindDest returns null for cleanuprets
which unwind to caller.

llvm-svn: 258651
2016-01-23 23:54:33 +00:00
Manuel Jacob
5c24aef18d Remove duplicate documentation in ConstantFolding.cpp. NFC.
The documentation for these functions is already present in the header file.

llvm-svn: 258649
2016-01-23 22:49:54 +00:00
Manuel Jacob
746c7a15c8 Remove duplicate documentation in Attributes.cpp. NFC.
The documentation for these methods is already present in the header.

llvm-svn: 258648
2016-01-23 22:42:24 +00:00
Simon Pilgrim
bc4ea7d779 [SelectionDAG] Generalised the CONCAT_VECTORS creation to support BUILD_VECTOR and UNDEF folding.
llvm-svn: 258646
2016-01-23 22:27:54 +00:00
Simon Pilgrim
98e0c36faa [X86][SSE] Generalised TRUNC -> PACKSS/PACKUS code. NFC.
Generalised mask generation / subvector extraction to use the input/output types directly instead of an if/else through all the currently accepted types.

llvm-svn: 258645
2016-01-23 22:02:48 +00:00
Simon Pilgrim
47c949cc50 Tidied up TRUNC combine code. NFC.
Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable).

llvm-svn: 258644
2016-01-23 21:50:40 +00:00
Justin Lebar
67acdea900 [CUDA] Die gracefully when trying to output an LLVM alias.
Summary:
Previously, we would just output "foo = bar" in the assembly, and then
ptxas would choke.  Now we die before emitting any invalid code.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, jhen, tra

Differential Revision: http://reviews.llvm.org/D16490

llvm-svn: 258638
2016-01-23 21:12:20 +00:00
Justin Lebar
cc870065cb [CUDA] Make empty parameter lists in nvptx function decls easier to read.
Summary:
Before:

  .func  (.param .b32 func_retval0) _ZL21__nvvm_reflect_anchorv(

  )
  {

After:

  .func  (.param .b32 func_retval0) _ZL21__nvvm_reflect_anchorv()
  {

Reviewers: bkramer

Subscribers: llvm-commits, tra, jhen, echristo, jholewinski

Differential Revision: http://reviews.llvm.org/D16512

llvm-svn: 258637
2016-01-23 21:12:17 +00:00
Benjamin Kramer
820941c5f2 Don't check if a list is empty with ilist::size.
ilist::size() is O(n) while ilist::empty() is O(1)

llvm-svn: 258636
2016-01-23 20:58:09 +00:00
Kostya Serebryany
0c11655f17 [libFuzzer] add -abort_on_timeout option
llvm-svn: 258631
2016-01-23 19:34:19 +00:00
Akira Hatanaka
dfd425a3de [Bitcode] Insert the darwin wrapper at the beginning of a file when the
target is macho.

It looks like the check for macho was accidentally dropped in r132959.

I don't have a test case, but I'll add one if anyone knows how this can
be tested.

llvm-svn: 258627
2016-01-23 16:02:10 +00:00
Aaron Ballman
ab11289986 Silence a -Wparentheses warning; NFC.
llvm-svn: 258626
2016-01-23 15:42:21 +00:00
Simon Pilgrim
37a0ceb144 Added missing comment. NFC.
llvm-svn: 258624
2016-01-23 14:38:02 +00:00
Simon Pilgrim
0e48f0e5bb [X86][SSE] Remove INSERTPS dependencies from unreferenced operands.
If the INSERTPS zeroes out all the referenced elements from either of the 2 input vectors (and the input is not already UNDEF), then set that input to UNDEF to reduce dependencies.

llvm-svn: 258622
2016-01-23 13:37:07 +00:00
Haicheng Wu
9d77533d54 [LIR] Add support for structs and hand unrolled loops
Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258620
2016-01-23 06:52:41 +00:00
Matthias Braun
f692315c07 Inline variable into assert
Seems like some compilers still give unused variable warnings for
bool var = ...;
(void)var;
so I have to inline the variable.

llvm-svn: 258619
2016-01-23 06:49:29 +00:00
NAKAMURA Takumi
c47ea12e00 AArch64ISelLowering.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 258618
2016-01-23 06:34:59 +00:00
Junmo Park
d7f46a4f6c Remove extra whitespace. NFC.
llvm-svn: 258617
2016-01-23 06:34:36 +00:00
David Majnemer
f62478a34a [PruneEH] Don't try to insert a terminator after another terminator
LLVM's BasicBlock has a single terminator, it is not valid to have two.

llvm-svn: 258616
2016-01-23 06:00:44 +00:00
Manuel Jacob
865681354b Put space after pointer type in test. NFC.
llvm-svn: 258615
2016-01-23 05:47:34 +00:00
Matt Arsenault
fe8ee22547 AMDGPU: Remove more unused intrinsics
Replace tests with lrp with basic IR expansion

llvm-svn: 258612
2016-01-23 05:42:38 +00:00
David Majnemer
7a3addc91c [PruneEH] FuncletPads must not have undef operands
Instead of RAUW with undef, replace the first non-token instruction with
unreachable.

This fixes PR26263.

llvm-svn: 258611
2016-01-23 05:41:29 +00:00
David Majnemer
09858a3961 [PruneEH] Unify invoke and call handling in DeleteBasicBlock
No functionality change is intended.

llvm-svn: 258610
2016-01-23 05:41:27 +00:00
David Majnemer
0728f4a41f [PruneEH] Reuse code from removeUnwindEdge
PruneEH had functionality idential to removeUnwindEdge.
Consolidate around removeUnwindEdge.
No functionality change is intended.

llvm-svn: 258609
2016-01-23 05:41:22 +00:00
Matt Arsenault
38b08addbb AMDGPU: Move amdgcn intrinsic handling into SITargetLowering
llvm-svn: 258608
2016-01-23 05:32:20 +00:00
Matt Arsenault
f305746857 AMDGPU: Remove IntrNoMem from llvm.SI.sendmsg
This has side effects.

llvm-svn: 258607
2016-01-23 05:32:18 +00:00
Matt Arsenault
c3ec12b749 AMDGPU: Remove Feature64BitPtr
This is a leftover from AMDIL that doesn't do anything
and doesn't belong here.

llvm-svn: 258606
2016-01-23 05:32:14 +00:00
Matthias Braun
0892910f16 AArch64ISel: Fix ccmp code selection matching deep expressions.
Some of the conditions necessary to produce ccmp sequences were only
checked in recursive calls to emitConjunctionDisjunctionTree() after
some of the earlier expressions were already built. Move all checks over
to isConjunctionDisjunctionTree() so they are all checked before we
start emitting instructions.

Also rename some variable to better reflect their usage.

llvm-svn: 258605
2016-01-23 04:05:22 +00:00
Matthias Braun
da14179563 AArch64ISelLowering: Reduce maximum recursion depth of isConjunctionDisjunctionTree()
This function will exhibit exponential runtime (2**n) so we should
rather use a lower limit.

llvm-svn: 258604
2016-01-23 04:05:18 +00:00
Matthias Braun
a0ca239a6c Fix wrong indentation
llvm-svn: 258603
2016-01-23 04:05:16 +00:00
Derek Schuff
0558165991 [WebAssembly] Fix RegNumbering for the stack pointer
Previously it failed to add NumArgRegs to the offset and so clobbered an
already-used register. Now just start the numbering after the arg regs
and don't duplicate the add. Test coverage for this coming shortly with
the implementation of byval.

llvm-svn: 258597
2016-01-23 01:20:43 +00:00
Kostya Serebryany
548cef831b [libFuzzer] add more fields to DictionaryEntry to count the number of uses and successes
llvm-svn: 258589
2016-01-22 23:55:14 +00:00
David Majnemer
a2ed036c0a [WinEH] Let cleanups post-dominated by unreachable get executed
Cleanups in C++ are a little weird.  They are only guaranteed to be
reliably executed if, and only if, there is a viable catch handler which
can handle the exception.

This means that reachability of a cleanup is lexically determined by it
being nested with a try-block which unwinds to a catch.  It is *cannot*
be reasoned about by examining the control flow edges leaving a cleanup.

Usually this is not a problem.  It becomes a problem when there are *no*
edges out of a cleanup because we believed that code post-dominated by
the cleanup is dead.  In LLVM's case, this code is what informs the
personality routine about the presence of a suitable catch handler.
However, the lack of edges to that catch handler makes the handler
become unreachable which causes us to remove it.  By removing the
handler, the cleanup becomes unreachable.

Instead, inject a catch-all handler with every cleanup that has no
unwind edges.  This will allow us to properly unwind the stack.

This fixes PR25997.

llvm-svn: 258580
2016-01-22 23:20:43 +00:00
Kevin Enderby
9b924af8f7 Fix the code that leads to the incorrect trigger of the report_fatal_error()
in MachOObjectFile::getSymbolByIndex() when a Mach-O file has
a symbol table load command but the number of symbols are zero.

The code in MachOObjectFile::symbol_begin_impl() should not be
assuming there is a symbol at index 0, in cases there is no symbol
table load command or the count of symbol is zero.  So I also fixed
that.  And needed to fix MachOObjectFile::symbol_end_impl() to
also do the same thing for no symbol table or one with zero entries.

The code in MachOObjectFile::getSymbolByIndex() should trigger
the report_fatal_error() for programmatic errors for any index when
there is no symbol table load command and not return the end iterator.
So also fixed that. Note there is no test case as this is a programmatic
error.

The test case using the file macho-invalid-bad-symbol-index has
a symbol table load command with its number of symbols (nsyms)
is zero. Which was incorrectly testing the bad triggering of the
report_fatal_error() in in MachOObjectFile::getSymbolByIndex().

This test case is an invalid Mach-O file but not for that reason.
It appears this Mach-O file use to have an nsyms value of 11,
and what makes this Mach-O file invalid is the counts and
indexes into the symbol table of the dynamic load command
are now invalid because the number of symbol table entries
(nsyms) is now zero.  Which can be seen with the existing
llvm-obdump:

% llvm-objdump -private-headers macho-invalid-bad-symbol-index
…
Load command 4
     cmd LC_SYMTAB
 cmdsize 24
  symoff 4216
   nsyms 0
  stroff 4392
 strsize 144
Load command 5
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 8 (past the end of the symbol table)
     iextdefsym 8 (greater than the number of symbols)
     nextdefsym 2 (past the end of the symbol table)
      iundefsym 10 (greater than the number of symbols)
      nundefsym 1 (past the end of the symbol table)
...

And the native darwin tools generates an error for this file:

% nm macho-invalid-bad-symbol-index
nm: object: macho-invalid-bad-symbol-index truncated or malformed object (ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table)

I added new checks for the indexes and sizes for these in the
constructor of MachOObjectFile.  And added comments for what
would be a proper diagnostic messages.

And changed the test case using macho-invalid-bad-symbol-index
to test for the new error now produced.

Also added a test with a valid Mach-O file with a symbol table
load command where the number of symbols is zero that shows
the report_fatal_error() is not called.

llvm-svn: 258576
2016-01-22 22:49:55 +00:00
Ivan Krasin
ce1bcd8c31 Use std::piecewise_constant_distribution instead of ad-hoc binary search.
Summary:
Fix the issue with the most recently discovered unit receiving much less attention.

Note: this is the second attempt (prev: r258473). Now, libc++ build is fixed.

Reviewers: aizatsky, kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16487

llvm-svn: 258571
2016-01-22 22:28:27 +00:00
Weiming Zhao
e73494fde9 Fix LivePhysRegs::addLiveOuts
Summary:
The testing for returnBB was flipped which may cause ARM ld/st opt pass uses callee saved regs in returnBB when shrink-wrap is used.


Reviewers: t.p.northover, apazos, MatzeB

Subscribers: mcrosier, zzheng, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D16434

llvm-svn: 258569
2016-01-22 22:21:34 +00:00
Sanjay Patel
6ef10254c1 fix typos; NFC
llvm-svn: 258567
2016-01-22 22:09:41 +00:00
Matt Arsenault
3913a77bb9 AMDGPU: Add new name for barrier intrinsic
llvm-svn: 258558
2016-01-22 21:30:43 +00:00
Matt Arsenault
7a5e15697d AMDGPU: Rename intrinsics to use amdgcn prefix
The intrinsic target prefix should match the target name
as it appears in the triple.

This is not yet complete, but gets most of the important ones.
llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled
for compatability for now.

llvm-svn: 258557
2016-01-22 21:30:34 +00:00
Sergei Larin
7b219abac0 Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object.
Summary:
Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object.

A good example of improper behavior in the current implementation is section information associated with the GlobalObject. If a section was set for it, and GlobalOpt is creating/modifying a new object based on this one (often copying the original name), without this change new object will be placed in a default section, resulting in inappropriate properties of the new variable.
The argument here is that if customer specified a section for a variable, any changes to it that compiler does should not cause it to change that section allocation.
Moreover, any other properties worth representation in copyAttributesFrom() should also be propagated.

Reviewers: jmolloy, joker-eph, joker.eph

Subscribers: slarin, joker.eph, rafael, tobiasvk, llvm-commits

Differential Revision: http://reviews.llvm.org/D16074

llvm-svn: 258556
2016-01-22 21:18:20 +00:00
Sanjay Patel
ada0c1bc05 function names start with a lowercase letter; NFC
llvm-svn: 258552
2016-01-22 21:11:47 +00:00
Sanjoy Das
26d6272ad2 [PlaceSafepoints] Introduce a -spp-no-statepoints flag
Summary:
This change adds a `-spp-no-statepoints` flag to PlaceSafepoints that
bypasses the code that wraps newly introduced polls and existing calls
in gc.statepoint.  With `-spp-no-statepoints` enabled, PlaceSafepoints
effectively becomes a safpeoint **poll** insertion pass.

The eventual goal is to "constant fold" this option, along with
`-rs4gc-use-deopt-bundles` to `true`, once clients using gc.statepoint
are okay doing so.

Reviewers: pgavlin, reames, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16439

llvm-svn: 258551
2016-01-22 21:02:55 +00:00
Xinliang David Li
42a10f05d4 [PGO] Remove use of static variable. /NFC
Make the variable a member of  the writer trait object owned
now by the writer. Also use a different generator interface
to pass the infoObject from the writer. 

llvm-svn: 258544
2016-01-22 20:25:56 +00:00
Xinliang David Li
960dc56746 Revert 258486 -- for a better fix coming soon
llvm-svn: 258538
2016-01-22 19:53:31 +00:00
Matt Arsenault
2b88adb9bd AMDGPU: Fix crash with invariant markers
The promote alloca pass didn't handle these intrinsics and crashed.
These intrinsics should accept any address space, but for now just
erase them to avoid breaking.

llvm-svn: 258537
2016-01-22 19:47:54 +00:00
Jingyue Wu
90a1a65026 [NVPTX] expand mul_lohi to mul_lo and mul_hi
Summary: Fixes PR26186.

Reviewers: grosser, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D16479

llvm-svn: 258536
2016-01-22 19:47:26 +00:00
Ahmed Bougacha
7980e233f5 [AArch64] Simplify emitConditionalCompare calls. NFC.
Now that both callsites are identical, we can simplify the
prototype and make it easier to reason about the 2-CC case.

llvm-svn: 258534
2016-01-22 19:43:57 +00:00
Ahmed Bougacha
1c71a2aac6 [AArch64] Lower 2-CC FCCMPs (one/ueq) using AND'ed CCs.
The current behavior is incorrect, as the two CCs returned by
changeFPCCToAArch64CC, intended to be OR'ed, are instead used
in an AND ccmp chain.

Consider:
define i32 @t(float %a, float %b, float %c, float %d, i32 %e, i32 %f) {
  %cc1 = fcmp one float %a, %b
  %cc2 = fcmp olt float %c, %d
  %and = and i1 %cc1, %cc2
  %r = select i1 %and, i32 %e, i32 %f
  ret i32 %r
}

Assuming (%a < %b) and (%c < %d); we used to do:
  fcmp  s0, s1            # nzcv <- 1000
  orr   w8, wzr, #0x1     # w8 <- 1
  csel  w9, w8, wzr, mi   # w9 <- 1
  csel  w8, w8, w9, gt    # w8 <- 1
  fcmp  s2, s3            # nzcv <- 1000
  cset   w9, mi           # w9 <- 1
  tst    w8, w9           # (w8 & w9) == 1, so: nzcv <- 0000
  csel  w0, w0, w1, ne    # w0 <- w0

We now do:
  fcmp  s2, s3            # nzcv <- 1000
  fccmp s0, s1, #0, mi    #  mi, so: nzcv <- 1000
  fccmp s0, s1, #8, le    # !le, so: nzcv <- 1000
  csel  w0, w0, w1, pl    # !pl, so: w0 <- w1

In other words, we transformed:
  (c < d) &&  ((a < b) || (a > b))
into:
  (c < d) &&   (a u>= b) && (a u<= b)
whereas, per De Morgan's, we wanted:
  (c < d) && !((a u>= b) && (a u<= b))

Note that this problem doesn't occur in the test-suite.

changeFPCCToAArch64CC produces disjunct CCs; here, one -> mi/gt.
We can't represent that in the fccmp chain; it can't express
arbitrary OR sequences, as one comment explains:
  In general we can create code for arbitrary "... (and (and A B) C)"
  sequences.  We can also implement some "or" expressions, because
  "(or A B)" is equivalent to "not (and (not A) (not B))" and we can
  implement some  negation operations. [...] However there is no way
  to negate the result of a partial sequence.

Instead, introduce changeFPCCToANDAArch64CC, which produces the
conjunct cond codes:
- (a one b)
    == ((a olt b) || (a ogt b))
    == ((a ord b) && (a une b))
- (a ueq b)
    == ((a uno b) || (a oeq b))
    == ((a ule b) && (a uge b))

Note that, at first, one might think that, when PushNegate is true,
we should use the disjunct CCs, in effect doing:
  (a || b)
  = !(!a && !(b))
  = !(!a && !(b1 || b2))  <- changeFPCCToAArch64CC(b, b1, b2)
  = !(!a && !b1 && !b2)

However, we can take advantage of the fact that the CC is already
negated, which lets us avoid special-casing PushNegate and doing
the simpler to reason about:

  (a || b)
  = !(!a && (!b))
  = !(!a && (b1 && b2))   <- changeFPCCToANDAArch64CC(!b, b1, b2)
  = !(!a && b1 && b2)

This makes both emitConditionalCompare cases behave identically,
and produces correct ccmp sequences for the 2-CC fcmps.

llvm-svn: 258533
2016-01-22 19:43:54 +00:00
Ahmed Bougacha
3a901cfda8 [AArch64] Assert that CCMP isel didn't fail inconsistently.
We verify that the op tree is eligible for CCMP emission in
isConjunctionDisjunctionTree, but it's also possible that
emitConjunctionDisjunctionTree fails later.
The initial check is useful, as it avoids building nodes
that will get discarded.
Still, make sure that inconsistencies don't happen with
an assert.

llvm-svn: 258532
2016-01-22 19:43:43 +00:00
Sanjoy Das
a81b52c690 [RS4GC] Use OB_deopt instead of "deopt"
llvm-svn: 258529
2016-01-22 19:20:40 +00:00
Krzysztof Parzyszek
7ec3ade80f [Hexagon] Use general purpose registers to spill pred/mod registers into
Patch by Tobias Edler Von Koch.

llvm-svn: 258527
2016-01-22 19:15:58 +00:00
Matt Arsenault
8d0283f1a9 AMDGPU: Fix getArchTypePrefix
llvm-svn: 258525
2016-01-22 19:09:12 +00:00
Matt Arsenault
fdfc9419b0 AMDGPU: Rename some r600 intrinsics to use correct TargetPrefix
These ones aren't directly emitted by mesa and inserted by a pass.

llvm-svn: 258523
2016-01-22 19:00:09 +00:00
Matt Arsenault
2e8073cc66 AMDGPU: Remove unused R600 intrinsics
llvm-svn: 258522
2016-01-22 18:52:14 +00:00
David Majnemer
0533388931 [WinEH] Make collectFuncletMembers non-recursive
Use a worklist for the pre-order DFS instead of using recursion.
No functionality change is intended.

llvm-svn: 258521
2016-01-22 18:49:50 +00:00
Kevin Enderby
d50c4b11ba Fix MachOObjectFile::getSymbolName() to not call report_fatal_error()
but to return object_error::parse_failed.  Then made the code in llvm-nm
do for Mach-O files what is done in the darwin native tools which is to
print "bad string index" for bad string indexes.  Updated the error message
in the llvm-objdump test, and added tests to show llvm-nm prints
"bad string index" and a test to print the actual bad string index value
which in this case is 0xfe000002 when printing the fields as raw hex.

llvm-svn: 258520
2016-01-22 18:47:14 +00:00
Matt Arsenault
720ce3fd59 AMDGPU: Change control flow intrinsics to use amdgcn prefix
These aren't supposed to be used outside of the backend,
so there aren't any users to worry about.

llvm-svn: 258516
2016-01-22 18:42:55 +00:00
Matt Arsenault
d4ad2318d1 AMDGPU: Don't use separate mulhu/mulhs Pats
llvm-svn: 258515
2016-01-22 18:42:49 +00:00
Matt Arsenault
351925633e AMDGPU: Remove random TGSI intrinsic
I don't think this was ever used.

llvm-svn: 258514
2016-01-22 18:42:44 +00:00
Matt Arsenault
21c6e6f537 AMDGPU: Remove AMDGPU.fract intrinsic
Mesa doesn't use this, and this is pattern matched already
from fsub x, (ffloor x)

llvm-svn: 258513
2016-01-22 18:42:38 +00:00
Xinliang David Li
16253b4d49 [PGO] eliminate use of static variable
llvm-svn: 258486
2016-01-22 05:48:40 +00:00
JF Bastien
050cf771fb NFC WebAssembly: update links
I got a vanity URL, and moved the github waterfall repo.

llvm-svn: 258484
2016-01-22 04:21:49 +00:00
Dan Gohman
46980bada3 [SelectionDAG] Fold more offsets into GlobalAddresses
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.

llvm-svn: 258482
2016-01-22 03:57:34 +00:00
Manuel Jacob
714fa41ac7 Replace Type::getInt32Ty() and comparison by isIntegerTy(32). NFC.
llvm-svn: 258480
2016-01-22 03:30:27 +00:00
Ivan Krasin
7b4522dc59 Revert r258473 as it's breaking the build with libc++
Reviewers: kcc

Differential Revision: http://reviews.llvm.org/D16441

llvm-svn: 258479
2016-01-22 03:21:52 +00:00
Eduard Burtescu
a868f6e2ac [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16282

llvm-svn: 258478
2016-01-22 03:08:27 +00:00
Eduard Burtescu
cfc72ec986 [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16422

llvm-svn: 258477
2016-01-22 01:51:51 +00:00
Eduard Burtescu
636d36b9c9 [opaque pointer types] [NFC] gep_type_{begin,end} now take source element type and address space.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16436

llvm-svn: 258474
2016-01-22 01:33:43 +00:00
Ivan Krasin
db4009626d Use std::piecewise_constant_distribution instead of ad-hoc binary search.
Summary:
Fix the issue with the most recently discovered unit receiving much less attention.

Note: I had to change the seed for one test to make it pass. Alternatively,
the number of runs could be increased. I believe that the average time of
'foo' discovery is not increased, just seed=1 was particularly convenient
for the previous PRNG scheme used.

Reviewers: aizatsky, kcc

Subscribers: llvm-commits, kcc

Differential Revision: http://reviews.llvm.org/D16419

llvm-svn: 258473
2016-01-22 01:32:34 +00:00
Eduard Burtescu
0effa1afdd [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
Pirama Arumuga Nainar
2e5b2b3d41 Do not lower VSETCC if operand is an f16 vector
Summary:
SETCC with f16 vectors has OperationAction set to Expand but still gets
lowered to FCM* intrinsics based on its result type.  This patch skips
lowering of VSETCC if the operand is an f16 vector.

v4 and v8 tests included.

Reviewers: ab, jmolloy

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D15361

llvm-svn: 258471
2016-01-22 01:16:57 +00:00
Reid Kleckner
2ddae2ea6b Revert "[SelectionDAG] Fold more offsets into GlobalAddresses"
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
  #include <string>
  #include <iostream>
  static const char kData[] = "asdf jkl;";
  int main() {
    std::string s(kData + 3, sizeof(kData) - 3);
    std::cout << s << '\n';
  }

llvm-svn: 258465
2016-01-22 01:09:29 +00:00
Kostya Serebryany
f7155b3e82 [libFuzzer] don't do expensive memmem if the result will not be used
llvm-svn: 258462
2016-01-22 01:04:58 +00:00
Teresa Johnson
2a387148a1 [ThinLTO] Do metadata linking during batch function importing
Summary:
Since we are currently not doing incremental importing there is
no need to link metadata as a postpass. The module linker will
only link in the imported subroutines due to the functionality
added by r256003.

(Note that the metadata postpass linking functionalitiy is still
used by llvm-link, and may be needed here in the future if a more
incremental strategy is adopted.)

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16424

llvm-svn: 258458
2016-01-22 00:15:53 +00:00
Eduard Burtescu
42b3bd4662 [opaque pointer types] [NFC] Take advantage of get{Source,Result}ElementType when folding GEPs.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16302

llvm-svn: 258456
2016-01-21 23:42:06 +00:00
Sanjay Patel
ef7cae166d move function definitions so we don't need separate declarations ; NFCI
llvm-svn: 258455
2016-01-21 23:38:43 +00:00
Sanjay Patel
ff5da390f5 [LibCallSimplifier] refactor FP function signature checks ; NFCI
Use the helper function added in r258428.

The check should really be hoisted to the caller of all of these
optimize* functions, but that's another step.

llvm-svn: 258446
2016-01-21 22:58:01 +00:00
Sanjay Patel
7c9dc49b45 avoid variable shadowing; NFC
llvm-svn: 258445
2016-01-21 22:41:16 +00:00
Sanjay Patel
4a76c00379 remove unnecessary variable; NFC
llvm-svn: 258444
2016-01-21 22:31:18 +00:00
Reid Kleckner
4439f8e4ca Avoid unnecessary stack realignment in musttail thunks with SSE2 enabled
The X86 musttail implementation finds register parameters to forward by
running the calling convention algorithm until a non-register location
is returned. However, assigning a vector memory location has the side
effect of increasing the function's stack alignment. We shouldn't
increase the stack alignment when we are only looking for register
parameters, so this change conditionalizes it.

llvm-svn: 258442
2016-01-21 22:23:22 +00:00
Simon Pilgrim
6f240f4b49 [X86][SSE] Improve i16 splatting shuffles
Better handling of the annoying pshuflw/pshufhw ops which only shuffle lower/upper halves of a vector.

Added vXi16 unary shuffle support for cases where i16 elements (from the same half of the source) are being splatted to the whole of one of the halves. This avoids the general lowering case which must shuffle the 32-bit elements first - meaning that we used to end up with unnecessary duplicate pshuflw/pshufhw shuffles.

Note this has the side effect of a lot of SSSE3 test cases no longer needing to use PSHUFB, as it falls below the 3 op combine threshold for when PSHUFB is typically worth it. I've raised PR26183 to discuss if the threshold should be changed and whether we need to make it more specific to the target CPU.

Differential Revision: http://reviews.llvm.org/D14901

llvm-svn: 258440
2016-01-21 22:07:41 +00:00
Lang Hames
373875b04a [RuntimeDyld][AArch64] Add support for the MachO ARM64_RELOC_SUBTRACTOR reloc.
llvm-svn: 258438
2016-01-21 21:59:50 +00:00
David L Kreitzer
28ea778709 Fix for two constant propagation problems in GVN with the assume intrinsic
instruction.

Patch by Yuanrui Zhang.

Differential Revision: http://reviews.llvm.org/D16100

llvm-svn: 258435
2016-01-21 21:32:35 +00:00
Kevin Enderby
a1e729dabc Fix MachOObjectFile::getSymbolSection() to not call report_fatal_error()
but to return object_error::parse_failed.  Then made the code in llvm-nm
do for Mach-O files what is done in the darwin native tools which is to
print "(?,?)" or just "s" for bad section indexes.  Also added a test to show
it prints the bad section index of "42" when printing the fields as raw hex.

llvm-svn: 258434
2016-01-21 21:13:27 +00:00
Sanjay Patel
1087b8fb2a [LibCallSimplifier] don't get fooled by a fake fmin()
This is similar to the bug/fix:
https://llvm.org/bugs/show_bug.cgi?id=26211
http://reviews.llvm.org/rL258325

The fmin() test case reveals another bug caused by sloppy
code duplication. It will crash without this patch because
fp128 is a valid floating-point type, but we would think
that we had matched a function that used doubles.

The new helper function can be used to replace similar
checks that are used in several other places in this file.

llvm-svn: 258428
2016-01-21 20:19:54 +00:00
David Majnemer
4981d2326a [InstCombine] Simplify (x >> y) <= x
This commit extends the patterns recognised by InstSimplify to also handle (x >> y) <= x in the same way as (x /u y) <= x.

The missing optimisation was found investigating why LLVM did not optimise away bound checks in a binary search: https://github.com/rust-lang/rust/pull/30917

Patch by Andrea Canciani!

Differential Revision: http://reviews.llvm.org/D16402

llvm-svn: 258422
2016-01-21 18:55:54 +00:00
Chad Rosier
b9cedfb407 Partially revert "Add command line options to force function/loop alignments."
This partially reverts r256571 in favor of the solution in r258409.

llvm-svn: 258421
2016-01-21 18:49:15 +00:00