Commit Graph

36009 Commits

Author SHA1 Message Date
Adrian Prantl
ae61e8a345 Debug info: Support DWARF4 bitfields via DW_AT_data_bit_offset.
The DWARF2 specification of DW_AT_bit_offset was written from the perspective of
a big-endian machine with unclear semantics for other systems.  DWARF4
deprecated DW_AT_bit_offset and introduced a new attribute DW_AT_data_bit_offset
that simply counts the number of bits from the beginning of the containing
entity regardless of endianness.

After this patch LLVM emits DW_AT_bit_offset for DWARF 2 or 3 and
DW_AT_data_bit_offset when DWARF 4 or later is requested.

llvm-svn: 267895
2016-04-28 15:37:48 +00:00
Krzysztof Parzyszek
49d1f997e6 [RDF] Handle undefined registers in RDF copy propagation
When updating the graph, make sure that new uses without reaching defs
are handled correctly.

llvm-svn: 267891
2016-04-28 15:09:19 +00:00
Simon Pilgrim
04933872c3 [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.

This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.

Differential Revision: http://reviews.llvm.org/D19614

llvm-svn: 267873
2016-04-28 12:22:53 +00:00
Matthias Braun
18562ab366 CodeGen: Add DetectDeadLanes pass.
The DetectDeadLanes pass performs a dataflow analysis of used/defined
subregister lanes across COPY instructions and instructions that will
get lowered to copies. It detects dead definitions and uses reading
undefined values which are obscured by COPY and subregister usage.

These dead definitions cause trouble in the register coalescer which
cannot deal with definitions suddenly becoming dead after coalescing
COPY instructions.

For now the pass only adds dead and undef flags to machine operands. It
should be possible to extend it in the future to remove the dead
instructions and redo the analysis for the affected virtual
registers.

Differential Revision: http://reviews.llvm.org/D18427

llvm-svn: 267851
2016-04-28 03:07:16 +00:00
Sanjay Patel
9fec0afb7a Update test to use FileCheck
Also, add some metadata to show what that currently looks like.

llvm-svn: 267827
2016-04-28 00:29:27 +00:00
Bryan Chan
2567ab558c [SystemZ] Support Swift Calling Convention
Summary:
Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see:

RFC: Implementing the Swift calling convention in LLVM and Clang
https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0

Reviewers: kbarton, manmanren, rjmccall, uweigand

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19414

llvm-svn: 267823
2016-04-28 00:17:23 +00:00
Peter Collingbourne
2442815aeb LTO: Don't bother trying to mangle unnamed globals, as they can't be preserved with MustPreserveSymbols.
Summary: Should fix sanitizer-windows bot.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19635

llvm-svn: 267820
2016-04-27 23:48:11 +00:00
Kevin Enderby
b798753bdb Fix bugs in llvm-objdump printing the last word for -section in non i386 and x86 files.
Two problems, 1) for the last 4 bytes it would print them as separate bytes not a word
and 2) it would print the same last byte for those bytes less than a word.

rdar://25938224

llvm-svn: 267819
2016-04-27 23:43:00 +00:00
Zachary Turner
7ccba4f0bc Parse module information from DBI stream.
This gets more data out of the DBI strema of the PDB.  In
particular it extracts the metadata for the list of modules
(compilands) that this PDB contains info about, and adds support
for dumping these fields to llvm-pdbdump.

Differential Revision: http://reviews.llvm.org/D19570
Reviewed By: ruiu

llvm-svn: 267818
2016-04-27 23:41:42 +00:00
Rong Xu
3a8084117b more buildbot failure fix to r267792
__llvm_prf_nm length is embedded in llvm_used. Relax llvm_used check.

llvm-svn: 267816
2016-04-27 23:23:53 +00:00
Rong Xu
b6a36f2009 [PGO] Promote indirect calls to conditional direct calls with value-profile
This patch implements the transformation that promotes indirect calls to
conditional direct calls when the indirect-call value profile meta-data is
available.

Differential Revision: http://reviews.llvm.org/D17864

llvm-svn: 267815
2016-04-27 23:20:27 +00:00
Mitch Bodart
fde6615c97 [X86] Enable the post-RA-scheduler for clang's default 32-bit cpu.
For compilations with no explicit cpu specified, this exhibits
nice gains on Silvermont, with neutral performance on big cores.

Differential Revision: http://reviews.llvm.org/D19138

llvm-svn: 267809
2016-04-27 22:52:35 +00:00
Kevin Enderby
41f6b232d1 Fix a bug in llvm-objdump printing of 32-bit addresses for -section in non i386 and x86 files.
rdar://25896202

llvm-svn: 267807
2016-04-27 22:36:18 +00:00
Quentin Colombet
d6bb035737 [X86][FastISel] Make sure we use the right register class when we select stores.
llvm-svn: 267806
2016-04-27 22:33:42 +00:00
Rong Xu
db7372e42c Fix buildbot failure due to r267792
Relax the test check as some targets do not have name compression.

llvm-svn: 267803
2016-04-27 22:06:35 +00:00
Colin LeMahieu
7045424b87 [Hexagon] Merging nops in to previous packet rather than always creating a new one.
llvm-svn: 267798
2016-04-27 21:37:44 +00:00
Quentin Colombet
96d6f82ab0 [X86] Fix the lowering of TLS calls.
The callseq_end node must be glued with the TLS calls, otherwise,
the generic code will miss the uses of the returned value and will
mark it dead.
Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI,
the pseudo uses the symbol address at this point not RDI and the
lowering will do the right thing.

llvm-svn: 267797
2016-04-27 21:37:37 +00:00
Rong Xu
724ca3e5db [PGO] Prohibit address recording if the function is both internal and COMDAT
Differential Revision: http://reviews.llvm.org/D19515

llvm-svn: 267792
2016-04-27 21:17:30 +00:00
Matt Arsenault
982e737c85 AMDGPU: Account for globals in AMDGPUPromoteAlloca pass
Patch by Bas Nieuwenhuizen

llvm-svn: 267791
2016-04-27 21:05:08 +00:00
Kevin Enderby
3c1e719ae1 Add a test case for the crash fixed with r267037. David Blaikie said it would be nice to have!
This was crashing llvm-objdump with -macho -objc-meta-data when trying dump a non-existent section.
So the test binary is simply created from an empty .s file compiled with: clang -arch armv7 empty.s -c

llvm-svn: 267782
2016-04-27 20:37:06 +00:00
Ahmed Bougacha
e8bff14c32 [AArch64] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.

Follow-up to r266339.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267779
2016-04-27 20:33:02 +00:00
Ahmed Bougacha
492c1a346a [ARM] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.

The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.

Follow-up to r266679.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267778
2016-04-27 20:32:54 +00:00
Simon Pilgrim
7ece5e4f29 [InstCombine][AVX2] Add AVX2 per-element vector shift tests
At the moment we don't simplify PSRAV/PSRLV/PSLLV intrinsics to generic IR for constant shift amounts, but we could.

llvm-svn: 267777
2016-04-27 20:25:34 +00:00
Kevin B. Smith
1783031f2d [X86]: Quit promoting 16 bit loads to 32 bit.
Differential Revision: http://reviews.llvm.org/D19592

llvm-svn: 267773
2016-04-27 19:58:03 +00:00
David Majnemer
221bf8bc4d [CodeGenPrepare] Don't sink a cast past its user
The sink cast machinery is supposed to sink casts as close to their user
as possible.  However, an EH pad is the first instruction in it's basic
block.  Don't sink if the user is an EH pad.

This fixes PR27536.

llvm-svn: 267767
2016-04-27 19:36:38 +00:00
Ahmed Bougacha
591dc29993 [LIR] Set attributes on memset_pattern16.
"inferattrs" will deduce the attribute, but it will be too late for
many optimizations. Set it ourselves when creating the call.

Differential Revision: http://reviews.llvm.org/D17598

llvm-svn: 267762
2016-04-27 19:04:50 +00:00
Ahmed Bougacha
efd557cecd [InferAttrs] Mark memset_pattern16 params nocapture.
Differential Revision: http://reviews.llvm.org/D19471

llvm-svn: 267760
2016-04-27 19:04:43 +00:00
Chad Rosier
c6f107fe0e Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."
This reverts commit r267733 due to a -Werror,-Wunused-function error.

llvm-svn: 267752
2016-04-27 18:29:11 +00:00
Matthew Simpson
71a9d56171 [LV] Reallow positive-stride interleaved load groups with gaps
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.

Differential Revision: http://reviews.llvm.org/D19487

llvm-svn: 267751
2016-04-27 18:21:36 +00:00
Marcin Koscielnicki
1e17bfd3e5 [Mips] Add support for llvm.thread.pointer intrinsic.
This will be used to implement __builtin_thread_pointer in clang.

Differential Revision: http://reviews.llvm.org/D19569

llvm-svn: 267743
2016-04-27 17:21:49 +00:00
Gerolf Hoflehner
832835b3e9 [InstCombine] Sharpended test case in pr21210.ll
llvm-svn: 267742
2016-04-27 17:19:54 +00:00
Artem Tamazov
ed6f89bcdc [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.
Added support of TTMP quads.
Reworked M0 exclusion machinery for SMRD and similar instructions
to enable usage of TTMP registers in those instructions as destinations.
Tests added.

Differential Revision: http://reviews.llvm.org/D19342

llvm-svn: 267733
2016-04-27 16:20:23 +00:00
Reid Kleckner
d8e7f2f768 [PDB] Fix function names for private symbols in PDBs
Summary:
llvm-symbolizer wants to get linkage names of functions for historical
reasons. Linkage names are only recorded in the PDB for public symbols,
and the linkage name is apparently stored separately in some "public
symbol" record. We had a workaround in PDBContext which would look for
such symbols when the user requested linkage names.

However, when given an address that was truly in a private function and
public funciton, we would accidentally find nearby public symbols and
return those function names. The fix is to look for both function
symbols and public symbols and only prefer the public symbol name if the
addresses of the symbols agree.

Fixes PR27492

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19571

llvm-svn: 267732
2016-04-27 16:10:29 +00:00
Nicolai Haehnle
494b4aee1e AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic
Summary:
So it appears that to guarantee some of the ordering requirements of a GLSL
memoryBarrier() executed in the shader, we need to emit an s_waitcnt.

(We can't use an s_barrier, because memoryBarrier() may appear anywhere in
the shader, in particular it may appear in non-uniform control flow.)

Reviewers: arsenm, mareko, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19203

llvm-svn: 267729
2016-04-27 15:46:01 +00:00
Matthew Simpson
995eceaf0c [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

llvm-svn: 267725
2016-04-27 15:20:21 +00:00
Artem Tamazov
0b6855273a [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers.
Possibility to specify code of hardware register kept.
Disassemble to symbolic name, if name is known.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19335

llvm-svn: 267724
2016-04-27 15:17:03 +00:00
Nico Weber
b519b357d0 Revert r267649, it caused PR27539.
llvm-svn: 267723
2016-04-27 15:16:54 +00:00
Kristof Beyls
28b8dc2f09 Remove size 1 from check as that isn't part of what the test is meant to be testing.
This test also runs on e.g. ARM-native builds when the X86 backend is also
built.  This test produces code for the default instruction set, even though it
is in a "X86" sub-directory. Given that this test doesn't seem to be testing
anything architecture-specific, it seems it's best to adapt the check to not
check for an architecture-dependent value (the size of the function), rather
than hard-code the test to target x86.

llvm-svn: 267722
2016-04-27 15:03:09 +00:00
Teresa Johnson
1ec75cc513 [ThinLTO] Use valueid instead of bitcode offsets in combined index file
Summary:
With the removal of support for lazy parsing of combined index summary
records (e.g. r267344), we no longer need to include the summary record
bitcode offset in the VST entries for definitions. Change the combined
index format to be similar to the per-module index format in using value
ids to cross-reference from the summary record to the VST entry (rather
than the summary record bitcode offset to cross-reference in the other
direction).

The visible changes are:
1) Add the value id to the combined summary records
2) Remove the summary offset from the combined VST records, which has
the following effects:
- No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all
  combined index VST entries now only contain the value id and
  corresponding GUID.
- No longer have duplicate VST entries in the case where there are
  multiple definitions of a symbol (e.g. weak/linkonce), as they all
  have the same value id and GUID.

An implication of #2 above is that in order to hook up an alias to the
correct aliasee based on the value id of the aliasee recorded in the
combined index alias record, we need to scan the entries in the index
for that GUID to find the one from the same module (i.e. the case where
there are multiple entries for the aliasee). But the reader no longer
has to maintain a special map to hook up the alias/aliasee.

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19481

llvm-svn: 267712
2016-04-27 13:28:35 +00:00
Simon Pilgrim
3f41257831 [InstCombine][SSE] Regenerated vector shift tests
llvm-svn: 267699
2016-04-27 12:04:44 +00:00
Zlatko Buljan
92f1550331 [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU instructions
Differential Revision: http://reviews.llvm.org/D16676

llvm-svn: 267694
2016-04-27 11:31:44 +00:00
Zlatko Buljan
a2323fb2af [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, SRAV, SRL and SRLV instructions
Differential Revision: http://reviews.llvm.org/D17989

llvm-svn: 267693
2016-04-27 11:02:23 +00:00
Artur Pilipenko
e4f3081483 Use DL preferred alignment for alloca in Value::getPointerAlignment
Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type.

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D17569

llvm-svn: 267689
2016-04-27 10:42:29 +00:00
Simon Pilgrim
b8826cf36d [InstCombine][SSE] Added DemandedBits tests for MOVMSK instructions
MOVMSK zeros the upper bits of the gpr - we should be able to use this.

llvm-svn: 267686
2016-04-27 09:53:09 +00:00
Adam Nemet
ededcfa020 [LoopDist] Add llvm.loop.distribute.enable loop metadata
Summary:
D19403 adds a new pragma for loop distribution.  This change adds
support for the corresponding metadata that the pragma is translated to
by the FE.

As part of this I had to rethink the flag -enable-loop-distribute.  My
goal was to be backward compatible with the existing behavior:

  A1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute is specified

  A2. pass is on when invoked directly from opt (e.g. for unit-testing)

The new pragma/metadata overrides these defaults so the new behavior is:

  B1. A1 + enable distribution for individual loop with the pragma/metadata

  B2. A2 + disable distribution for individual loop with the pragma/metadata

The default value whether the pass is on or off comes from the initiator
of the pass.  From the PassManagerBuilder the default is off, from opt
it's on.

I moved -enable-loop-distribute under the pass.  If the flag is
specified it overrides the default from above.

Then the pragma/metadata can further modifies this per loop.

As a side-effect, we can now also use -enable-loop-distribute=0 from opt
to emulate the default from the optimization pipeline.  So to be precise
this is the new behavior:

  C1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute or the pragma/metadata enables it

  C2. pass is on when invoked directly from opt
  unless -enable-loop-distribute=0 or the pragma/metadata disables it

Reviewers: hfinkel

Subscribers: joker.eph, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19431

llvm-svn: 267672
2016-04-27 05:28:18 +00:00
Mehdi Amini
cc7d938331 Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator"
This reverts commit r267665.
ASAN shows that there is a use of undefined value.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267668
2016-04-27 05:11:44 +00:00
Mehdi Amini
6e6426ce69 Support "preserving" the summary information when using setModule() API in LTOCodeGenerator
Another attempt at r267655...

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267665
2016-04-27 04:24:10 +00:00
Mehdi Amini
4c94cd20e1 Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator"
This reverts commit r267657, r267656, and r267655.
The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267664
2016-04-27 03:34:28 +00:00
Evgeny Stupachenko
9d35c64dc2 The patch fixes PR27392.
Summary:
 It is incorrect to compare TripCount (which is BECount + 1)
  with extraiters (or Count) to check if we should enter unrolled
  loop or not, because TripCount can potentially overflow
  (when BECount is max unsigned integer).
 While comparing BECount with (Count - 1) is overflow safe and
  therefore correct.

Reviewer: hfinkel

Differential Revision: http://reviews.llvm.org/D19256

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 267662
2016-04-27 03:04:54 +00:00
Chuang-Yu Cheng
d389efdf8e [ppc64] fix bug in prologue that mfocrf's cr operand should be explict state instead of implicit
This fixes PR27414

Reviewers: kbarton mgrang tjablin

http://reviews.llvm.org/D19255

llvm-svn: 267660
2016-04-27 02:59:28 +00:00