Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.
Reviewers: arsenm
Subscribers: qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18602
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268143 91177308-0d34-0410-b5e6-96231b3b80d8
We can demonstrate the 'select' bug and fix with a simpler test case.
The merged weight values are already tested in another test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268139 91177308-0d34-0410-b5e6-96231b3b80d8
This moves some logic added to EarlyCSE in rL268120 into
`llvm::isInstructionTriviallyDead`. Adds a test case for DCE to
demonstrate that passes other than EarlyCSE can now pick up on the new
information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268126 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches EarlyCSE some basic properties of guard intrinsics:
- Guard intrinsics read all memory, but don't write to any memory
- After a guard has executed, the condition it was guarding on can be
assumed to be true
- Guard intrinsics on a constant `true` are no-ops
Reviewers: reames, hfinkel
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19578
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268120 91177308-0d34-0410-b5e6-96231b3b80d8
If a block has no successors because it ends in unreachable,
this was accessing an invalid iterator.
Also stop counting instructions that don't emit any
real instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268119 91177308-0d34-0410-b5e6-96231b3b80d8
matchSelectPattern attempts to see through casts which mask min/max
patterns from being more obvious. Under certain circumstances, it would
misidentify a sequence of instructions as a min/max because it assumed
that folding casts would preserve the result. This is not the case for
floating point <-> integer casts.
This fixes PR27575.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268086 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This removes the temporary call to isIntegratedAssemblerRequired() which was
added recently. It's effect is now acheived directly in the MipsTargetStreamer
hierarchy.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D19715
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268058 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The portion in MipsAsmParser is responsible for figuring out which expansion to
use, while the portion in MipsTargetStreamer is responsible for emitting it.
This allows us to remove the call to isIntegratedAssemblerRequired() which is
currently ensuring the effect of .cprestore only occurs when writing objects.
The small functional change is that the memory offsets are now correctly
printed as signed values.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D19714
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268042 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The goal is for each operand type to have its own parse function and
at the same time share common code for tracking state as different
instruction types share operand types (e.g. glc/glc_flat, etc).
Introduce parseAMDGPUOperand which can parse any optional operand.
DPP and Clamp/OMod have custom handling for now. Sam also suggested
to have class hierarchy for operand types instead of table. This
can be done in separate change.
Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps,
parseMubufOptionalOps, parseDPPOptionalOps.
Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class.
Rename AsmMatcher/InstPrinter methods accordingly.
Print immediate type when printing parsed immediate operand.
Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3).
Update tests.
Reviewers: tstellarAMD, SamWot, artem.tamazov
Subscribers: qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19584
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268015 91177308-0d34-0410-b5e6-96231b3b80d8
This was being treated the same as private, which has an immediate
offset. For unknown, it probably means it's for a computation not
actually being used for accessing memory, so it should not have a
nontrivial addressing mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268002 91177308-0d34-0410-b5e6-96231b3b80d8
In case of missing live intervals for a physical registers
getLanesWithProperty() would report 0 which was not a safe default in
all situations. Add a parameter to pass in a safe default.
No testcase because in-tree targets do not skip computing register unit
live intervals.
Also cleanup the getXXX() functions to not perform the
RequireLiveIntervals checks anymore so we do not even need to return
safe defaults.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267977 91177308-0d34-0410-b5e6-96231b3b80d8
We need to keep loop hints from the original loop on the new vector loop.
Failure to do this meant that, for example:
void foo(int *b) {
#pragma clang loop unroll(disable)
for (int i = 0; i < 16; ++i)
b[i] = 1;
}
this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the
hints that unrolling should be disabled, and then we'd unroll it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267970 91177308-0d34-0410-b5e6-96231b3b80d8
A bug was introduced when the code was refactored which resulted in a
bad memory access.
This fixes PR27565.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267953 91177308-0d34-0410-b5e6-96231b3b80d8
I closely followed the precedents set by the vectorizer:
* With -Rpass-missed, the loop is reported with further details pointing
to -Rpass--analysis.
* -Rpass-analysis reports the details why distribution has failed.
* Regardless of -Rpass*, when distribution fails for a loop where
distribution was forced with the pragma, a warning is produced according
to -Wpass-failed. In this case the analysis info is also printed even
without -Rpass-analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267952 91177308-0d34-0410-b5e6-96231b3b80d8
When inlining a call site with llvm.mem.parallel_loop_access metadata, this
metadata needs to be propagated to all cloned memory-accessing instructions.
Otherwise, inlining parts of the loop body will invalidate the annotation.
With this functionality, we now vectorize the following as expected:
void Body(int *res, int *c, int *d, int *p, int i) {
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
void Test(int *res, int *c, int *d, int *p, int n) {
int i;
#pragma clang loop vectorize(assume_safety)
for (i = 0; i < 1600; i++) {
Body(res, c, d, p, i);
}
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267949 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is the follow-up patch for http://reviews.llvm.org/D19436
* Update the discriminator reading algorithm to match the assignment algorithm.
* Add test to cover the new algorithm.
Reviewers: dnovillo, echristo, dblaikie
Subscribers: danielcdh, dblaikie, echristo, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19522
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267945 91177308-0d34-0410-b5e6-96231b3b80d8
This instruction is just a control flow marker - it should not
actually exist in the object file. Unfortunately, nothing catches
it before it gets to AsmPrinter. If integrated assembler is used,
it's considered to be a normal 4-byte instruction, and emitted as
an all-0 word, crashing the program. With external assembler,
a comment is emitted.
Fixed by setting Size to 0 and handling it in MCCodeEmitter - this
means the comment will still be emitted if integrated assembler
is not used.
This broke an ASan test, which has been disabled for a long time
as a result (see the discussion on D19657). We can reenable it
once this lands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267943 91177308-0d34-0410-b5e6-96231b3b80d8
We now read out the rest of the substreams from the DBI streams. One of
these substreams, the FileInfo substream, contains information about which
source files contribute to each module (aka compiland). This patch
additionally parses out the file information from that substream, and
dumps it in llvm-pdbdump.
Differential Revision: http://reviews.llvm.org/D19634
Reviewed by: ruiu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267928 91177308-0d34-0410-b5e6-96231b3b80d8
Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance".
This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267927 91177308-0d34-0410-b5e6-96231b3b80d8
ScheduleDAGMI::initQueues changes the RegionBegin to the first non-debug
instruction. Since it does not track register pressure, it does not affect
any RP trackers. ScheduleDAGMILive inherits initQueues from ScheduleDAGMI,
and it does reset the TopTPTracker in its schedule method. Any derived,
target-specific scheduler will need to do it as well, but the TopRPTracker
is only exposed as a "const" object to derived classes. Without the ability
to modify the tracker directly, this leaves a derived scheduler with a
potential of having the TopRPTracker out-of-sync with the CurrentTop.
The symptom of the problem:
void llvm::ScheduleDAGMILive::scheduleMI(llvm::SUnit *, bool):
Assertion `TopRPTracker.getPos() == CurrentTop && "out of sync"' failed.
Differential Revision: http://reviews.llvm.org/D19438
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267918 91177308-0d34-0410-b5e6-96231b3b80d8
The canonical form for allocas is a single allocation of the array type.
In case we see a non-canonical array alloca, make sure we aren't
replacing this with an array N times smaller.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267916 91177308-0d34-0410-b5e6-96231b3b80d8
Currently Mips::emitAtomicBinaryPartword() does not properly respect the
width of pointers. For MIPS64 this causes the memory address that the ll/sc
sequence uses to be truncated. At runtime this causes a segmentation fault.
This can be fixed by applying similar changes as r266204, so that a full 64bit
pointer is loaded.
Reviewers: dsanders
Differential Review: http://reviews.llvm.org/D19651
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267900 91177308-0d34-0410-b5e6-96231b3b80d8
The DWARF2 specification of DW_AT_bit_offset is ambiguous for
little-endian machines, but by restoring to the old behavior
we match what debuggers expect and what other popular compilers
generate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267896 91177308-0d34-0410-b5e6-96231b3b80d8
The DWARF2 specification of DW_AT_bit_offset was written from the perspective of
a big-endian machine with unclear semantics for other systems. DWARF4
deprecated DW_AT_bit_offset and introduced a new attribute DW_AT_data_bit_offset
that simply counts the number of bits from the beginning of the containing
entity regardless of endianness.
After this patch LLVM emits DW_AT_bit_offset for DWARF 2 or 3 and
DW_AT_data_bit_offset when DWARF 4 or later is requested.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267895 91177308-0d34-0410-b5e6-96231b3b80d8
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.
This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.
Differential Revision: http://reviews.llvm.org/D19614
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267873 91177308-0d34-0410-b5e6-96231b3b80d8
The DetectDeadLanes pass performs a dataflow analysis of used/defined
subregister lanes across COPY instructions and instructions that will
get lowered to copies. It detects dead definitions and uses reading
undefined values which are obscured by COPY and subregister usage.
These dead definitions cause trouble in the register coalescer which
cannot deal with definitions suddenly becoming dead after coalescing
COPY instructions.
For now the pass only adds dead and undef flags to machine operands. It
should be possible to extend it in the future to remove the dead
instructions and redo the analysis for the affected virtual
registers.
Differential Revision: http://reviews.llvm.org/D18427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267851 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see:
RFC: Implementing the Swift calling convention in LLVM and Clang
https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0
Reviewers: kbarton, manmanren, rjmccall, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19414
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267823 91177308-0d34-0410-b5e6-96231b3b80d8
Two problems, 1) for the last 4 bytes it would print them as separate bytes not a word
and 2) it would print the same last byte for those bytes less than a word.
rdar://25938224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267819 91177308-0d34-0410-b5e6-96231b3b80d8
This gets more data out of the DBI strema of the PDB. In
particular it extracts the metadata for the list of modules
(compilands) that this PDB contains info about, and adds support
for dumping these fields to llvm-pdbdump.
Differential Revision: http://reviews.llvm.org/D19570
Reviewed By: ruiu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267818 91177308-0d34-0410-b5e6-96231b3b80d8
The callseq_end node must be glued with the TLS calls, otherwise,
the generic code will miss the uses of the returned value and will
mark it dead.
Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI,
the pseudo uses the symbol address at this point not RDI and the
lowering will do the right thing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267797 91177308-0d34-0410-b5e6-96231b3b80d8
This was crashing llvm-objdump with -macho -objc-meta-data when trying dump a non-existent section.
So the test binary is simply created from an empty .s file compiled with: clang -arch armv7 empty.s -c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267782 91177308-0d34-0410-b5e6-96231b3b80d8
transferSuccessors() would LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.
Follow-up to r266339.
Unfortunately, it's tricky to catch this in the verifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267779 91177308-0d34-0410-b5e6-96231b3b80d8
transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.
The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.
Follow-up to r266679.
Unfortunately, it's tricky to catch this in the verifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267778 91177308-0d34-0410-b5e6-96231b3b80d8
At the moment we don't simplify PSRAV/PSRLV/PSLLV intrinsics to generic IR for constant shift amounts, but we could.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267777 91177308-0d34-0410-b5e6-96231b3b80d8
The sink cast machinery is supposed to sink casts as close to their user
as possible. However, an EH pad is the first instruction in it's basic
block. Don't sink if the user is an EH pad.
This fixes PR27536.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267767 91177308-0d34-0410-b5e6-96231b3b80d8
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.
Differential Revision: http://reviews.llvm.org/D19487
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267751 91177308-0d34-0410-b5e6-96231b3b80d8
Added support of TTMP quads.
Reworked M0 exclusion machinery for SMRD and similar instructions
to enable usage of TTMP registers in those instructions as destinations.
Tests added.
Differential Revision: http://reviews.llvm.org/D19342
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267733 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
llvm-symbolizer wants to get linkage names of functions for historical
reasons. Linkage names are only recorded in the PDB for public symbols,
and the linkage name is apparently stored separately in some "public
symbol" record. We had a workaround in PDBContext which would look for
such symbols when the user requested linkage names.
However, when given an address that was truly in a private function and
public funciton, we would accidentally find nearby public symbols and
return those function names. The fix is to look for both function
symbols and public symbols and only prefer the public symbol name if the
addresses of the symbols agree.
Fixes PR27492
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19571
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267732 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
So it appears that to guarantee some of the ordering requirements of a GLSL
memoryBarrier() executed in the shader, we need to emit an s_waitcnt.
(We can't use an s_barrier, because memoryBarrier() may appear anywhere in
the shader, in particular it may appear in non-uniform control flow.)
Reviewers: arsenm, mareko, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19203
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267729 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.
Differential Revision: http://reviews.llvm.org/D18523
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267725 91177308-0d34-0410-b5e6-96231b3b80d8
This test also runs on e.g. ARM-native builds when the X86 backend is also
built. This test produces code for the default instruction set, even though it
is in a "X86" sub-directory. Given that this test doesn't seem to be testing
anything architecture-specific, it seems it's best to adapt the check to not
check for an architecture-dependent value (the size of the function), rather
than hard-code the test to target x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267722 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
With the removal of support for lazy parsing of combined index summary
records (e.g. r267344), we no longer need to include the summary record
bitcode offset in the VST entries for definitions. Change the combined
index format to be similar to the per-module index format in using value
ids to cross-reference from the summary record to the VST entry (rather
than the summary record bitcode offset to cross-reference in the other
direction).
The visible changes are:
1) Add the value id to the combined summary records
2) Remove the summary offset from the combined VST records, which has
the following effects:
- No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all
combined index VST entries now only contain the value id and
corresponding GUID.
- No longer have duplicate VST entries in the case where there are
multiple definitions of a symbol (e.g. weak/linkonce), as they all
have the same value id and GUID.
An implication of #2 above is that in order to hook up an alias to the
correct aliasee based on the value id of the aliasee recorded in the
combined index alias record, we need to scan the entries in the index
for that GUID to find the one from the same module (i.e. the case where
there are multiple entries for the aliasee). But the reader no longer
has to maintain a special map to hook up the alias/aliasee.
Reviewers: joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19481
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267712 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
D19403 adds a new pragma for loop distribution. This change adds
support for the corresponding metadata that the pragma is translated to
by the FE.
As part of this I had to rethink the flag -enable-loop-distribute. My
goal was to be backward compatible with the existing behavior:
A1. pass is off by default from the optimization pipeline
unless -enable-loop-distribute is specified
A2. pass is on when invoked directly from opt (e.g. for unit-testing)
The new pragma/metadata overrides these defaults so the new behavior is:
B1. A1 + enable distribution for individual loop with the pragma/metadata
B2. A2 + disable distribution for individual loop with the pragma/metadata
The default value whether the pass is on or off comes from the initiator
of the pass. From the PassManagerBuilder the default is off, from opt
it's on.
I moved -enable-loop-distribute under the pass. If the flag is
specified it overrides the default from above.
Then the pragma/metadata can further modifies this per loop.
As a side-effect, we can now also use -enable-loop-distribute=0 from opt
to emulate the default from the optimization pipeline. So to be precise
this is the new behavior:
C1. pass is off by default from the optimization pipeline
unless -enable-loop-distribute or the pragma/metadata enables it
C2. pass is on when invoked directly from opt
unless -enable-loop-distribute=0 or the pragma/metadata disables it
Reviewers: hfinkel
Subscribers: joker.eph, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D19431
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267672 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r267665.
ASAN shows that there is a use of undefined value.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267668 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r267657, r267656, and r267655.
The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267664 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
It is incorrect to compare TripCount (which is BECount + 1)
with extraiters (or Count) to check if we should enter unrolled
loop or not, because TripCount can potentially overflow
(when BECount is max unsigned integer).
While comparing BECount with (Count - 1) is overflow safe and
therefore correct.
Reviewer: hfinkel
Differential Revision: http://reviews.llvm.org/D19256
From: Evgeny Stupachenko <evstupac@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267662 91177308-0d34-0410-b5e6-96231b3b80d8
It's probably the case for all 3 MMX users out there, but with
hand-crafted IR, you can trigger selection failures. Fix that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267652 91177308-0d34-0410-b5e6-96231b3b80d8
This effectively adds back the extractelt combine removed by r262358:
the direct case can still occur (because x86_mmx is special, see
r262446), but it's the indirect case that's now superseded by the
generic combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267651 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If the linker requested to preserve a linkonce function, we should
honor this even if we drop all uses.
Reviewers: dexonsmith
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19527
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267644 91177308-0d34-0410-b5e6-96231b3b80d8
When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but *then* we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case.
Instead, we know do the block scan only *after* we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so.
Note that the case where we *can't* prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267642 91177308-0d34-0410-b5e6-96231b3b80d8
the prologue.
Do not use basic blocks that have EFLAGS live-in as prologue if we need
to realign the stack. Realigning the stack uses AND instruction and this
clobbers EFLAGS.
An other alternative would have been to save and restore EFLAGS around
the stack realignment code, but this is likely inefficient.
Fixes PR27531.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267634 91177308-0d34-0410-b5e6-96231b3b80d8
Now, it is possible to know that partial definitions are dead definitions and
recognize that clobbered registers are also dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267621 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.
This patch builds on 267609 which did the same thing for unary casts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267620 91177308-0d34-0410-b5e6-96231b3b80d8
Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267618 91177308-0d34-0410-b5e6-96231b3b80d8
We need the default ratio to be sufficiently large that it triggers transforms
based on block frequency info (BFI) and plays well with the recently introduced
BranchProbability used by CGP.
Differential Revision: http://reviews.llvm.org/D19435
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267615 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.
This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations.
Differential Revision: http://reviews.llvm.org/D19492
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267609 91177308-0d34-0410-b5e6-96231b3b80d8
The destination buffer that sprintf uses is restrict qualified, we do
not need to worry about derived pointers referenced via format
specifiers.
This reverts commit r267580.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267605 91177308-0d34-0410-b5e6-96231b3b80d8
The DBI stream contains a lot of bookkeeping information for other
streams. In particular it contains information about section contributions
and linked modules. This patch is a first attempt at parsing some of the
information out of the DBI stream. It currently only parses and dumps the
headers of the DBI stream, so none of the module data or section
contribution data is pulled out.
This is just a proof of concept that we understand the basic properties of
the DBI stream's metadata, and followup patches will try to extract more
detailed information out.
Differential Revision: http://reviews.llvm.org/D19500
Reviewed By: majnemer, ruiu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267585 91177308-0d34-0410-b5e6-96231b3b80d8
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.
Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.
Differential Revision: http://reviews.llvm.org/D19337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267583 91177308-0d34-0410-b5e6-96231b3b80d8
A latent bug in llvm-objdump used the wrong format specifier on 32-bit
targets, causing the test to fail. This fixes the issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267582 91177308-0d34-0410-b5e6-96231b3b80d8
sprintf doesn't read or copy the terminating null byte from it's string
operands. sprintf will append it's own after processing all of the
format specifiers.
This fixes PR27526.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267580 91177308-0d34-0410-b5e6-96231b3b80d8
The Machine Instruction Verifier flagged some issues in the serialized MIR.
Adjust the input to correct them.
Fixes the remaining portion of PR27480.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267578 91177308-0d34-0410-b5e6-96231b3b80d8
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344
CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.
For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.
As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie,
50/50 taken/not-taken might still be 100% predictable.
Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.
Differential Revision: http://reviews.llvm.org/D19488
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267572 91177308-0d34-0410-b5e6-96231b3b80d8
Support for SDWA instructions for VOP1 and VOP2 encoding.
Not done yet:
- converters for support optional operands and modifiers
- VOPC
- sext() modifier
- intrinsics
- VOP2b (see vop_dpp.s)
- V_MAC_F32 (see vop_dpp.s)
Differential Revision: http://reviews.llvm.org/D19360
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267553 91177308-0d34-0410-b5e6-96231b3b80d8
If the linker specifically requested for a linkonce to be preserved,
we need to make sure we won't drop it even if all the uses in the
current module disappear.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267543 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary.
This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19301
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267519 91177308-0d34-0410-b5e6-96231b3b80d8
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.
To get the buildbots running I am marking these as UNSUPPORTED for now.
If this is fixed remove the UNSUPPORTED flag "powerpc64-unknown-linux-gnu".
In r267516 I marked these as XFAIL but they succeed on some of the bots
on stage1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267518 91177308-0d34-0410-b5e6-96231b3b80d8
When passed in via filename, this test will fail if the path to the test
has the strings "f1" and "f2" in somewhere. Pass the file through stdin
to prevent test failures due to coincidences in path names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267517 91177308-0d34-0410-b5e6-96231b3b80d8
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.
To get the buildbots running I am marking these as XFAIL for now.
If this is fixed remove the XFAIL flag "powerpc64-unknown-linux-gnu".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267516 91177308-0d34-0410-b5e6-96231b3b80d8
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267515 91177308-0d34-0410-b5e6-96231b3b80d8
I really thought we were doing this already, but we were not. Given this input:
void Test(int *res, int *c, int *d, int *p) {
for (int i = 0; i < 16; i++)
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent deferenceability) was not elided.
One subtlety: As implemented, it will still prefer to use a masked-load
instrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.
The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if conversion is okay.
Differential Revision: http://reviews.llvm.org/D19512
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267514 91177308-0d34-0410-b5e6-96231b3b80d8
While we're here, fix the comment and variable names to make it
clear that these are raw weights, not percentages.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267491 91177308-0d34-0410-b5e6-96231b3b80d8
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.
However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.
Thus, this expansion must check the endianness to use the correct
subregister.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267489 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is what was the "instcombine" portion of D14185, with an additional
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll).
The patch causes instcombine to replace sequences of extractelement-insertvalue-store
that act essentially like a bitcast followed by a store.
Differential review: http://reviews.llvm.org/D14260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267482 91177308-0d34-0410-b5e6-96231b3b80d8
log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit
variant cannot encode it. Therefore, set the subreg part accordingly.
[AArch64] Fix optimizeCondBranch logic.
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!
This fixes the last make check verifier issues for AArch64: PR27479.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267465 91177308-0d34-0410-b5e6-96231b3b80d8
Fix early exit from linkInModule. IRMover::move returns false on
success and true on error.
Add a few more cases of merged common linkage variables with
different sizes and alignments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267437 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a number of endianness issues as well as an ODR
violation that hopefully causes everything to be happy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267431 91177308-0d34-0410-b5e6-96231b3b80d8
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"
%typ = type { i32, i32 }
define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
%b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
%1 = load i32, i32* %b, align 4
%2 = ptrtoint i32* %b to i64
%3 = and i64 %2, -35184372088833
%4 = inttoptr i64 %3 to i32*
%_msld = load i32, i32* %4, align 4
%zzz = add i32 %1, %_msld
ret i32 %zzz
}
Fix this by checking ResNo.
I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them. In fact, they might not be triggerable at all,
at least with current targets. Still, better safe than sorry.
Differential Revision: http://reviews.llvm.org/D19202
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267420 91177308-0d34-0410-b5e6-96231b3b80d8
Commit r266977 was reason for failing LLVM test suite with error message: fatal error: error in backend: Cannot select: t17: i32 = rotr t2, t11 ...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267418 91177308-0d34-0410-b5e6-96231b3b80d8
We didn't have logic to correctly handle CFGs where there was more than
one EH-pad successor (these are novel with WinEH).
There were situations where a register was live in one exceptional
successor but not another but the code as written would only consider
the first exceptional successor it found.
This resulted in split points which were insufficiently early if an
invoke was present.
This fixes PR27501.
N.B. This removes getLandingPadSuccessor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267412 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch adds support for the X asm constraint.
To do this, we lower the constraint to either a "w" or "r" constraint
depending on the operand type (both constraints are supported on ARM).
Fixes PR26493
Reviewers: t.p.northover, echristo, rengolin
Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D19061
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267411 91177308-0d34-0410-b5e6-96231b3b80d8
The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!).
There seems to be no reason why we can't SRA constants too, so let's do it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267393 91177308-0d34-0410-b5e6-96231b3b80d8
If several regions cover the same area of code, we have to restore
the combined value for that area when return from a nested region.
This patch achieves that by combining regions before calling buildSegments.
Differential Revision: http://reviews.llvm.org/D18610
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267390 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This implements a new method of run-time checking the NoWrap
SCEV predicates, which should be easier to optimize and nicer
for targets that don't correctly handle multiplication/addition
of large integer types (like i128).
If the AddRec is {a,+,b} and the backedge taken count is c,
the idea is to check that |b| * c doesn't have unsigned overflow,
and depending on the sign of b, that:
a + |b| * c >= a (b >= 0) or
a - |b| * c <= a (b <= 0)
where the comparisons above are signed or unsigned, depending on
the flag that we're checking.
The advantage of doing this is that we avoid extending to a larger
type and we avoid the multiplication of large types (multiplying
i128 can be expensive).
Reviewers: sanjoy
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D19266
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267389 91177308-0d34-0410-b5e6-96231b3b80d8
ADD8TLS, a variant of add instruction used for initial-exec TLS,
currently accepts r0 as a source register. While add itself supports
r0 just fine, linker can relax it to a local-exec sequence, converting
it to addi - which doesn't support r0.
Differential Revision: http://reviews.llvm.org/D19193
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267388 91177308-0d34-0410-b5e6-96231b3b80d8
in a debug-info-bearing function has a debug location attached to it. Failure to
do so causes an "!dbg attachment points at wrong subprogram for function"
assertion failure when the inliner sets up inline scope info.
rdar://problem/25878916
This reaplies r267320 without changes after fixing an issue in the OpenMP IR
generator in clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267370 91177308-0d34-0410-b5e6-96231b3b80d8
This corrects the MI annotations for the stack adjustment following the __chkstk
invocation. We were marking the original SP usage as a Def rather than Kill.
The (new) assigned value is the definition, the original reference is killed.
Adjust the ISelLowering to mark Kills and FrameSetup as well.
This partially resolves PR27480.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267361 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267359 91177308-0d34-0410-b5e6-96231b3b80d8
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements
3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly
We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).
Differential Revision: http://reviews.llvm.org/D19318
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267357 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)
2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input
Differential Revision: http://reviews.llvm.org/D17490
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267356 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267346 91177308-0d34-0410-b5e6-96231b3b80d8
Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transfered to xmm.
This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267343 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes PR22248 on s390x. The previous attempt at this was D19101,
which was before LOAD_STACK_GUARD existed. Compared to the previous
version, this always emits a rather ugly block of 4 instructions, involving
a thread pointer load that can't be shared with other potential users.
However, this is necessary for SSP - spilling the guard value (or thread
pointer used to load it) is counter to the goal, since it could be
overwritten along with the frame it protects.
Differential Revision: http://reviews.llvm.org/D19363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267340 91177308-0d34-0410-b5e6-96231b3b80d8
Add tests for some missing cases to bitcode upgrade in r267296.
- DICompositeType with an 'elements:' field, which will cause it to be
involved in a cycle after the upgrade.
- A DIDerivedType that references a class in 'extraData:'.
I updated test/Bitcode/dityperefs-3.8.ll with the missing cases and
regenerated test/Bitcode/dityperefs-3.8.ll.bc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267332 91177308-0d34-0410-b5e6-96231b3b80d8
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267328 91177308-0d34-0410-b5e6-96231b3b80d8
in a debug-info-bearing function has a debug location attached to it. Failure to
do so causes an "!dbg attachment points at wrong subprogram for function"
assertion failure when the inliner sets up inline scope info.
rdar://problem/25878916
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267320 91177308-0d34-0410-b5e6-96231b3b80d8
Keeping as much as possible internal/private is
known to help the optimizer. Let's try to benefit from
this in ThinLTO.
Note: this is early work, but is enough to build clang (and
all the LLVM tools). I still need to write some lit-tests...
Differential Revision: http://reviews.llvm.org/D19103
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267317 91177308-0d34-0410-b5e6-96231b3b80d8
The option to control the emission of the new relocations
is -relax-relocations (blatantly copied from GNU as).
It can't be enabled by default because it breaks relatively
recent versions of ld.bfd/ld.gold (late 2015).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267307 91177308-0d34-0410-b5e6-96231b3b80d8
It seems we still have some ordering issue in the combined index
emission, but I can't figure out why right now.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267306 91177308-0d34-0410-b5e6-96231b3b80d8
This is a fixup for r267304.
The test was sensitive to the path in a subtle way:
the index in memory is sorted by GUID, which are hashes
that include the source filename for local globals.
Teresa recently added a directive at the IR level, so
we can specify it here to make the test independent of
the path.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267305 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
As discussed in D18298, some local globals can't
be renamed/promoted (because they have a section, or because
they are referenced from inline assembly).
To be able to detect naming collision, we need to keep around
the "GUID" using their original name without taking the linkage
into account.
Reviewers: tejohnson
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19454
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267304 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We are always importing the initializer for a GlobalVariable.
So if a GlobalVariable is in the export-list, we pull in any
refs as well.
Reviewers: tejohnson
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19102
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267303 91177308-0d34-0410-b5e6-96231b3b80d8
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*. It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.
Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType. The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.
This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata. Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.
The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267296 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This removes a couple of flags added to control this behavior, and
simply keeps all value names when save-temps is specified.
Reviewers: rafael
Subscribers: llvm-commits, pcc, davide
Differential Revision: http://reviews.llvm.org/D19384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267279 91177308-0d34-0410-b5e6-96231b3b80d8
Since forward references for uniqued node operands are expensive (and
those for distinct node operands are cheap due to
DistinctMDOperandPlaceholder), minimize forward references in uniqued
node operands.
Moreover, guarantee that when a cycle is broken by a distinct node, none
of the uniqued nodes have any forward references. In
ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs
first, delaying distinct nodes until all uniqued nodes have been
handled. This guarantees that uniqued nodes only have forward
references when there is a uniquing cycle (since r267276 changed
ValueEnumerator::organizeMetadata to partition distinct nodes in front
of uniqued nodes as a post-pass).
Note that a single uniqued subgraph can hit multiple distinct nodes at
its leaves. Ideally these would themselves be emitted in post-order,
but this commit doesn't attempt that; I think it requires an extra pass
through the edges, which I'm not convinced is worth it (since
DistinctMDOperandPlaceholder makes forward references quite cheap
between distinct nodes).
I've added two testcases:
- test/Bitcode/mdnodes-distinct-in-post-order.ll is just like
test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes
instead of uniqued ones. This confirms that, in the absence of
uniqued nodes, distinct nodes are still emitted in post-order.
- test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal
example where a naive post-order traversal would cause one uniqued
node to forward-reference another. IOW, it's the motivating test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267278 91177308-0d34-0410-b5e6-96231b3b80d8
When an operand of a distinct node hasn't been read yet, the reader can
use a DistinctMDOperandPlaceholder. This is much cheaper than forward
referencing from a uniqued node. Change
ValueEnumerator::organizeMetadata to partition distinct nodes and
uniqued nodes to reduce the overhead of cycles broken by distinct nodes.
Mehdi measured this for me; this removes most of the RAUW from the
importing step of -flto=thin, even after a WIP patch that removes
string-based DITypeRefs (introducing many more cycles to the metadata
graph).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267276 91177308-0d34-0410-b5e6-96231b3b80d8
Before we printed a warning to stderr and left the actual output stream in a
mess. This tries to print a .long or .short representation of what we saw (as
if there was a data-in-code directive).
This isn't guaranteed to restore synchronization in Thumb-mode (if the invalid
instruction was supposed to be 32-bits, we may be off-by-16 for the rest of the
function). But there's no certain way to deal with that, and it's invalid code
anyway (if the data really wasn't an instruction, the user can add proper
.data_in_code directives if they care)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267250 91177308-0d34-0410-b5e6-96231b3b80d8
Only one consumer (llvm-objdump) actually cared about the fact that there were
two triples. Others were actively working around the fact that the Triple
returned by getArch might have been invalid. As for llvm-objdump, it needs to
be acutely aware of both Triples anyway, so being generic in the exposed API is
no benefit.
Also rename the version of getArch returning a Triple. Users were having to
pass an unwanted nullptr to disambiguate the two, which was nasty.
The only functional change here is that armv7m and armv7em object files no
longer crash llvm-objdump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267249 91177308-0d34-0410-b5e6-96231b3b80d8
The dwo_name was added to dwo files to improve diagnostics in dwp, but
it confuses tools that attempt to load any dwo named by a dwo_name, even
ones inside dwos. Avoid this by keeping track of whether a unit is
already a dwo unit, and if so, not loading further dwos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267241 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than relying on the gmlt-like data emitted into the .o/executable
which only contains the simple name of any inlined functions, use the
.dwo file if present.
Test symbolication with/without a .dwo, and the old test that was
testing behavior when no gmlt-like data was present. (I haven't included
a test of non-gmlt-like data + no .dwo (that would be akin to
symbolication with no debug info) but we could add one for completeness)
The test was simplified a bit to be a little clearer (unoptimized, force
inline, using a function call as the inlined entity) and regenerated
with ToT clang. For the no-gmlt-like-data case, I modified Clang back to
its old behavior temporarily & the .dwo file is identical so it is
shared between the two executables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267227 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.
LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
Differential Revision: http://reviews.llvm.org/D18367
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267223 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code turned out to be completely correct when auditted. Thus, only minor code changes and adding a couple of tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267215 91177308-0d34-0410-b5e6-96231b3b80d8