During statepoint lowering we can sometimes avoid spilling of the value if we know that it was already spilled for previous statepoint.
We were doing this by checking if incoming statepoint value was lowered into load from stack slot. This was working only in boundaries of one basic block.
But instead of looking at the lowered node we can look directly at the llvm-ir value and if it was gc.relocate (or some simple modification of it) look up stack slot for it's derived pointer and reuse stack slot from it. This allows us to look across basic block boundaries.
Differential Revision: http://reviews.llvm.org/D10251
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239472 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10311
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239467 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10307
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239465 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10243
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239464 91177308-0d34-0410-b5e6-96231b3b80d8
fix segfault by checking for UnknownArch, since
getArchTypePrefix() will return nullptr for UnknownArch.
This fixes regression caused by r238424.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239456 91177308-0d34-0410-b5e6-96231b3b80d8
We have to do this manually, the runtime only sets up ebp. Fixes a crash
when returning after catching an exception.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239451 91177308-0d34-0410-b5e6-96231b3b80d8
Use a "safeseh" string attribute to do this. You would think we chould
just accumulate the set of personalities like we do on dwarf, but this
fails to account for the LSDA-loading thunks we use for
__CxxFrameHandler3. Each of those needs to make it into .sxdata as well.
The string attribute seemed like the most straightforward approach.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239448 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 2e449ec5bcdf67b52b315b16c2128aaf25d5b73c.
This was svn r239440. Its currently failing an ARM test so reverting while I work out
what to do next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239441 91177308-0d34-0410-b5e6-96231b3b80d8
It wasn't possible to have a variable Symbol with offset or 'isCommon' so
this just enables better packing of the MCSymbol class.
Reviewed by Rafael Espindola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239440 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The RegisterScavenger explicitly ignores <kill> flags on operands of
predicated instructions and therefore assumes that such registers remain
live. When it then scavenges such a register, it inserts a spill of this
(killed) register. This is invalid code and gets flagged up by the
verifier.
Nowadays kill flags are set correctly on predicated instructions. This
patch makes the Scavenger respect them.
The bug has so far only been triggered by an internal pass, so I don't
have a test case unfortunately.
Fixes PR23119.
Reviewers: hfinkel, tobiasvk_caf
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9039
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239439 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We used to assume V->RAUW only modifies the operand list of V's user.
However, if V and V's user are Constants, RAUW may replace and invalidate V's
user entirely.
This patch fixes the above issue by letting the caller replace the
operand instead of calling RAUW on Constants.
Test Plan: @nested_const_expr and @rauw in access-non-generic.ll
Reviewers: broune, jholewinski
Reviewed By: broune, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239435 91177308-0d34-0410-b5e6-96231b3b80d8
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.
The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239433 91177308-0d34-0410-b5e6-96231b3b80d8
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.
Previously, the read of w1 (see below) prevented the formation of a stp.
str w0, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
str w1, [x2, #4]
ret
We now generate the following code.
stp w0, w1, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
ret
All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239432 91177308-0d34-0410-b5e6-96231b3b80d8
Based on feedback to r239428 by David Blaikie, use const_cast to reduce
duplication of the const and non-const versions of getNameEntryPtr.
Also have that method return the pointer to the name directly instead
of users having to then get the name from the union.
Finally, add a FIXME that we should use a static_assert once available in
the new operator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239429 91177308-0d34-0410-b5e6-96231b3b80d8
This should hopefully fix the 32-bit bots which were allocating space for a pointer
but needed to be aligned to 64-bits.
Now we allocate enough space for a uint64_t and a pointer and cast to the appropriate storage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239428 91177308-0d34-0410-b5e6-96231b3b80d8
that was resetting it.
Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis.
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.
Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10099
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
Similarly to User which allocates a number of Use's prior to the this pointer,
allocate space for the Name* for MCSymbol only when we need a name.
Given that an MCSymbol is 48-bytes on 64-bit systems, this saves a decent % of space.
Given the verify_uselistorder test case with debug info and llc, 50k symbols have names
out of 700k so this optimises for the common case of temporary unnamed symbols.
Reviewed by David Blaikie.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239423 91177308-0d34-0410-b5e6-96231b3b80d8
array of bytes. The generation of this byte arrays was expecting
the host to be little endian, which prevents big endian hosts to be
used in the generation of the PTX code. This patch fixes the
problem by changing the way the bytes are extracted so that it
works for either little and big endian.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239412 91177308-0d34-0410-b5e6-96231b3b80d8
make_error_code(object_error) is slow because object::object_category()
uses a ManagedStatic variable. But the real problem is that the function is
called too frequently. This patch uses std::error_code() instead of
object_error::success. In most cases, we return "success", so this patch
reduces number of function calls to that function.
http://reviews.llvm.org/D10333
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239409 91177308-0d34-0410-b5e6-96231b3b80d8
Specified the llvm namespace for the 2 calls to make_unique() which caused
compilation errors in Visual Studio 2013.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239405 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For some branches, GAS accepts an immediate instead of the 2nd register operand.
We only implement this for BNE and BEQ for now. Other branch instructions can be added later, if needed.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: seanbruno, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D9666
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239396 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If malloc/realloc fails then the SmallVector becomes unusable since begin() and
end() will return NULL. This is unlikely to occur but was the cause of recent
bugpoint test failures on my machine.
It is not clear whether not checking for malloc/realloc failure is a deliberate
decision and adding checks has the potential to impact compiler performance.
Therefore, this patch only adds the check to builds with assertions enabled for
the moment.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: bkramer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9520
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239392 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: I noticed an object file with `DW_OP_reg4 DW_OP_breg4 0` as a DWARF expression,
which I traced to a missing break (and `++I`) in this code snippet.
While I was at it, I also added support for a few other corner cases
along the same lines that I could think of.
Test Plan: Hand-crafted test case to exercises these cases is included.
Reviewers: echristo, dblaikie, aprantl
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10302
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239380 91177308-0d34-0410-b5e6-96231b3b80d8
The following code triggers a fatal error in the compiler instrumentation
of ASan on Darwin because we place the attribute into llvm.metadata section,
which does not have the proper MachO section name.
void foo() __attribute__((annotate("custom")));
void foo() {;}
This commit reorders the checks so that we skip everything in llvm.metadata
first. It also removes the hard failure in case the section name does not
parse. That check will be done lower in the compilation pipeline anyway.
(Reviewed in http://reviews.llvm.org/D9093.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239379 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.
Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store.
Reviewers: eliben, jholewinski
Reviewed By: eliben, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239368 91177308-0d34-0410-b5e6-96231b3b80d8
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.
If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.
rdar://21265586
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239367 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This was a longstanding FIXME and is a necessary precursor to cases
where foldOperandImpl may have to create more than one instruction
(e.g. to constrain a register class). This is the split out NFC changes from
D6262.
Reviewers: pete, ributzka, uweigand, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, ted, llvm-commits
Differential Revision: http://reviews.llvm.org/D10174
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239336 91177308-0d34-0410-b5e6-96231b3b80d8
on a per-function basis.
Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.
rdar://problem/20542263
Differential Revision: http://reviews.llvm.org/D8717
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
The Fragment and Section, and a bool for HasFragment were all used to create
a PointerUnion. Just use a pointer union instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239324 91177308-0d34-0410-b5e6-96231b3b80d8
Also delete the now unused MCMachOSymbolFlags.h header as the only enum in there was moved to MCSymbolMachO.
Similarly to ELF and COFF, manipulating the flags is now done via helpers instead of spread
throughout the codebase.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239316 91177308-0d34-0410-b5e6-96231b3b80d8
All flags setting/getting is now done in the class with helper methods instead
of users having to get the bits in the correct order.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239314 91177308-0d34-0410-b5e6-96231b3b80d8
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.
Related to rdar://21042280
Differential Revision: http://reviews.llvm.org/D10260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
The global-merge pass was crashing because it assumes that all ConstantExprs
(reached via the global variables that they use) have at least one user.
I haven't worked out a way to test this, as an unused ConstantExpr cannot be
represented by serialised IR, and global-merge can only be run in llc, which
does not run any passes which can make a ConstantExpr dead.
This (reduced to the point of silliness) C code triggers this bug when compiled
for arm-none-eabi at -O1:
static a = 7;
static volatile b[10] = {&a};
c;
main() {
c = 0;
for (; c < 10;)
printf(b[c]);
}
Differential Revision: http://reviews.llvm.org/D10314
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239308 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239302 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
Assuming we have w writes and r reads, currently this number is estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.
This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.
Test Plan:
llvm test suite
spec2k
spec2k6
Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).
Reviewers: anemet
Reviewed By: anemet
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D10217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239295 91177308-0d34-0410-b5e6-96231b3b80d8
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor.
Also, this makes the code more universal - I'll take advandate of it in next patches where I start handling additional types of instructions.
Test Plan: Tests would be submitted in subsequent patches.
Reviewers: atrick, chandlerc
Reviewed By: atrick, chandlerc
Subscribers: atrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D10205
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239282 91177308-0d34-0410-b5e6-96231b3b80d8
No test since the kinds of transforms this prevents seem to not really
be relevant for SI's different addressing modes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239261 91177308-0d34-0410-b5e6-96231b3b80d8
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239229 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated. Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.
Reviewers: mzolotukhin, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10293
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239218 91177308-0d34-0410-b5e6-96231b3b80d8
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt. Instead,
it gets a ConstantExpr. It then assumes that the Constant must be equal
to false because it isn't equal to true.
Instead, perform an additional comparison.
This fixes PR23752.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239217 91177308-0d34-0410-b5e6-96231b3b80d8
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand. However, doing so is only valid if the
computation doesn't inject poison into the computation.
It might be helpful to consider the following example:
(select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
The select is equivalent to (add %i, 1) but not (add nsw %i, 1).
Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239215 91177308-0d34-0410-b5e6-96231b3b80d8
the overloaded version of addPass which takes Pass*.
This change enables inserting the machine printer pass when the overloaded
version of addPass that takes Pass* is called to add a pass, instead of the
one which takes AnalysisID. I need this to prevent make-check tests from
failing when I commit another patch later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239192 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least on ARM, increasing the
compilation time (stage 2, clang's) by 5x.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239175 91177308-0d34-0410-b5e6-96231b3b80d8
This change is NFC because both the ``break;`` and the fall through end
up returning immediately. However, this helps clarify intent and also
ensures correctness in case more ``case`` blocks are added later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239172 91177308-0d34-0410-b5e6-96231b3b80d8
For targets with a free fneg, this fold is always a net loss if it
ends up duplicating the multiply, so definitely avoid it.
This might be true for some targets without a free fneg too, but
I'll leave that for future investigation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239167 91177308-0d34-0410-b5e6-96231b3b80d8
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:
- '*Threshold' is the threshold for full unrolling. It is measured
against the estimated unrolled cost as computed by getUserCost in TTI
(or CodeMetrics, etc). We will exceed this threshold when unrolling
loops where unrolling exposes a significant degree of simplification
of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
estimated dynamic execution cost which needs to be saved by unrolling
to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
unrolling cost when the dynamic savings are expected to be high.
When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.
While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.
This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.
The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.
Differential Revision: http://reviews.llvm.org/D9966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239164 91177308-0d34-0410-b5e6-96231b3b80d8
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.
Differential Revision: http://reviews.llvm.org/D10238
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239151 91177308-0d34-0410-b5e6-96231b3b80d8
Add getFPUFeatures to TargetParser, which gets the list of subtarget features
that are enabled/disabled for each FPU, and use it when handling the .fpu
directive.
No functional change in this commit, though clang will start behaving
differently once it starts using this.
Differential Revision: http://reviews.llvm.org/D10237
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239150 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Only restoring AvailableFeatures is not enough and will lead to buggy behaviour.
For example, if we have a feature enabled and we ".set pop", the next time we try
to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])"
check will be false, because we didn't restore STI.FeatureBits.
In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits
instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop".
This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits,
but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits).
I also moved the updating of AssemblerOptions inside the "if" statement in
setFeatureBits() and clearFeatureBits(), as there is no reason to update if
nothing changes.
Reviewers: dsanders, mkuper
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9156
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239144 91177308-0d34-0410-b5e6-96231b3b80d8
isInductionPHI wants to calculate the stride based on the pointee size.
However, this is not possible when the pointee is zero sized.
This fixes PR23763.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239143 91177308-0d34-0410-b5e6-96231b3b80d8
Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into
CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using
update_llc_test_checks.py.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239142 91177308-0d34-0410-b5e6-96231b3b80d8
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
the value. One of the instructions is the original select, causing
LLVM to never reach fixpoint.
Instead, strip the flags only when we are sure we are going to perform
the simplification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239141 91177308-0d34-0410-b5e6-96231b3b80d8
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?
Likely, we shouldn't return nullptr?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239139 91177308-0d34-0410-b5e6-96231b3b80d8
When we generate coverage data, we explicitly set each coverage map's
alignment to 8 (See InstrProfiling::lowerCoverageData), but when we
read the coverage data, we assume consecutive maps are exactly
adjacent. When we're dealing with 32 bit, maps can end on a 4 byte
boundary, causing us to think the padding is part of the next record.
Fix this by adjusting the buffer to an appropriately aligned address
between records.
This is pretty awkward to test, as it requires a binary with multiple
coverage maps to hit, so we'd need to check in multiple source files
and a binary blob as inputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239129 91177308-0d34-0410-b5e6-96231b3b80d8
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison. However, the other operand cannot
have flags.
This fixes PR23757.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239115 91177308-0d34-0410-b5e6-96231b3b80d8
gc.statepoint intrinsics with a far immediate call target
were lowered incorrectly as pc-rel32 calls.
This change fixes the problem, and generates an indirect call
via a scratch register.
For example:
Intrinsic:
%safepoint_token = call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* inttoptr (i64 140727162896504 to void ()*), i32 0, i32 0, i32 0, i32 0)
Old Incorrect Lowering:
callq 140727162896504
New Correct Lowering:
movabsq $140727162896504, %rax
callq *%rax
In lowerCallFromStatepoint(), the callee-target was modified and
represented as a "TargetConstant" node, rather than a "Constant" node.
Undoing this modification enabled LowerCall() to generate the
correct CALL instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239114 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
A small bit that I missed when I updated the X86 backend to account for
the Win64 calling convention on non-Windows. Now we don't use dead
non-volatile registers when emitting a Win64 indirect tail call on
non-Windows.
Should fix PR23710.
Test Plan: Added test for the correct behavior based on the case I posted to PR23710.
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10258
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239111 91177308-0d34-0410-b5e6-96231b3b80d8
Report proper error code from MachOObjectFile constructor if we
can't parse another segment load command (we already return a proper
error if segment load command contents is suspicious).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239109 91177308-0d34-0410-b5e6-96231b3b80d8
The big/small ordering here is based on signed values so SmallValue will
be INT_MIN and BigValue 0. This shouldn't be a problem but the code
assumed that BigValue always had more bits set than SmallValue.
We used to just miss the transformation, but a recent refactoring of
mine turned this into an assertion failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239105 91177308-0d34-0410-b5e6-96231b3b80d8
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param
Added an llc test in bug21465.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239100 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
1) The only caller, ARMTargetParser::parseArch, uses the results for an "endswith" test; so, including the "arm" prefix into the result is unnecessary.
2) Most ARMTargetParser::parseArch callers pass it the output from ARMTargetParser::getCanonicalArchName; so, make this behaviour the default. Then, including the "arm" prefix into the cases is unnecessary.
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10249
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239099 91177308-0d34-0410-b5e6-96231b3b80d8
Basic block selection involves checking successor BBs for PHI nodes
that depend on the current BB. In case such BBs are found, the value
being selected is a constant and such constant already exists in
current BB, it's value is reused.
This might lead to wrong locations in some situations, especially if
same constant value ends up being materialized twice in two different
ways, which discards that sharing and leaves us with wrong debug
location in the successor BB.
In code this involves the following sequence of calls:
SelectionDAGBuilder::HandlePHINodesInSuccessorBlocks ->
SelectionDAGBuilder::CopyValueToVirtualRegister ->
SelectionDAGBuilder::getNonRegisterValue
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239089 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.
Differential Revision: http://reviews.llvm.org/D10054
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239087 91177308-0d34-0410-b5e6-96231b3b80d8
Fix the FIXME and remove this old as(1) compat option. It was useful for
bringup of the integrated assembler to diff object files, but now it's
just causing more relocations than strictly necessary to be generated.
rdar://21201804
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239084 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.
Minor changes:
1. refactor: extract function convertToPointerInAddrSpace
2. fix a bug in the test case in bug21465.ll
Test Plan: lower-kernel-ptr-arg.ll
Reviewers: eliben, meheff, jholewinski
Reviewed By: jholewinski
Subscribers: wengxt, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10154
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239082 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Properly report the error in segment load commands from MachOObjectFile
constructor instead of crashing the program.
Adjust the test case accordingly.
Test Plan: regression test suite
Reviewers: rafael, filcab
Subscribers: llvm-commits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239081 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently all load commands are parsed in MachOObjectFile constructor.
If the next load command cannot be parsed, or if command size is too
small, properly report it through the error code and fail to construct
the object, instead of crashing the program.
Test Plan: regression test suite
Reviewers: rafael, filcab
Subscribers: llvm-commits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239080 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Instead, properly report this error from MachOObjectFile constructor.
Test Plan: regression test suite
Reviewers: rafael
Subscribers: llvm-commits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239078 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Avoid parsing object file each time MachOObjectFile::getHeader() is
called. Instead, cache the header in MachOObjectFile constructor, where
it's parsed anyway. In future, we must avoid constructing the object
at all if the header can't be parsed.
Test Plan: regression test suite.
Reviewers: rafael
Subscribers: llvm-commits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239075 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
-march=bpf -> host endian
-march=bpf_le -> little endian
-match=bpf_be -> big endian
Test Plan:
v1 was tested by IBM s390 guys and appears to be working there.
It bit rots too fast here.
Reviewers: chandlerc, tstellarAMD
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239071 91177308-0d34-0410-b5e6-96231b3b80d8
Method 'visitBUILD_VECTOR' in the DAGCombiner knows how to combine a
build_vector of a bunch of extract_vector_elt nodes and constant zero nodes
into a shuffle blend with a zero vector.
However, method 'visitBUILD_VECTOR' forgot that a floating point
build_vector may contain negative zero as well as positive zero.
Example:
define <2 x double> @example(<2 x double> %A) {
entry:
%0 = extractelement <2 x double> %A, i32 0
%1 = insertelement <2 x double> undef, double %0, i32 0
%2 = insertelement <2 x double> %1, double -0.0, i32 1
ret <2 x double> %2
}
Before this patch, llc (with -mattr=+sse4.1) wrongly generated
movq %xmm0, %xmm0 # xmm0 = xmm0[0],zero
So, the sign bit of the negative zero was effectively lost.
This patch fixes the problem by adding explicit checks for positive zero.
With this patch, llc produces the following code for the example above:
movhpd .LCPI0_0(%rip), %xmm0
where .LCPI0_0 referes to a 'double -0'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239070 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we sometimes know the address space, this can
theoretically do a better job.
This needs better test coverage, but this mostly depends on
first updating the loop optimizatiosn to provide the address
space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239053 91177308-0d34-0410-b5e6-96231b3b80d8
When checking (High - Low + 1).sle(BitWidth), BitWidth would be truncated
to the size of the left-hand side. In the case of this PR, the left-hand
side was i4, so BitWidth=64 got truncated to 0 and the assert failed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239048 91177308-0d34-0410-b5e6-96231b3b80d8
Section symbols exist as an optimization: instead of having multiple relocations
point to different symbols, many of them can point to a single section symbol.
When that optimization is unused, a section symbol is also unused and adds no
extra information to the object file.
This saves a bit of space on the object files and makes the output of
llvm-objdump -t easier to read and consequently some tests get quite a bit
simpler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239045 91177308-0d34-0410-b5e6-96231b3b80d8
If the compare in a select pattern has another use then it can't be removed, so we'd just
be creating repeated code if we created a min/max node.
Spotted by Matt Arsenault!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239037 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is the first of several patches to eliminate StringRef forms of GNU
triples from the internals of LLVM. After this is complete, GNU triples
will be replaced by a more authoratitive representation in the form of
an LLVM TargetTuple.
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: ted, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10236
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239036 91177308-0d34-0410-b5e6-96231b3b80d8
Removed the redundant "llvm::" from class names in InstrProfiling.cpp
clang-format is ran on the changes.
Patch from Betul Buyukkurt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239034 91177308-0d34-0410-b5e6-96231b3b80d8
The fix is just that getOther had not been updated for packing the st_other
values in fewer bits and could return spurious values:
- unsigned Other = (getFlags() & (0x3f << ELF_STO_Shift)) >> ELF_STO_Shift;
+ unsigned Other = (getFlags() & (0x7 << ELF_STO_Shift)) >> ELF_STO_Shift;
Original message:
Pack the MCSymbolELF bit fields into MCSymbol's Flags.
This reduces MCSymolfELF from 64 bytes to 56 bytes on x86_64.
While at it, also make getOther/setOther easier to use by accepting unshifted
STO_* values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239012 91177308-0d34-0410-b5e6-96231b3b80d8
This reduces MCSymolfELF from 64 bytes to 56 bytes on x86_64.
While at it, also make getOther/setOther easier to use by accepting unshifted
STO_* values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239006 91177308-0d34-0410-b5e6-96231b3b80d8
port it to the new pass manager.
All this does is extract the inner "location" class used by AA into its
own full fledged type. This seems *much* cleaner as MemoryDependence and
soon MemorySSA also use this heavily, and it doesn't make much sense
being inside the AA infrastructure.
This will also make it much easier to break apart the AA infrastructure
into something that stands on its own rather than using the analysis
group design.
There are a few places where this makes APIs not make sense -- they were
taking an AliasAnalysis pointer just to build locations. I'll try to
clean those up in follow-up commits.
Differential Revision: http://reviews.llvm.org/D10228
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239003 91177308-0d34-0410-b5e6-96231b3b80d8
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.
The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.
This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.
The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.
Differential Revision: http://reviews.llvm.org/D8982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239001 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now users don't have to manually deal with getFirstLoadCommandInfo() /
getNextLoadCommandInfo(), calculate the number of load segments, etc.
No functionality change.
Test Plan: regression test suite
Reviewers: rafael, lhames, loladiro
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238983 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids yet another last minute patching of the binding.
While at it, also simplify the weakref implementation a bit by not walking
past it in the expression evaluation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238982 91177308-0d34-0410-b5e6-96231b3b80d8
With this getBinging can now return the correct answer for all cases not
involving a .symver and the elf writer doesn't need to patch it last minute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238980 91177308-0d34-0410-b5e6-96231b3b80d8
_GLOBAL_OFFSET_TABLE_ is not magical and we can now directly check for a
symbol never getting an explicit binding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238978 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.
Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.
Differential Revision: http://reviews.llvm.org/D9694
Recommiting after the revert in r238821, the buildbot still failed with
the patch removed so there seems to be another reason for the breakage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238935 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
But still handle them the same way since I don't know how they differ on
this target.
Of these, /U[qytnms]/ do not have backend tests but are accepted by clang.
No functional change intended.
Reviewers: t.p.northover
Reviewed By: t.p.northover
Subscribers: t.p.northover, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D8203
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238921 91177308-0d34-0410-b5e6-96231b3b80d8
Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake.
It is a part of KNL and SKX sets. It is also a part of Skylake client.
I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238916 91177308-0d34-0410-b5e6-96231b3b80d8
The windows buildbot originally failed because the check expressions are
evaluated as 64-bit values, even for 32-bit symbols. Fixed this by comparing
bottom 32-bits of the expressions.
The host/target endian mismatch issue is that it's invalid to read/write target
values using a host pointer without taking care of endian differences between
the target and host. Most (if not all) instances of
reinterpret_cast<uint32_t*>() in the RuntimeDyld are examples of this bug.
This has been fixed for Mips using the endian aware read/write functions.
The original commits were:
r238838:
[mips] Add RuntimeDyld tests for currently supported O32 relocations.
Reviewers: petarj, vkalintiris
Reviewed By: vkalintiris
Subscribers: vkalintiris, llvm-commits
Differential Revision: http://reviews.llvm.org/D10126
r238844:
[mips][mcjit] Add support for R_MIPS_PC32.
Summary:
This allows us to resolve relocations for DW_EH_PE_pcrel TType encodings
in the exception handling LSDA.
Also fixed a nearby typo.
Reviewers: petarj, vkalintiris
Reviewed By: vkalintiris
Subscribers: vkalintiris, llvm-commits
Differential Revision: http://reviews.llvm.org/D10127
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238915 91177308-0d34-0410-b5e6-96231b3b80d8