When targetting MIPS64R6 some of the patterns for select were guarded by a
broken predicate. The predicate was supposed to test if a constant value
could fit in a 16 bit zero-extended field. Instead the value was tested to
fit in a 16 bit sign-extended field. For negative constants of native word
width this resulted in wrong code generation.
Reviewers: vkalintiris, dsanders
Differential Review: http://reviews.llvm.org/D19378
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267151 91177308-0d34-0410-b5e6-96231b3b80d8
r267049 broke multiple buildbots (e.g. clang-cmake-mips, and clang-x86_64-linux-selfhost-modules) which the follow-ups have not yet resolved and this is preventing subsequent committers from being notified about additional failures on the affected buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267148 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When optimizing PHIs which have inputs floating point binary
operators, we preserve all IR flags except the fast math
flags.
This change removes the logic which tracked some of the IR flags
(no wrap, exact) and replaces it by doing an and on the IR flags of
all inputs to the PHI - which will also handle the fast math
flags.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267139 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
operations during DAG combine. This generates better code for truncate, so cost
of truncate needs to be changed but looks like it got changed only in SSE2 table
Whereas this change is also applicable for SSE4.1, so the cost of truncate needs
to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE
v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from
SSE4.1, so it will fall back to SSE2.
Reviewers: Simon Pilgrim
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267123 91177308-0d34-0410-b5e6-96231b3b80d8
Specifically, itineraries for LEON processors has been added, along with several LEON processor Subtargets. Although currently all these targets are pretty much identical, support for features that will differ among these processors will be added in the very near future.
The different Instruction Itinerary Classes (IICs) added are sufficient to differentiate between the instruction timings used by LEON and, quite probably, by generic Sparc processors too, but the focus of the exercise has been for LEON processors, as the requirement of my project. If the IICs are not sufficient for other Sparc processor types and you want to add a new itinerary for one of those, it should be relatively trivial to adapt this.
As none of the LEON processors has Quad Floats, or is a Version 9 processor, none of those instructions have itinerary classes defined and revert to the default "NoItinerary" instruction itinerary.
Phabricator Review: http://reviews.llvm.org/D19359
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267121 91177308-0d34-0410-b5e6-96231b3b80d8
We assumed that flags were only present on binary operators. This is
not true, they may also be present on calls and fcmps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267113 91177308-0d34-0410-b5e6-96231b3b80d8
EarlyCSE had inconsistent behavior with regards to flag'd instructions:
- In some cases, it would pessimize if the available instruction had
different flags by not performing CSE.
- In other cases, it would miscompile if it replaced an instruction
which had no flags with an instruction which has flags.
Fix this by being more consistent with our flag handling by utilizing
andIRFlags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267111 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also adds a small comment blurb on control flow + no-wrap flags, since
that question came up a few days back on llvm-dev.
Reviewers: bjarke.roune, broune
Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D19209
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267110 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.
Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.
This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267102 91177308-0d34-0410-b5e6-96231b3b80d8
Re-layer the functions in the new (i.e., newly correct) post-order
traversals in ValueEnumerator (r266947) and ValueMapper (r266949).
Instead of adding a node to the worklist in a helper function and
returning a flag to say what happened, return the node itself. This
makes the code way cleaner: the worklist is local to the main function,
there is no flag for an early loop exit (since we can cleanly bury the
loop), and it's perfectly clear when pointers into the worklist might be
invalidated.
I'm fixing both algorithms in the same commit to avoid repeating the
commit message; if you take the time to understand one the other should
be easy. The diff itself isn't entirely obvious since the traversals
have some noise (i.e., things to do), but here's the high-level change:
auto helper = [&WL](T *Op) { auto helper = [](T **&I, T **E) {
=> while (I != E) {
if (shouldVisit(Op)) { T *Op = *I++;
WL.push(Op, Op->begin()); if (shouldVisit(Op)) {
return true; return Op;
} }
return false; return nullptr;
}; };
=>
WL.push(S, S->begin()); WL.push(S, S->begin());
while (!empty()) { while (!empty()) {
auto *N = WL.top().N; auto *N = WL.top().N;
auto *&I = WL.top().I; auto *&I = WL.top().I;
bool DidChange = false;
while (I != N->end())
if (helper(*I++)) { => if (T *Op = helper(I, N->end()) {
DidChange = true; WL.push(Op, Op->begin());
break; continue;
} }
if (DidChange)
continue;
POT.push(WL.pop()); => POT.push(WL.pop());
} }
Thanks to Mehdi for helping me find a better way to layer this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267099 91177308-0d34-0410-b5e6-96231b3b80d8
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267098 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the interfaces added (and not yet complete) to support
lazy reading of summaries. This support is not expected to be needed
since we are moving to a model where the full index is only being
traversed in the thin link step, instead of the back ends.
(The second part of this that I plan to do next is remove the
GlobalValueInfo from the ModuleSummaryIndex - it was mostly needed to
support lazy parsing of summaries. The index can instead reference the
summary structures directly.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267097 91177308-0d34-0410-b5e6-96231b3b80d8
This test used to write a .s file until r266971 fixed that. But on most bots,
the .s file still exists. Add an rm statement to clean up the bots. In a few
days, this statement can go away again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267095 91177308-0d34-0410-b5e6-96231b3b80d8
This was meant to be part of SVN r267080. cbz cannot use a high register, which
would be silently truncated. This has now been fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267092 91177308-0d34-0410-b5e6-96231b3b80d8
WIN__DBZCHK will insert a CBZ instruction into the stream. This instruction
reserves 3 bits for the condition register (rn). As such, we must ensure that
we restrict the register to a low register. Use the tGPR class instead of GPR
to ensure that this is properly constrained. In debug builds, we would attempt
to use lr as a condition register which would silently get truncated with no
hint that the register selection was incorrect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267080 91177308-0d34-0410-b5e6-96231b3b80d8
We'd disabled them on x86 because back in the early days some host tools
couldn't handle the new load commands. This no longer holds: anyone capable of
deploying Clang should be able to deploy its copies of ar/ranlib/etc.
rdar://25254790
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267075 91177308-0d34-0410-b5e6-96231b3b80d8
When printing the properties required by a pass, only print the
properties that are set, and not those that are clear (only properties
that are set are verified, clear properties are "don't-care").
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267070 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Adds an instrumentation pass for the new EfficiencySanitizer ("esan")
performance tuning family of tools. Multiple tools will be supported
within the same framework. Preliminary support for a cache fragmentation
tool is included here.
The shared instrumentation includes:
+ Turn mem{set,cpy,move} instrinsics into library calls.
+ Slowpath instrumentation of loads and stores via callouts to
the runtime library.
+ Fastpath instrumentation will be per-tool.
+ Which memory accesses to ignore will be per-tool.
Reviewers: eugenis, vitalybuka, aizatsky, filcab
Subscribers: filcab, vkalintiris, pcc, silvas, llvm-commits, zhaoqin, kcc
Differential Revision: http://reviews.llvm.org/D19167
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267058 91177308-0d34-0410-b5e6-96231b3b80d8