Summary:
Implements fastLowerArguments() to avoid the need to fall back on
SelectionDAG for 0-4 argument functions that don't do tricky things like
passing double in a pair of i32's.
This allows us to move all except one test to -fast-isel-abort=3. The
remaining one has function prototypes of the form 'i32 (i32, double, double)'
which requires floats to be passed in GPR's.
Reviewers: sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: https://reviews.llvm.org/D22680
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276982 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We were using reserved VGPRs for SGPR spilling and this was causing
some programs with a workgroup size of 1024 to use more than 64
registers, which is illegal.
Reviewers: arsenm, mareko, nhaehnle
Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22032
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276980 91177308-0d34-0410-b5e6-96231b3b80d8
This version has two fixes compared to the original:
* In Registry.h the template static members are instantiated before they are
used, as clang gives an error if you do it the other way around.
* The use of the Registry template in clang-tidy is updated in the same way as
has been done everywhere else.
Original commit message:
Currently the Registry class contains the vestiges of a previous attempt to
allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a
plugin would have its own copy of a registry and export it to be imported by
the tool that's loading the plugin. This only works if the plugin is entirely
self-contained with the only interface between the plugin and tool being the
registry, and in particular this conflicts with how IR pass plugins work.
This patch changes things so that instead the add_node function of the registry
is exported by the tool and then imported by the plugin, which solves this
problem and also means that instead of every plugin having to export every
registry they use instead LLVM only has to export the add_node functions. This
allows plugins that use a registry to work on Windows if
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276973 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
SI_ELSE is lowered into two parts:
s_or_saveexec_b64 dst, src (at the start of the basic block)
s_xor_b64 exec, exec, dst (at the end of the basic block)
The idea is that dst contains the exec mask of the preceding IF block. It can
happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside
the basic block that contains SI_ELSE, in which case it introduces an instruction
s_and_b64 exec, exec, s[...]
which masks out bits that can correspond to both the IF and the ELSE paths.
So the resulting sequence must be:
s_or_savexec_b64 dst, src
s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode
s_and_b64 dst, dst, exec <-- added by SILowerControlFlow
s_xor_b64 exec, exec, dst
Whether to add the additional s_and_b64 dst, dst, exec is currently determined
via the ExecModified tracking. With this change, it is instead determined by
an additional flag on SI_ELSE which is set by SIWholeQuadMode.
Finally: It also occured to me that an alternative approach for the long run
is for SILowerControlFlow to unconditionally emit
s_or_saveexec_b64 dst, src
...
s_and_b64 dst, dst, exec
s_xor_b64 exec, exec, dst
and have a pass that detects and cleans up the "redundant AND with exec"
pattern where possible. This could be useful anyway, because we also add
instructions
s_and_b64 vcc, exec, vcc
before s_cbranch_scc (in moveToALU), and those are often redundant. I have
some pending changes to how KILL is lowered that could also benefit from
such a cleanup pass.
In any case, this current patch could help in the short term with the whole
ExecModified business.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22846
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276972 91177308-0d34-0410-b5e6-96231b3b80d8
These loop from 0 to AEK_XSCALE, which is currently defined as 0x80000000, and
thus the tests loop over the entire int range, which is unreasonable
and also too slow in debug builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276969 91177308-0d34-0410-b5e6-96231b3b80d8
When folding an expression, we run ConstantFoldConstantExpression on
each operand of that expression.
However, ConstantFoldConstantExpression can fail and retur nullptr.
Previously, we would bail on further refining the expression.
Instead, use the original operand and see if we can refine a later
operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276959 91177308-0d34-0410-b5e6-96231b3b80d8
Add unittest to {ARM | AArch64}TargetParser,and by the way correct problems as below:
1.Correct a incorrect indexing problem in AArch64TargetParser. The architecture enumeration
is shared across ARM and AArch64 in original implementation.But In the code,I just used the
index which was offset by the ARM, and this would index into the array incorrectly. To make
AArch64 has its own arch enum,or we will do a lot of slowly iterating.
2.Correct a spelling error. The parameter of llvm::AArch64::getArchExtName.
3.Correct a writing mistake, in llvm::ARM::parseArchISA.
Differential Revision: https://reviews.llvm.org/D21785
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276957 91177308-0d34-0410-b5e6-96231b3b80d8
A function may have instructions annotated with debug info without
having a subprogram.
This fixes PR28747.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276956 91177308-0d34-0410-b5e6-96231b3b80d8
The EP_CGSCCOptimizerLate extension point allows adding CallGraphSCC
passes at the end of the main CallGraphSCC passes and before any
function simplification passes run by CGPassManager.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22897
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276953 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
getName() involves a hashtable lookup, so is expensive given how
frequently isIntrinsic() is called. (In particular, many users cast to
IntrinsicInstr or one of its subclasses before calling
getIntrinsicID().)
This has an incidental functional change: Before, isIntrinsic() would
return true for any function whose name started with "llvm.", even if it
wasn't properly an intrinsic. The new behavior seems more correct to
me, because it's strange to say that isIntrinsic() is true, but
getIntrinsicId() returns "not an intrinsic".
Some callers want the old behavior -- they want to know whether the
caller is a recognized intrinsic, or might be one in some other version
of LLVM. For them, we added Function::hasLLVMReservedName(), which
checks whether the name starts with "llvm.".
This change is good for a 1.5% e2e speedup compiling a large Eigen
benchmark.
Reviewers: bogner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22065
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276942 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
LCSSAWrapperPass currently doesn't override verifyAnalysis method, so pass
manager doesn't verify LCSSA. This patch adds the method so that we start
verifying LCSSA between loop passes.
Reviewers: chandlerc, sanjoy, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276941 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Given the crash in D22878, this patch converts the load/store vectorizer
to use explicit Instruction*s wherever possible. This is an overall
simplification and should be an improvement in safety, as we have fewer
naked cast<>s, and now where we use Value*, we really mean something
different from Instruction*.
This patch also gets rid of some cast<>s around Value*s returned by
Builder. Given that Builder constant-folds everything, we can't assume
much about what we get out of it.
One downside of this patch is that we have to copy our chain before
calling propagateMetadata. But I don't think this is a big deal, as our
chains are very small (usually 2 or 4 elems).
Reviewers: asbirlea
Subscribers: mzolotukhin, llvm-commits, arsenm
Differential Revision: https://reviews.llvm.org/D22887
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276938 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This lets us avoid creating and destroying a CallbackVH every time we
check the cache.
This is good for a 2% e2e speedup when compiling one of the large Eigen
tests at -O3.
FTR, I tried making the ValueCache hashtable one-level -- i.e., mapping
a pair (Value*, BasicBlock*) to a lattice value, and that didn't seem to
provide any additional improvement. Saving a word in LVILatticeVal by
merging the Tag and Val fields also didn't yield a speedup.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21951
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276926 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-cov's `-dump' option now emits information which helps debug source
range highlighting in html mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276924 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When we ask the builder to create a bitcast on a constant, we get back a
constant, not an instruction.
Reviewers: asbirlea
Subscribers: jholewinski, mzolotukhin, llvm-commits, arsenm
Differential Revision: https://reviews.llvm.org/D22878
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276922 91177308-0d34-0410-b5e6-96231b3b80d8
Before adding a new preheader block, check if there is a candidate block
where the loop setup could be placed speculatively. This will be off by
default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276919 91177308-0d34-0410-b5e6-96231b3b80d8
Factor out countDuplicatedInstructions to Count duplicated instructions at the
beginning and end of a diamond pattern. This is in prep for adding support for
diamonds that need to be tail-merged.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276910 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the highlighting for lines without any coverage segments. I
don't have a neat way of testing this yet, but am working on it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276906 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the implicit conversion from MachineInstrBundleIterator to
MachineInstr*, leaving behind an explicit conversion.
I *think* this is the last ilist_iterator-related implicit conversion to
ilist_node subclass. If I'm right, I can finally dig in and fix the UB
in ilist that these conversions were relying on.
Note that the implicit users of this conversion have already been
removed. If you have out-of-tree code that doesn't update, you might be
able to buy some time by temporarily reverting this commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276902 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr*, mainly by preferring MachineInstr& over MachineInstr*.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276899 91177308-0d34-0410-b5e6-96231b3b80d8
Fix intel syntax special case identifier operands that refer to a constant
(e.g. .set <ID> n) to be interpreted as immediate not memory in parsing.
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22585
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276895 91177308-0d34-0410-b5e6-96231b3b80d8
The callee-saved registers that are saved in a function are not pristine,
and so they can be defined and used. In case of shrink-wrapping though,
there are blocks that are outside of the save/restore range, and in those
blocks the saved registers must be treated as pristine. To avoid any uses
of these registers, add them as live-in in all those blocks.
This was already done for blocks reaching function exits after restore,
add code that does the same for blocks reached from the function entry
before save.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276886 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add a pass to bugpoint to make it transform conditional jumps into unconditional jumps.
Often, bugpoint generates output that has large numbers of br undef jumps, where
one side is dead.
What is happening is two fold:
1. It never tries to just pick a direction for the jump, and just see what happens
<<<< this patch
2. SimplifyCFG no longer is a good match for bugpoint's usecase. It
does too much.
Even things in SimplifyCFG, like removeUnreachableBlocks, go to great
lengths to transform undefined behavior into blocks and kill large
parts of the CFG. This is great for regular code, not so much for
bugpoint, which often generates UB on purpose (store undef is a great
example).
<<<< a followup patch that is coming, to move simplifycfg into a
separate reduction pass, and move the existing reduceCrashingBlocks
pass to use simpleSimplifyCFG.
Both of these patches significantly reduce the size and complexity of bugpoint
generated testcases.
Reviewers: chandlerc, majnemer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22841
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276884 91177308-0d34-0410-b5e6-96231b3b80d8
TargetOptions wants the ExceptionHandling enum. Move that to
MCTargetOptions.h to avoid transitively including Dwarf.h everywhere in
clang. Now you can add a DWARF tag without a full rebuild of clang
semantic analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276883 91177308-0d34-0410-b5e6-96231b3b80d8