If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.
This only provides a mechanism for specifying a single DWP file, which is
good if you're symbolizing a program with a single DWP file. If the
program is dynamically linked, though, you're likely to have a DWP for
each dynamic library, in which case this feature won't help (at least
as it's surfaced in llvm-symbolizer for now). In theory it could be
extended to specify a collection of DWP files that could all be
consulted for split CU hash resolution.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309498 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
AESE Vn, _
AESMC Vn, Vn
and
AESD Vn, _
AESIMC Vn, Vn
The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constrain the source and destination registers to be the same.
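As a rough illustration of the pairing rule described above (not the actual
AArch64 backend code), the following sketch checks whether two instructions
form a fusible pair. The Insn tuple and is_fusible_pair are hypothetical
names made up for this example.

```python
from collections import namedtuple

# Hypothetical, simplified instruction model: opcode, destination register,
# and the one source register that matters for the pairing check.
Insn = namedtuple("Insn", "opcode dst src")

# AESE must be followed by AESMC, AESD by AESIMC.
FUSION_PARTNER = {"AESE": "AESMC", "AESD": "AESIMC"}


def is_fusible_pair(first, second):
    """True if `second` is the matching AES(I)MC and reads and writes the
    register that `first` wrote (AESE Vn, _ ; AESMC Vn, Vn)."""
    return (
        FUSION_PARTNER.get(first.opcode) == second.opcode
        and second.dst == second.src == first.dst
    )
```

For example, is_fusible_pair(Insn("AESE", "v0", "v1"), Insn("AESMC", "v0", "v0"))
holds, while the same AESE followed by AESMC v2, v0 does not satisfy the
constraint.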
A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.
I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.
Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga
Reviewed By: t.p.northover, evandro
Subscribers: aemerson, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35299
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309495 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change gives a 0.25% speedup on execution time, a 0.82% improvement
in benchmark scores and a 0.20% increase in binary size on a Cortex-A53.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite and a range of proprietary suites.
Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin
Reviewed By: rengolin
Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35568
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309494 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than passing along most of the parameters, pass a reference to
the MCDWARFrameInfo instead. This makes it easier to pass additional
information about the frame to the checks. We need to keep the extra
constructor for the Key around to allow the construction of the null and
tombstone keys. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309493 91177308-0d34-0410-b5e6-96231b3b80d8
If the return column is different, we cannot coalesce the CIE across the
FDEs. Add that to the key calculation. This ensures that we emit a
separate CIE.
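A minimal sketch of the idea, assuming a much simplified frame description:
frames that share a key share a CIE, so adding the return column to the key
forces a separate CIE when it differs. The Frame fields and function names
below are hypothetical and do not mirror the real MC data structures.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for the per-frame information.
Frame = namedtuple("Frame", "personality lsda_encoding is_signal_frame return_column")


def cie_key(frame):
    """Key used to decide whether two FDEs can share one CIE."""
    return (
        frame.personality,
        frame.lsda_encoding,
        frame.is_signal_frame,
        frame.return_column,  # the newly added component
    )


def assign_cies(frames):
    """Return, for each frame, the index of the CIE it would use."""
    cies = {}
    return [cies.setdefault(cie_key(f), len(cies)) for f in frames]
```

With two frames that differ only in return_column, assign_cies yields two
distinct CIE indices; identical frames map to the same index.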
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309492 91177308-0d34-0410-b5e6-96231b3b80d8
Move test/Transforms/SimplifyCFG/disable-lookup-table.ll into test/Transforms/SimplifyCFG/X86/disable-lookup-table.ll to avoid a test failure when the X86 backend is not enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309487 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is in 2 parts:
1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.
2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match.
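The condition in part 2 can be pictured with plain integers. This is only a
sketch of the bit arithmetic, not the SelectionDAG code; the function name is
made up for the example.

```python
def demanded_only_in_source(demanded_mask, src_bits):
    """True if every demanded bit lies in the low `src_bits` bits, i.e. in
    the non-extended portion of an ANY_EXTEND from a `src_bits`-wide value."""
    return (demanded_mask >> src_bits) == 0
```

For instance, a 32-bit BT only looks at the low 5 bits of its bit index
(mask 31), so with an index any-extended from 8 bits,
demanded_only_in_source(31, 8) holds and the extend can be looked through. A
mask such as 0x1FF reaches bit 8 and does not qualify.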
Differential Revision: https://reviews.llvm.org/D35896
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309486 91177308-0d34-0410-b5e6-96231b3b80d8
Detect [/-][DU]NDEBUG in CMAKE_C_FLAGS* and pass them through to ocamlc.
This is necessary because their value might affect visibility of dump
functions in LLVM and ocamlc uses its own compiler and flags by default.
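To make the flag pattern concrete, here is a small sketch of the matching in
Python. The real detection lives in the CMake files; the regular expression
and the helper name below are assumptions for illustration only, and this
version deliberately matches the flag only as a whole whitespace-separated
token.

```python
import re

# Matches -DNDEBUG, -UNDEBUG, /DNDEBUG and /UNDEBUG as standalone tokens.
NDEBUG_FLAG = re.compile(r"(?:^|\s)([/-][DU]NDEBUG)(?=\s|$)")


def ndebug_flags(c_flags):
    """Return the NDEBUG define/undefine switches found in a flags string."""
    return NDEBUG_FLAG.findall(c_flags)
```

ndebug_flags("-O2 -DNDEBUG -g") returns ["-DNDEBUG"], and a flag such as
-DNDEBUG_EXTRA is not picked up.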
Differential Revision: https://reviews.llvm.org/D35898
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309483 91177308-0d34-0410-b5e6-96231b3b80d8
Install the OCaml dynamic libraries in the 'stubdirs' directory rather
than the llvm subdirectory in order to fix running executables created
by ocamlc. Otherwise, the executables fail to run because they cannot locate
the libraries (unless the LLVM directory is explicitly added to
LD_LIBRARY_PATH).
The staging directories are not altered since they work for our
development setup anyway, and installing into two directories would
unnecessarily make the code more complex.
Differential Revision: https://reviews.llvm.org/D35995
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309481 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that SamplePGOSupport is part of PGOOpt, there are several places that need tweaking:
1. AddDiscriminator pass should *not* be invoked at ThinLTOBackend (as it's already invoked in the PreLink phase)
2. addPGOInstrPasses should only be invoked when either ProfileGenFile or ProfileUseFile is non-empty.
3. SampleProfileLoaderPass should only be invoked when SampleProfileFile is non-empty.
4. PGOIndirectCallPromotion should only be invoked in ProfileUse phase, or in ThinLTOBackend of SamplePGO.
Reviewers: chandlerc, tejohnson, davidxl
Reviewed By: chandlerc
Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36040
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309478 91177308-0d34-0410-b5e6-96231b3b80d8
This commit
- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively
This accomplishes a couple of things.
Firstly, we don't need the notion of *tail call* in the general outlining algorithm.
Secondly, we can now have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that the sequence lives in.
Ultimately, this should get us closer to being able to do something like, say, avoiding saving the link
register when outlining AArch64 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309475 91177308-0d34-0410-b5e6-96231b3b80d8
Traceback (most recent call last):
  File "llvm/utils/lit/tests/Inputs/shtest-format/external_shell/write-bad-encoding.py", line 5, in <module>
    sys.stdout.write(b"a line with bad encoding: \xc2.")
In Python 3, sys.stdout.write doesn't accept bytes, but sys.stdout.buffer.write does.
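A minimal sketch of the difference, using an in-memory text stream in place
of sys.stdout so it can be run anywhere. write_raw is a hypothetical helper
name, not something from the lit test itself.

```python
import io


def write_raw(stream, data):
    """Write bytes through a text stream's underlying binary buffer."""
    stream.flush()              # keep ordering with earlier text writes
    stream.buffer.write(data)   # the binary layer accepts bytes
    stream.buffer.flush()


# In-memory stand-in for sys.stdout (a text wrapper over a byte stream).
demo_raw = io.BytesIO()
demo = io.TextIOWrapper(demo_raw, encoding="utf-8", newline="\n")
demo.write("text first\n")
write_raw(demo, b"a line with bad encoding: \xc2.\n")
```

The same helper can be called as write_raw(sys.stdout, ...) after importing
sys; passing the bytes to demo.write directly raises TypeError.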
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309473 91177308-0d34-0410-b5e6-96231b3b80d8
Also refine the flat check to respect the flat-for-global feature.
The constant fallback should check global handling, not
specifically MUBUF.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309471 91177308-0d34-0410-b5e6-96231b3b80d8
This diff removes the second argument of the method MachOObjectFile::exports.
In all in-tree uses this argument is equal to "this" and
without this argument the interface seems to be cleaner.
Test plan: make check-all
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309462 91177308-0d34-0410-b5e6-96231b3b80d8
When I tried running the script, the ARM regex parser could not parse
my code. It failed because the .Lfunc_end line has a comment at the
end of it, so this commit removes the newline at the end of the regex.
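The failure mode can be reproduced with a toy pattern. The assembly snippet
and both regular expressions below are made up for illustration and are not
the ones used by the script.

```python
import re

# Hypothetical ARM output: the .Lfunc_end line carries a trailing comment.
ASM = "f:\n\tbx lr\n.Lfunc_end0: @ end of function\n"

# Requires a newline directly after the label, so the comment breaks it.
STRICT = re.compile(r"^(?P<func>\w+):\n(?P<body>.*?)^\.Lfunc_end\d+:\n", re.S | re.M)

# Same pattern without the trailing newline.
RELAXED = re.compile(r"^(?P<func>\w+):\n(?P<body>.*?)^\.Lfunc_end\d+:", re.S | re.M)

strict_match = STRICT.search(ASM)
relaxed_match = RELAXED.search(ASM)
```

Here strict_match is None, while relaxed_match captures the function name
"f" and its body.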
Patch by Joel Galenson!
Differential Revision: https://reviews.llvm.org/D35641
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309457 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This exposes LTO's Conf.SampleProfile as a command line option
(-lto-sample-profile-file) for testing via the llvm-lto2 utility.
Reviewers: pcc, danielcdh
Subscribers: mehdi_amini, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D36030
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309456 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The inlining threshold is increased by applying bonuses when the
callee has a single reachable basic block or is rich in vector
instructions. Similarly, the inlining cost is reduced by applying a large
bonus when the last call to a static function is considered for
inlining. This patch disables these bonuses when the callsite or the
callee is cold. The intention is to prevent a large cold callsite from
being inlined into a non-cold caller, which could in turn prevent that
caller from being inlined. This is especially important when the cold
callsite is the last call to a static function, since the associated
bonus is very high.
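The effect can be sketched as follows. This is an illustration of the
described behaviour only: the function name, the parameters, and all bonus
values are placeholders and are not taken from the inliner's actual cost
model.

```python
def apply_bonuses(threshold, cost, *, single_bb, vector_heavy,
                  last_call_to_static, cold,
                  single_bb_bonus=50, vector_bonus=150, static_bonus=15000):
    """Return (threshold, cost) after bonuses; the bonus values are
    placeholder numbers chosen for this example."""
    if cold:
        # Cold callsite or callee: no bonuses are applied.
        return threshold, cost
    if single_bb:
        threshold += single_bb_bonus
    if vector_heavy:
        threshold += vector_bonus
    if last_call_to_static:
        cost -= static_bonus
    return threshold, cost
```

With cold=True the inputs come back unchanged, so a large cold callee that
is the last call to a static function no longer looks artificially cheap.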
Reviewers: chandlerc, davidxl
Subscribers: danielcdh, llvm-commits
Differential Revision: https://reviews.llvm.org/D35823
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309441 91177308-0d34-0410-b5e6-96231b3b80d8
This should fix googletest-format test failures on the clang modules
buildbots, which have a stale copy of the OneTest script in the build
directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309432 91177308-0d34-0410-b5e6-96231b3b80d8
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.
rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309426 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
After some changes in the SLP vectorizer we missed some additional checks
that limit the instructions considered for vectorization. We should not
analyze an instruction if its parent is not the same as the parent of the
first instruction in the tree, or if it has already been analyzed.
Subscribers: mzolotukhin
Differential Revision: https://reviews.llvm.org/D34881
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309425 91177308-0d34-0410-b5e6-96231b3b80d8
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:
  BB#1: derived from LLVM BB %entry
      Live Ins: %EFLAGS
      Predecessors according to CFG: BB#0
          JE_1 <BB#5>, %EFLAGS<imp-use,kill>
          JMP_1 <BB#5>
  BB#5: derived from LLVM BB %sw.epilog
      Predecessors according to CFG: BB#1
          TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...
We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.
This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.
Fixes PR33980
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309422 91177308-0d34-0410-b5e6-96231b3b80d8
This allows handling of a lot more of the interesting
cases in Blender. Most of the large functions unlikely
to be inlined have this pattern.
This is a special case for what clang emits for OpenCL 3
element vectors. Annoyingly, these are emitted as
<3 x elt>* pointers, but accessed as <4 x elt>* operations.
This also needs to handle cases where a struct containing
a single vector is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309419 91177308-0d34-0410-b5e6-96231b3b80d8
It is better to return arguments directly in registers
if we are making a call rather than introducing expensive
stack usage. In a sample compile of one of
Blender's many kernel variants, this fires on about
20 different functions. Future improvements may be to
recognize simple cases where the pointer is indexing a small
array. This also fails when the store to the out argument
is in a separate block from the return, which happens in
a few of the Blender functions. This should also probably
be using MemorySSA which might help with that.
I'm not sure this is correct as a FunctionPass, but
MemoryDependenceAnalysis seems to not work with
a ModulePass.
I'm also not sure where it should run. I think it should
run before DeadArgumentElimination, so maybe either
EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309416 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges.
But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur.
This patch adds a small amount of logic to handle such a case in getEdgeValueLocal().
This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of constant propagation causes longer live ranges and more spill code than necessary.
With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
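The value being propagated can be pictured with plain integers: on the edge
for a given case, the switch condition is known, so a zero-extension of it
is known too. The helper below is a hypothetical illustration of that
arithmetic, not the getEdgeValueLocal() logic itself.

```python
def zext_on_case_edge(case_value, src_bits):
    """Value of zext(cond) on the edge where cond == case_value, with cond
    being a `src_bits`-wide integer (negative values wrap to unsigned)."""
    return case_value & ((1 << src_bits) - 1)
```

For an 8-bit condition, the edge for case 7 gives a zero-extended value of 7,
and the edge for case -1 gives 255.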
Reviewers: wmi, dberlin, sanjoy
Reviewed By: sanjoy
Subscribers: davide, davidxl, llvm-commits
Differential Revision: https://reviews.llvm.org/D34822
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309415 91177308-0d34-0410-b5e6-96231b3b80d8