Summary:
As per Duncan's review for D12536, I extracted the sub-byte, bit-aligned
reading and writing code into lib/Support and generalized it, added calls
from BackpatchWord, and added unit tests.
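For orientation, here is a minimal sketch of what sub-byte, bit-aligned writing
involves. This is purely illustrative and not the API added in lib/Support;
the helper name is hypothetical.

#include <cstdint>

// Hypothetical illustration: write the low NumBits of Value into Buffer
// starting at absolute bit offset StartBit, preserving the surrounding bits.
static void writeBitsLE(uint8_t *Buffer, uint64_t StartBit, unsigned NumBits,
                        uint64_t Value) {
  for (unsigned I = 0; I != NumBits; ++I) {
    uint64_t Bit = StartBit + I;
    uint8_t Mask = uint8_t(1u << (Bit % 8));
    if ((Value >> I) & 1)
      Buffer[Bit / 8] |= Mask;
    else
      Buffer[Bit / 8] &= uint8_t(~Mask);
  }
}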
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248897 91177308-0d34-0410-b5e6-96231b3b80d8
Add support to the indexed instrprof reader and writer for the format
that will be used for value profiling.
Patch by Betul Buyukkurt, with minor modifications.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248833 91177308-0d34-0410-b5e6-96231b3b80d8
The HHVM calling convention, hhvmcc, is used by the HHVM JIT for
functions in the translation cache. We currently support the LLVM back
end generating code for x86-64 and may support other architectures in the
future.
In the HHVM calling convention any GP register can be used to pass and
return values, with the exception of R12, which is reserved for the
thread-local area and is callee-saved. Other than R12, we always
pass RBX and RBP as arguments; these are our virtual machine's stack
pointer and frame pointer, respectively.
When we enter the translation cache via an hhvmcc function, we expect
the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes relative to
standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.
One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to the standard C calling
convention, with the exception of the first argument, which is passed in
RBP (before we use RDI, RSI, etc.).
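As a rough sketch of how a front end would use the new conventions (assuming
the IR-level enumerators introduced here are CallingConv::HHVM and
CallingConv::HHVM_C):

#include "llvm/IR/CallingConv.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Sketch: tag a translation-cache entry point and a call to a C++ helper
// with the two new calling conventions.
void tagForHHVM(Function &TCEntry, CallInst &HelperCall) {
  TCEntry.setCallingConv(CallingConv::HHVM);      // translated-code convention
  HelperCall.setCallingConv(CallingConv::HHVM_C); // C++ helper convention
}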
Differential Revision: http://reviews.llvm.org/D12681
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248832 91177308-0d34-0410-b5e6-96231b3b80d8
BranchProbability is currently represented by its numerator and denominator, both of type uint32_t. This patch changes this representation into a fixed-point number, represented by a uint32_t numerator and a constant denominator of 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons to this change:
Pros:
1. It uses only half the space of the current representation.
2. Some operations are much faster, such as addition, subtraction, comparison, and scaling by an integer.
Cons:
1. Constructing a probability from an arbitrary numerator and denominator requires additional calculations.
2. It is a little less precise than before because we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 2/3 (this leads to many BranchProbability unit test failures). This should not matter when we only use it for branch probabilities; if we use it as a rational value for precise calculations we may need another construct such as ValueRatio.
One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probabilities instead of weights when adding successors to an MBB. The current BranchProbability uses more space, which may be a concern.
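A minimal sketch of the fixed-point scheme described above (illustrative
only; the real class has more rounding and overflow handling):

#include <cstdint>

// Illustrative fixed-point probability: the stored value means N / 2^31.
struct FixedPointProb {
  uint32_t N; // numerator; the denominator is the constant 2^31

  // Constructing from an arbitrary Num/Denom needs an extra scaling step.
  static FixedPointProb get(uint32_t Num, uint32_t Denom) {
    const uint64_t D = uint64_t(1) << 31;
    return {uint32_t(uint64_t(Num) * D / Denom)};
  }

  // Addition, subtraction, and comparison reduce to plain 32-bit integer ops.
  bool operator<(FixedPointProb RHS) const { return N < RHS.N; }
};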
Differential revision: http://reviews.llvm.org/D12603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248633 91177308-0d34-0410-b5e6-96231b3b80d8
and assert when the mask is too large to apply in the small case;
previously the extra words were silently ignored.
Also clang-format the entire function to match current code standards.
This is a rewrite of r247972, which was reverted in r247983 due to
a warning and possible UB on 32-bit hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247993 91177308-0d34-0410-b5e6-96231b3b80d8
We can't apply two 32-bit words of mask in the small case
where the internal storage is just one 32-bit word.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247974 91177308-0d34-0410-b5e6-96231b3b80d8
Extend the mask value to 64 bits before taking its complement, and assert when the mask is
too large to apply in the small case (previously the extra words were silently ignored).
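The 32-bit pitfall being fixed looks roughly like this (an illustrative
sketch, not the actual APInt code):

#include <cstdint>

// Complementing a 32-bit mask and only then widening to 64 bits drops the
// high bits; widening first keeps them set, which is what the masked
// operation expects.
void maskComplementExample(uint32_t Mask /* e.g. 0x0000ffff */) {
  uint64_t Narrow = uint64_t(~Mask); // high 32 bits are zero
  uint64_t Wide = ~uint64_t(Mask);   // high 32 bits are one
  (void)Narrow;
  (void)Wide;
}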
http://reviews.llvm.org/D11890
Patch by James Touton!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247972 91177308-0d34-0410-b5e6-96231b3b80d8
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).
Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.
This reverts commit 236160.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247585 91177308-0d34-0410-b5e6-96231b3b80d8
with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).
The spec claimed both to split up to MaxSplit times and to append at most
MaxSplit strings to the vector; these can't both hold. Given the name
"MaxSplit", let's go with it being a maximum on how many *splits* occur,
which means the maximum number of strings appended is MaxSplit+1. I'm not
actually sure the implementation correctly provided this logic either, as
it used a really opaque loop structure.
The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.
Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.
I'll start using this in code in the next commit.
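Under the clarified contract (at most MaxSplit splits, so at most MaxSplit+1
appended strings), usage looks like this:

#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"

using namespace llvm;

void splitExample() {
  SmallVector<StringRef, 4> Parts;
  StringRef("a,b,c,d").split(Parts, ",", /*MaxSplit=*/2);
  // At most 2 splits occur, so at most 3 strings are appended:
  // Parts == {"a", "b", "c,d"}
}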
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247249 91177308-0d34-0410-b5e6-96231b3b80d8
splits to actually use the single-character split routine, which does
less work and, in a debug build, is *substantially* faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247245 91177308-0d34-0410-b5e6-96231b3b80d8
on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single-character StringRef
-- we immediately get to a *very* tuned memchr call this way.
Even nicer, we get this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization and helping PR23676 some more.
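For example, preferring the char-based StringRef overloads (find(char) and
the pair-returning split(char), both of which exist on StringRef) lets the
search bottom out in memchr:

#include "llvm/ADT/StringRef.h"
#include <utility>

using namespace llvm;

void charSearchExample(StringRef S) {
  // Both take a plain char rather than a one-character StringRef, so the
  // search can go straight to memchr.
  size_t Pos = S.find('-');
  std::pair<StringRef, StringRef> Halves = S.split('-');
  (void)Pos;
  (void)Halves;
}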
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247244 91177308-0d34-0410-b5e6-96231b3b80d8
The purpose is to allow a templated wrapper to work with either an
ArrayRef or anything convertible to one:
template <typename Container>
void wrapper(const Container &Arr) {
  impl(makeArrayRef(Arr));
}
with Container being a std::vector, a SmallVector, or an ArrayRef.
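For instance, all three of these calls can now go through the same wrapper
(a small usage sketch; impl is a stand-in name):

#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include <vector>

using namespace llvm;

void impl(ArrayRef<int> Arr) { (void)Arr; }

template <typename Container> void wrapper(const Container &Arr) {
  impl(makeArrayRef(Arr));
}

void callers(const std::vector<int> &V, const SmallVector<int, 4> &SV,
             ArrayRef<int> AR) {
  wrapper(V);  // std::vector
  wrapper(SV); // SmallVector
  wrapper(AR); // ArrayRef itself, via the identity makeArrayRef overload
}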
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247214 91177308-0d34-0410-b5e6-96231b3b80d8
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal; this logic should be
hoisted into FunctionAAResults, as it currently causes a significant
amount of redundant work, but it faithfully models the behavior of the
prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule is LoopPass instances, which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
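Concretely, a CFG-preserving pass that also keeps SCEV alive would now spell
out its preservation roughly like this (a sketch against the legacy pass
manager, using the wrapper-pass names introduced here; the free function is
just illustrative):

#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"
#include "llvm/Pass.h"

using namespace llvm;

// Sketch: preserve the non-immutable AA providers directly instead of the
// old blanket "preserve AliasAnalysis".
void declarePreservedAAs(AnalysisUsage &AU) {
  AU.addPreserved<BasicAAWrapperPass>();
  AU.addPreserved<GlobalsAAWrapperPass>();
  AU.addPreserved<SCEVAAWrapperPass>();
}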
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loopholes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247167 91177308-0d34-0410-b5e6-96231b3b80d8
This makes RemoveDuplicatePHINodes more effective and fixes an assertion
failure. Triggering the assertion requires a DenseSet reallocation,
so this change only contains a constructive test.
I'll explain the issue with a small example. In the following function
there's a duplicate PHI: %4 and %5 are identical. When this is found,
the DenseSet in RemoveDuplicatePHINodes contains %2, %3, and %4.
define void @F() {
br label %1
; <label>:1 ; preds = %1, %0
%2 = phi i32 [ 42, %0 ], [ %4, %1 ]
%3 = phi i32 [ 42, %0 ], [ %5, %1 ]
%4 = phi i32 [ 42, %0 ], [ 23, %1 ]
%5 = phi i32 [ 42, %0 ], [ 23, %1 ]
br label %1
}
After the duplicate (%5) is removed, the function looks like this. %3 has
changed and is now identical to %2, but RemoveDuplicatePHINodes never
saw this.
define void @F() {
br label %1
; <label>:1 ; preds = %1, %0
%2 = phi i32 [ 42, %0 ], [ %4, %1 ]
%3 = phi i32 [ 42, %0 ], [ %4, %1 ]
%4 = phi i32 [ 42, %0 ], [ 23, %1 ]
br label %1
}
If the DenseSet does a reallocation now, it will reinsert all
keys and stumble over %3, which now has a different hash value than it had
when it was first inserted into the set. This change clears the
set whenever a PHI is deleted and starts the scan from the
beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet
state. This potentially has a negative performance impact because
it rescans all PHIs, but I don't think that this ever makes a difference
in practice.
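In outline, the fixed loop looks something like the following (a simplified
sketch, not the exact code; PHIValueDenseMapInfo here is a stripped-down
stand-in for the real value-based DenseMapInfo):

#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/Hashing.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Hash/compare PHIs by their contents rather than by pointer identity.
struct PHIValueDenseMapInfo {
  static PHINode *getEmptyKey() {
    return DenseMapInfo<PHINode *>::getEmptyKey();
  }
  static PHINode *getTombstoneKey() {
    return DenseMapInfo<PHINode *>::getTombstoneKey();
  }
  static unsigned getHashValue(PHINode *PN) {
    return static_cast<unsigned>(
        hash_combine_range(PN->value_op_begin(), PN->value_op_end()));
  }
  static bool isEqual(PHINode *LHS, PHINode *RHS) {
    if (LHS == getEmptyKey() || LHS == getTombstoneKey() ||
        RHS == getEmptyKey() || RHS == getTombstoneKey())
      return LHS == RHS;
    return LHS->isIdenticalTo(RHS);
  }
};

// Whenever a duplicate PHI is folded away, drop the set and rescan from the
// top so PHIs whose operands (and therefore hashes) just changed get
// re-examined.
bool removeDuplicatePHINodesSketch(BasicBlock *BB) {
  bool Changed = false, Restart;
  do {
    Restart = false;
    DenseSet<PHINode *, PHIValueDenseMapInfo> Seen;
    for (auto I = BB->begin(); PHINode *PN = dyn_cast<PHINode>(&*I); ++I) {
      auto Ins = Seen.insert(PN);
      if (!Ins.second) {
        PN->replaceAllUsesWith(*Ins.first);
        PN->eraseFromParent();
        Changed = Restart = true;
        break;
      }
    }
  } while (Restart);
  return Changed;
}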
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246694 91177308-0d34-0410-b5e6-96231b3b80d8
We only looked through casts when one operand was a constant. We can also look through casts when both operands are non-constant, provided they are the same kind of cast from the same source type. For example:
%1 = icmp ult i8 %a, %b
%2 = zext i8 %a to i32
%3 = zext i8 %b to i32
%4 = select i1 %1, i32 %2, i32 %3
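At the C++ source level this corresponds to an idiom like the following (an
illustrative example, not code from the patch), where the cast can be looked
through and applied once to the select's result:

#include <cstdint>

// Both select operands are zero-extensions of the compared 8-bit values, so
// the select can operate on the narrow values with a single zext applied to
// its result.
uint32_t selectOverCasts(uint8_t a, uint8_t b) {
  return a < b ? uint32_t(a) : uint32_t(b);
}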
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246678 91177308-0d34-0410-b5e6-96231b3b80d8
of its strings when expanding the string literals from the macros, and
push all of the APIs to be StringRef instead of C-string APIs.
This (remarkably) removes a very non-trivial number of strlen calls. It
even deletes code and complexity from one of the primary users -- Clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246374 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes PR24621 and matches what we do for `DILocation`. Although
the limit seems somewhat artificial, there are places in the backend
that also assume 16-bit columns, so we may as well just be consistent
about the limits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246349 91177308-0d34-0410-b5e6-96231b3b80d8
Add `Function::setSubprogram()` and `Function::getSubprogram()`,
convenience methods to forward to `setMetadata()` and `getMetadata()`,
respectively, and deal in `DISubprogram` instead of `MDNode`.
Also add a verifier check to enforce that `!dbg` attachments are always
subprograms.
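Usage is then simply (a small sketch):

#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Sketch: attach a subprogram to a function and read it back; both calls are
// thin wrappers over the generic metadata attachment interface.
void attachSubprogram(Function &F, DISubprogram *SP) {
  F.setSubprogram(SP);
  DISubprogram *Attached = F.getSubprogram();
  (void)Attached;
}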
Originally (when I had the llvm-dev discussion back in April) I thought
I'd store a pointer directly on `llvm::Function` for these attachments
-- we frequently have debug info, and that's much cheaper than using a map
in the context if there are no other function-level attachments -- but
for now I'm just using the generic infrastructure. Let's add the extra
complexity only if this shows up in a profile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246339 91177308-0d34-0410-b5e6-96231b3b80d8
This commit extends the 'SlotMapping' structure and includes mappings for named
and numbered types in it. The LLParser is extended accordingly to fill out
those mappings at the end of module parsing.
This information is useful when we want to parse standalone constant values
at a later stage using the 'parseConstantValue' method. The constant values
can be constant expressions, which can contain references to types. In order
to parse such constant values, we have to restore the internal named and
numbered type mappings in LLParser; otherwise the parser will report
a parsing error. Therefore, this commit also introduces a new method called
'restoreParsingState' to LLParser, which uses the slot mappings to restore
some of its internal parsing state.
This commit is required to serialize constant value pointers in the machine
memory operands for the MIR format.
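The intended flow is roughly the following. This is a sketch; it assumes the
parseAssemblyString and parseConstantValue entry points accept a SlotMapping,
which is how the MIR parser is expected to use them.

#include "llvm/AsmParser/Parser.h"
#include "llvm/AsmParser/SlotMapping.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/SourceMgr.h"
#include <memory>

using namespace llvm;

// Sketch: parse the module once, capturing the slot/type mappings, then reuse
// them to parse a standalone constant that may reference named/numbered types.
Constant *parseStandaloneConstant(StringRef ModuleAsm, StringRef ConstantAsm,
                                  LLVMContext &Ctx, SMDiagnostic &Err) {
  SlotMapping Slots;
  std::unique_ptr<Module> M = parseAssemblyString(ModuleAsm, Err, Ctx, &Slots);
  if (!M)
    return nullptr;
  return parseConstantValue(ConstantAsm, Err, *M, &Slots);
}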
Reviewers: Duncan P. N. Exon Smith
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245740 91177308-0d34-0410-b5e6-96231b3b80d8
This is something like nullopt in std::experimental::optional. Optional
could already be constructed from None, so this seems like an obvious
extension from there.
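For reference, the behavior this builds on (Optional already being
constructible from None) looks like:

#include "llvm/ADT/Optional.h"

using namespace llvm;

// Optional<T> can be constructed from None to produce the empty state.
Optional<int> maybeValue(bool HasValue, int V) {
  if (!HasValue)
    return None;
  return V;
}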
I have a use in a future patch for Clang, though it may not go that
way/end up used - so this seemed worth committing now regardless.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245518 91177308-0d34-0410-b5e6-96231b3b80d8
folding the code into the main Analysis library.
There already wasn't much of a distinction between Analysis and IPA.
A number of the passes in Analysis are actually IPA passes, and there
doesn't seem to be any advantage to separating them.
Moreover, it makes it hard to have interactions between analyses that
are both local and interprocedural. In trying to make the Alias Analysis
infrastructure work with the new pass manager, it becomes particularly
awkward to navigate this split.
I've tried to find all the places where we referenced this, but I may
have missed some. I have also adjusted the C API to continue to be
equivalently functional after this change.
Differential Revision: http://reviews.llvm.org/D12075
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245318 91177308-0d34-0410-b5e6-96231b3b80d8
It was previously asserting in Visual C++ debug mode on a null
iterator passed to std::equal.
Test by Hans Wennborg!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245270 91177308-0d34-0410-b5e6-96231b3b80d8
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.
I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.
But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK, as everything in SCEV that can do so uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't really called during normal runs of the pipeline as
far as I can see.
To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.
To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.
With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.
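With SE as a plain object, each run simply constructs a fresh instance,
conceptually like this (a sketch; the exact constructor parameter list and
order is an assumption based on the analyses SCEV consumes):

#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Sketch: a fresh ScalarEvolution per run means stale cache entries from a
// previously analyzed function simply cannot survive into the next one.
ScalarEvolution buildSE(Function &F, TargetLibraryInfo &TLI,
                        AssumptionCache &AC, DominatorTree &DT, LoopInfo &LI) {
  return ScalarEvolution(F, TLI, AC, DT, LI);
}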
Differential Revision: http://reviews.llvm.org/D12063
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245193 91177308-0d34-0410-b5e6-96231b3b80d8
This causes the other special members (like move and copy construction,
and move assignment) to come through for free. Some code in clang was
depending on the (deprecated, in the original code) copy ctor. Now that
there are no user-defined special members, they're all available without
any deprecation concerns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244835 91177308-0d34-0410-b5e6-96231b3b80d8
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
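The C-level difference being modeled:

#include <cmath>

// With b = NaN: fmin(a, b) returns a (the non-NaN operand), while the
// idiomatic form "a < b ? a : b" evaluates a < NaN as false and therefore
// returns b, i.e. the NaN.
double libmMin(double a, double b) { return std::fmin(a, b); }
double selectMin(double a, double b) { return a < b ? a : b; }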
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244580 91177308-0d34-0410-b5e6-96231b3b80d8
'llvm::TrailingObjects<`anonymous-namespace'::Class1,short,llvm::NoTrailingTypeArg>::additionalSizeToAlloc' :
cannot access protected member declared in class
'llvm::TrailingObjects<`anonymous-namespace'::Class1,short,llvm::NoTrailingTypeArg>'
I'm not sure how this compiles with gcc.
Aren't protected members accessible only with protected or public inheritance?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244199 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.
I'll reformat and freshen comments on the interface now that it's located
in the right place so that the substantive changes don't trigger this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244197 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to help support the idiom of a class that has some
other objects (or multiple arrays of different types of objects)
appended on the end, which is used quite heavily in clang.
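A typical use looks roughly like this (a sketch based on the TrailingObjects
interface; the class and helper names are made up for illustration):

#include "llvm/Support/TrailingObjects.h"
#include <new>

using namespace llvm;

// A class with a variable number of ints allocated immediately after it.
class VarInts final : private TrailingObjects<VarInts, int> {
  friend TrailingObjects;
  unsigned NumInts;

  VarInts(unsigned N) : NumInts(N) {}

public:
  static VarInts *create(unsigned N) {
    void *Mem = ::operator new(totalSizeToAlloc<int>(N));
    return new (Mem) VarInts(N);
  }
  int *ints() { return getTrailingObjects<int>(); }
  unsigned size() const { return NumInts; }
};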
Differential Revision: http://reviews.llvm.org/D11272
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244164 91177308-0d34-0410-b5e6-96231b3b80d8
For example, with mingw-w64-g++-4.8.1:
llvm/unittests/ADT/ArrayRefTest.cpp: In member function 'virtual void {anonymous}::ArrayRefTest_AllocatorCopy_Test::TestBody()':
llvm/unittests/ADT/ArrayRefTest.cpp:56:40: internal compiler error: in count_type_elements, at expr.c:5523
} Array3Src[] = {{"hello"}, {"world"}};
^
Please submit a full bug report,
with preprocessed source if appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244017 91177308-0d34-0410-b5e6-96231b3b80d8
std::copy does not work for non-trivially copyable classes when we're
copying into uninitialized memory.
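The distinction, in isolation:

#include <memory>
#include <string>
#include <vector>

// std::copy assigns to objects that must already exist; for raw memory and a
// non-trivially copyable type such as std::string that is undefined behavior.
// std::uninitialized_copy copy-constructs the elements in place instead.
void copyIntoRawStorage(const std::vector<std::string> &Src,
                        std::string *RawUninitialized) {
  std::uninitialized_copy(Src.begin(), Src.end(), RawUninitialized);
  // The caller owns the raw storage and must destroy the constructed
  // strings later.
}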
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243995 91177308-0d34-0410-b5e6-96231b3b80d8
Various value handles needed to be copy constructible and copy
assignable (mostly for their use in DenseMap). But to avoid an API that
might allow accidental slicing, make these members protected in the base
class and make the derived classes final (the special members become
implicitly public there, while further derived classes, which might be
sliced to the intermediate type, are disallowed).
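The shape of the pattern (an illustrative sketch with made-up class names):

// The base keeps its copy members protected so a base-typed copy (a slice)
// can't be made by accident; the final derived class gets implicitly public
// copy members that containers like DenseMap can use, and 'final' rules out
// further derived classes that could be sliced down to it.
class ValueHandleBaseLike {
protected:
  ValueHandleBaseLike() = default;
  ValueHandleBaseLike(const ValueHandleBaseLike &) = default;
  ValueHandleBaseLike &operator=(const ValueHandleBaseLike &) = default;
};

class WeakVHLike final : public ValueHandleBaseLike {
  // Implicitly generated copy construction/assignment are public here.
};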
It might be worth having a warning a bit like -Wnon-virtual-dtor that
catches public move/copy assignment operators and constructors in classes
with virtual functions (suppressible in the same way: by making them
protected in the base and making the derived classes final). It could be
fancier and only diagnose them when they're actually called, potentially.
Also allow a few default implementations where custom implementations
(especially with non-standard return types) were implemented.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243909 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a bug found while working on the bitcode reader. In
particular, the method BitstreamReader::AtEndOfStream doesn't always
behave correctly when processing a data streamer. The method
fillCurWord doesn't properly set CurWord/BitsInCurWord if the data
streamer was already at EOF but GetBytes had not yet set the
ObjectSize field of the streaming memory object.
This patch fixes the problem and adds a test showing that it has been
fixed.
Patch by Karl Schimpf.
Differential Revision: http://reviews.llvm.org/D11391
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243890 91177308-0d34-0410-b5e6-96231b3b80d8