rather than the constructors of passes.
This simplifies the APIs of passes significantly and removes an error
prone pattern where the *same* manager had to be given to every
different layer. With the new API the analysis managers themselves will
have to be cross connected with proxy analyses that allow a pass at one
layer to query for the analysis manager of another layer. The proxy will
both expose a handle to the other layer's manager and provide the
invalidation hooks to ensure things remain consistent across layers.
Finally, the outer-most analysis manager has to be passed to the run
method of the outer-most pass manager. The rest of the propagation is
automatic.
I've used SFINAE again to allow passes to completely disregard the
analysis manager if they don't need or want to care. This helps keep
simple things simple for users of the new pass manager.
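
A minimal sketch of how SFINAE can let a pass ignore the analysis
manager. This is an illustration only, not the code in this commit:
Module, ModuleAnalysisManager, runPass and runPassImpl are hypothetical
stand-ins, and the overload ranking trick (int vs. long) is just one
common way to express the preference.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the IR unit and the analysis manager.
struct Module {};
struct ModuleAnalysisManager { int Queries = 0; };

// Preferred overload: only viable when Pass.run(M, AM) is well-formed.
template <typename PassT>
auto runPassImpl(PassT &Pass, Module &M, ModuleAnalysisManager *AM, int)
    -> decltype(Pass.run(M, AM)) {
  return Pass.run(M, AM);
}

// Fallback overload: used when the pass only provides run(M).
template <typename PassT>
auto runPassImpl(PassT &Pass, Module &M, ModuleAnalysisManager *, long)
    -> decltype(Pass.run(M)) {
  return Pass.run(M);
}

// Passing the literal 0 ranks the 'int' overload first; if substitution
// fails there, the 'long' overload is selected instead.
template <typename PassT>
bool runPass(PassT &Pass, Module &M, ModuleAnalysisManager *AM) {
  return runPassImpl(Pass, M, AM, 0);
}

// A pass that does not care about analyses at all.
struct SimplePass {
  bool run(Module &) { return false; }
};

// A pass that accepts the manager and tolerates a null pointer.
struct AnalysisUsingPass {
  bool run(Module &, ModuleAnalysisManager *AM) {
    if (AM)
      ++AM->Queries;
    return true;
  }
};
```

With this shape, runPass(P, M, &AM) works for both pass types, and
runPass(P, M, nullptr) models a pipeline that has no analysis manager.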
Also, the system specifically supports passing a null pointer into the
outer-most run method if your pass pipeline neither needs nor wants to
deal with analyses. I find this of dubious utility: while some
*passes* don't care about analyses, I'm not sure there are any
real-world users of the pass manager itself that need to avoid even
creating an analysis manager. But it is easy to support, so there we go.
Finally I renamed the module proxy for the function analysis manager to
the more verbose but less confusing name of
FunctionAnalysisManagerModuleProxy. I hate this name, but I have no idea
what else to name these things. I'm expecting in the fullness of time to
potentially have the complete cross product of types at the proxy layer:
{Module,SCC,Function,Loop,Region}AnalysisManager{Module,SCC,Function,Loop,Region}Proxy
(except for XAnalysisManagerXProxy which doesn't make any sense)
This should make it somewhat easier to do the next phases, which are to
build the upward proxy and get its invalidation correct, and to make
the invalidation within the Module -> Function mapping pass more fine
grained so as to invalidate fewer function analyses.
After all of the proxy analyses are done and the invalidation working,
I'll finally be able to start working on the next two fun fronts: how to
adapt an existing pass to work in both the legacy pass world and the new
one, and building the SCC, Loop, and Region counterparts. Fun times!
llvm-svn: 195400
Splitting a basic block will create a new ALU clause, so we need to make
sure we aren't moving uses of registers that are local to their
current clause into a new one.
I had a test case for this, but unfortunately unrelated schedule changes
invalidated it, and I haven't been able to come up with another one.
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195399
The legalizer can now do this type of expansion for more
type combinations without loading and storing to and
from the stack.
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195398
This patch is a rewrite of the original patch committed in r194542. Instead of
relying on the type legalizer to do the splitting for us, we now perform the
splitting ourselves in the DAG combiner. This is necessary for the case where
the vector mask is a legal type after promotion and still wouldn't require
splitting.
Patch by: Juergen Ributzka
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195397
section use the form DW_FORM_data4 whilst in Dwarf 4 and later they
use the form DW_FORM_sec_offset.
This patch updates the places where such attributes are generated to
use the appropriate form depending on the Dwarf version. The affected
attributes are:
DW_AT_stmt_list, DW_AT_ranges, DW_AT_location, DW_AT_GNU_pubnames,
DW_AT_GNU_pubtypes, DW_AT_GNU_addr_base, DW_AT_GNU_ranges_base
It also adds a hidden command line option "--dwarf-version=<uint>"
to llc which allows the version of Dwarf to be generated to override
what is specified in the metadata; this makes it possible to update
existing tests to check the debugging information generated for both
Dwarf 4 (the default) and Dwarf 3 using the same metadata.
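
A minimal sketch of the version-dependent form selection described
above. The two form encodings are the values from the DWARF standard;
the helper names sectionOffsetForm and effectiveDwarfVersion are
hypothetical, and treating an override of 0 as "not set" is an
assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Form encodings as defined by the DWARF standard.
enum : uint16_t { DW_FORM_data4 = 0x06, DW_FORM_sec_offset = 0x17 };

// Hypothetical helper: choose the form for a section-offset attribute.
uint16_t sectionOffsetForm(unsigned DwarfVersion) {
  return DwarfVersion >= 4 ? DW_FORM_sec_offset : DW_FORM_data4;
}

// Hypothetical helper: a command line override, when given, wins over
// the version recorded in the metadata. 0 is assumed to mean "not set".
unsigned effectiveDwarfVersion(unsigned MetadataVersion, unsigned Override) {
  return Override != 0 ? Override : MetadataVersion;
}
```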
Patch (slightly modified) by Keith Walker!
llvm-svn: 195391
AMD processors in the K7, K8, K10, K12, K15 and K16 families are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.
It might be beneficial to disable this folding for some Intel processors as well. However, since I couldn't find specific recommendations regarding SHLD/SHRD on Intel processors, I haven't disabled this peephole for Intel.
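
For reference, a small sketch of the value computed by the pattern being matched, written as plain C++ for the 64-bit case. The function name shldEquivalent is hypothetical, and the sketch only shows the shift/or expression, not the alternative sequence recommended by the optimization guides.

```cpp
#include <cassert>
#include <cstdint>

// Shift X left by C and fill the vacated low bits with the top C bits
// of Y. Only defined here for 0 < C < 64, since shifting a 64-bit value
// by 64 is undefined behavior in C++.
uint64_t shldEquivalent(uint64_t X, uint64_t Y, unsigned C) {
  assert(C > 0 && C < 64 && "shift amount out of range");
  return (X << C) | (Y >> (64 - C));
}
```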
llvm-svn: 195383
The new command line flags are -dfsan-ignore-pointer-label-on-store and -dfsan-ignore-pointer-label-on-load. Their default value matches the current labelling scheme.
Additionally, the function __dfsan_union_load is marked as readonly.
Patch by Lorenzo Martignoni!
Differential Revision: http://llvm-reviews.chandlerc.com/D2187
llvm-svn: 195382
- Allow overriding PACKAGE_VERSION from the command-line
- Use PACKAGE_VERSION to set CPACK_PACKAGE_VERSION (used by the Win installer)
- Don't include the version number in the CPack install dir or registry key.
Differential revision: http://llvm-reviews.chandlerc.com/D2245
llvm-svn: 195379
Mask == ~InvMask asserts if the width of Mask and InvMask differ.
The combine isn't valid (with two exceptions, see below) if the widths differ
so test for this before testing Mask == ~InvMask.
In the specific cases of Mask=~0 and InvMask=0, as well as Mask=0 and
InvMask=~0, the combine is still valid. However, there are more appropriate
combines that could be used in these cases such as folding x & 0 to 0, or
x & ~0 to x.
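
A minimal sketch of the ordering of the two checks. Bits is a
hypothetical stand-in for a fixed-width integer whose equality operator
asserts on mismatched widths; masksAreComplementary is likewise a
made-up name, not the function touched by this commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-width integer (1 to 64 bits).
struct Bits {
  unsigned Width;  // number of significant bits
  uint64_t Value;  // always stored truncated to Width bits

  Bits(unsigned W, uint64_t V) : Width(W), Value(V & mask(W)) {}

  static uint64_t mask(unsigned W) {
    return W == 64 ? ~0ULL : ((1ULL << W) - 1);
  }

  Bits operator~() const { return Bits(Width, ~Value); }

  // Comparing values of different widths is a programming error.
  bool operator==(const Bits &RHS) const {
    assert(Width == RHS.Width && "comparison requires equal bit widths");
    return Value == RHS.Value;
  }
};

// Test the widths first so the asserting comparison is never reached
// with mismatched operands.
bool masksAreComplementary(const Bits &Mask, const Bits &InvMask) {
  if (Mask.Width != InvMask.Width)
    return false;
  return Mask == ~InvMask;
}
```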
llvm-svn: 195364
Summary:
LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse
condition and requesting that the caller invert the result of the condition.
The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do
so as follows:
  SETCC, BR_CC:
    Invert the result of the SETCC with SelectionDAG::getNOT()
  SELECT_CC:
    Swap the true/false operands.
This is necessary for MSA which lacks an integer SETNE instruction.
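
The two caller-side strategies can be sketched on plain integers. This
is a toy model, not SelectionDAG code: the enum, legalizeCondCode,
evalLegal, lowerSetCC and lowerSelectCC are hypothetical names, and the
target is assumed to support only an equality compare.

```cpp
#include <cassert>
#include <utility>

enum CondCode { SETEQ, SETNE };

// Hypothetical target where only SETEQ is legal. Rewrites CC to the
// inverse condition and returns true if the caller must invert.
bool legalizeCondCode(CondCode &CC) {
  if (CC == SETNE) {
    CC = SETEQ;
    return true;
  }
  return false;
}

// The only comparison the toy target can evaluate directly.
bool evalLegal(CondCode CC, int LHS, int RHS) {
  assert(CC == SETEQ && "only SETEQ is legal on this toy target");
  return LHS == RHS;
}

// SETCC / BR_CC style: invert the result of the legal comparison.
bool lowerSetCC(CondCode CC, int LHS, int RHS) {
  bool NeedInvert = legalizeCondCode(CC);
  bool Result = evalLegal(CC, LHS, RHS);
  return NeedInvert ? !Result : Result;
}

// SELECT_CC style: swap the true/false operands instead of inverting.
int lowerSelectCC(CondCode CC, int LHS, int RHS, int TrueV, int FalseV) {
  if (legalizeCondCode(CC))
    std::swap(TrueV, FalseV);
  return evalLegal(CC, LHS, RHS) ? TrueV : FalseV;
}
```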
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2229
llvm-svn: 195355
It broke at least the i686 target. It is reproducible with "llc -mtriple=i686-unknown".
FYI, it didn't appear to add either "-O0" or "-fast-isel".
llvm-svn: 195339
it is completely optional, and sink the logic for handling the preserved
analysis set into it.
This allows us to implement the delegation logic desired in the proxy
module analysis for the function analysis manager: if the proxy itself
is preserved, we assume the set of functions hasn't changed and do a
fine grained invalidation by walking the functions in the module,
running the invalidate for each of them at the manager level, and
letting it try to invalidate any passes.
This in turn makes it blindingly obvious why we should hoist the
invalidate trait and have two collections of results. That allows
handling invalidation for almost all analyses without indirect calls and
it allows short circuiting when the preserved set is all.
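
A rough sketch of the delegation and the short circuit, using strings
in place of real analysis IDs and functions. Everything here is
hypothetical (PreservedSet, FunctionResultsCache, proxyInvalidate); it
only illustrates the control flow described above, not the actual data
structures.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical preserved-analysis set with an "everything" state.
class PreservedSet {
  bool All = false;
  std::set<std::string> Names;

public:
  static PreservedSet all() {
    PreservedSet P;
    P.All = true;
    return P;
  }
  static PreservedSet none() { return PreservedSet(); }
  void preserve(const std::string &Name) { Names.insert(Name); }
  bool areAllPreserved() const { return All; }
  bool preserved(const std::string &Name) const {
    return All || Names.count(Name) != 0;
  }
};

// Hypothetical cache: function name -> names of cached analysis results.
struct FunctionResultsCache {
  std::map<std::string, std::set<std::string> > Results;

  void invalidate(const std::string &Fn, const PreservedSet &PA) {
    if (PA.areAllPreserved())
      return; // short circuit: nothing can be stale
    auto It = Results.find(Fn);
    if (It == Results.end())
      return;
    for (auto I = It->second.begin(); I != It->second.end();) {
      if (PA.preserved(*I))
        ++I;
      else
        I = It->second.erase(I);
    }
  }
};

// If the proxy itself was not preserved, drop everything; otherwise
// walk the functions and invalidate each one individually.
void proxyInvalidate(FunctionResultsCache &Cache,
                     const std::vector<std::string> &Functions,
                     const PreservedSet &PA, bool ProxyPreserved) {
  if (!ProxyPreserved) {
    Cache.Results.clear();
    return;
  }
  for (const std::string &Fn : Functions)
    Cache.invalidate(Fn, PA);
}
```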
llvm-svn: 195338
type and detect whether or not it provides an 'invalidate' member the
analysis manager should use.
This lets the overwhelmingly common case of *not* caring about custom
behavior when an analysis is invalidated be the obvious default
behavior with no code written by the author of an analysis. Only when
they write code specifically to handle invalidation does it get used.
Both cases are actually covered by tests here. The test analysis uses
the default behavior, and the proxy module analysis actually has custom
behavior on invalidation that is firing correctly. (In fact, this is the
analysis which was the primary motivation for having custom invalidation
behavior in the first place.)
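
A minimal sketch of detecting an 'invalidate' member and falling back
to a default. The names HasInvalidate, shouldInvalidate, PlainResult
and CustomResult are hypothetical, and the default of "always
invalidate" is an assumption made for the illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Function {}; // hypothetical IR unit

// Detects whether ResultT has a member callable as invalidate(Function&).
template <typename ResultT> class HasInvalidate {
  template <typename T>
  static auto check(int)
      -> decltype(std::declval<T &>().invalidate(std::declval<Function &>()),
                  std::true_type());
  template <typename T> static std::false_type check(...);

public:
  static const bool value = decltype(check<ResultT>(0))::value;
};

// Custom behavior: ask the result whether it is now invalid.
template <typename ResultT>
bool invalidateImpl(ResultT &R, Function &F, std::true_type) {
  return R.invalidate(F);
}

// Default behavior (assumed here): the result is always invalidated.
template <typename ResultT>
bool invalidateImpl(ResultT &, Function &, std::false_type) {
  return true;
}

template <typename ResultT> bool shouldInvalidate(ResultT &R, Function &F) {
  return invalidateImpl(
      R, F, std::integral_constant<bool, HasInvalidate<ResultT>::value>());
}

// A result type whose author wrote no invalidation code.
struct PlainResult {};

// A result type that opts in and chooses to stay valid.
struct CustomResult {
  bool invalidate(Function &) { return false; }
};
```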
llvm-svn: 195332
clang optimizes tail calls, as in this example:
  int foo(void);
  int bar(void) {
    return foo();
  }
where the call is transformed to:
          calll   .L0$pb
  .L0$pb:
          popl    %eax
  .Ltmp0:
          addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
          movl    foo@GOT(%eax), %eax
          popl    %ebp
          jmpl    *%eax                   # TAILCALL
However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.
This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.
Patch by Dimitry Andric!
PR15086
llvm-svn: 195318
This proxy will fill the role of proxying invalidation events down IR
unit layers so that when a module changes we correctly invalidate
function analyses. Currently this is a very coarse solution -- any
change blows away the entire thing -- but the next step is to make
invalidation handling more nuanced so that we can propagate specific
amounts of invalidation from one layer to the next.
The test is extended to place a module pass between two function pass
managers, each of which has preserved function analyses that get
correctly invalidated by the module pass, which might have changed what
functions are even in the module.
llvm-svn: 195304
MappingTrait template specializations can now have a validate() method which
performs semantic checking. For details, see <http://llvm.org/docs/YamlIO.html>.
llvm-svn: 195286
The instruction definitions incorrectly specified that popcntd and popcntw have
record forms; they do not. This mistake was causing invalid code generation.
llvm-svn: 195272
We now only allow breaking source order if the exit block frequency is
significantly higher than that of the other exit block. The actual bias is
currently under a flag so the best cut-off can be found; the flag
defaults to the old behavior. The idea is to get some benchmark coverage
over different values for the flag and pick the best one.
When we require the new frequency to be at least 20% higher than the old
frequency I see a 5% speedup on zlib's deflate when compressing a random
file on x86_64/westmere. Hal reported a small speedup on Fhourstones on
a BG/Q and no regressions in the test suite.
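
A small sketch of what a percentage bias check of this kind could look
like. The function name and the BiasPercent parameter are hypothetical,
and the exact comparison used by the pass may differ; the sketch also
assumes the frequencies are small enough that multiplying by 100 does
not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if NewFreq is at least BiasPercent percent higher than
// OldFreq. Integer arithmetic avoids floating point rounding.
bool isSignificantlyHotter(uint64_t NewFreq, uint64_t OldFreq,
                           unsigned BiasPercent) {
  return NewFreq * 100 >= OldFreq * (100 + BiasPercent);
}
```

With a bias of 20, a candidate at frequency 120 beats an existing exit
at frequency 100, while one at 119 does not.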
The test case is the full long_match function from zlib's deflate. I was
reluctant to add it for previous tweaks to branch probabilities because
it's large and potentially fragile, but changed my mind since it's an
important use case and more likely to break with all the current work
going into the PGO infrastructure.
Differential Revision: http://llvm-reviews.chandlerc.com/D2202
llvm-svn: 195265
While not strictly necessary (the class has an invariant that
"setDebugInfoOffset" is called before "getDebugInfoOffset"; any client
that actually gets the default zero offset is buggy/broken), this is
consistent with the code as originally written, and the removal of the
initialization was an accident in r195166.
Suggested by Manman Ren.
llvm-svn: 195263