The analysis manager was made not optional and turned into a
reference instead of a pointer in r272978. Some comments were
still refering to the previous behavior.
llvm-svn: 277857
This prevents handles from being invalidated (through iterator invalidation)
when new modules are added.
No test-case yet: This bug was uncovered during work on an upcoming patch for
weak symbol support and the testcase for that feature will implicitly test for
correct behavior here.
llvm-svn: 277847
This differs from the previous version by being more careful about template
instantiation/specialization in order to prevent errors when building with
clang -Werror. Specifically:
* begin is not defined in the template and is instead instantiated when Head
is. I think the warning when we don't do that is wrong (PR28815) but for now
at least do it this way to avoid the warning.
* Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY
instead provide a template definition then do explicit instantiation. No
compiler I've tried has problems with doing it the other way, but strictly
speaking it's not permitted by the C++ standard so better safe than sorry.
Original commit message:
Currently the Registry class contains the vestiges of a previous attempt to
allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a
plugin would have its own copy of a registry and export it to be imported by
the tool that's loading the plugin. This only works if the plugin is entirely
self-contained with the only interface between the plugin and tool being the
registry, and in particular this conflicts with how IR pass plugins work.
This patch changes things so that instead the add_node function of the registry
is exported by the tool and then imported by the plugin, which solves this
problem and also means that instead of every plugin having to export every
registry they use instead LLVM only has to export the add_node functions. This
allows plugins that use a registry to work on Windows if
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.
llvm-svn: 277806
Add a generalized IRBuilderCallbackInserter, which is just given a
callback to execute after insertion. This can be used to get rid of
the custom inserter in InstCombine, which will in turn allow me to add
target specific InstCombineCalls API for intrinsics without horrible
layering violations.
llvm-svn: 277784
This is the forth patch in the coroutine series. CoroEaly pass now lowers coro.resume
and coro.destroy intrinsics by replacing them with an indirect call to an address
returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes
devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate
function address.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22998
llvm-svn: 277765
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.
Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
- x86 failing tests either did not have the right target, or the right alignment.
- NVPTX failing tests did not have the right alignment.
- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info
considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks
for a maximum size allowed and relies on the next condition checking for %4 for correctness.
This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).
Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.
Reviewers: arsenm, jlebar, tstellarAMD
Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D23068
llvm-svn: 277735
On modern Intel processors hardware SQRT in many cases is faster than RSQRT
followed by Newton-Raphson refinement. The patch introduces a simple heuristic
to choose between hardware SQRT instruction and Newton-Raphson software
estimation.
The patch treats scalars and vectors differently. The heuristic is that for
scalars the compiler should optimize for latency while for vectors it should
optimize for throughput. It is based on the assumption that throughput bound
code is likely to be vectorized.
Basically, the patch disables scalar NR for big cores and disables NR completely
for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores.
Secondly, vector SQRT has been greatly improved in Skylake and has better
throughput compared to NR.
Differential Revision: https://reviews.llvm.org/D21379
llvm-svn: 277725
overloaded (and simpler).
Sean rightly pointed out in code review that we've started using
"wrapper pass" as a specific part of the old pass manager, and in fact
it is more applicable there. Here, we really have a pass *template* to
build a repeated pass, so call it that.
llvm-svn: 277689
pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit
without calling DbiStreamBuilder::finalize. Because `finalize` initializes
`Header` member, `Header` remained nullptr which caused a crash bug.
Differential Revision: https://reviews.llvm.org/D23143
llvm-svn: 277681
changing them to Expected<> to allow them to pass through llvm Errors.
No functional change.
This commit by itself will break the next lld builds. I’ll be committing the
matching change for lld immediately next.
llvm-svn: 277656
This reverts commit the revert commit r277627. The build errors
mentioned in r277627 were likely caused by an unclean build directory.
Sorry for the noise.
llvm-svn: 277630
This reverts commit r277540. It breaks the build with:
../lib/Object/Archive.cpp:264:41: error: return type of out-of-line definition of 'llvm::object::ArchiveMemberHeader::getUID' differs from that in the declaration
Expected<unsigned> ArchiveMemberHeader::getUID() const {
~~~~~~~~~~~~~~~~~~ ^
include/llvm/Object/Archive.h:53:12: note: previous declaration is here
unsigned getUID() const;
~~~~~~~~ ^
llvm-svn: 277627
MappedBlockSTream can work with any sequence of block data where
the ordering is specified by a list of block numbers. So rather
than manually stitch them together in the case of the FPM, reuse
this functionality so that we can treat the FPM as if it were
contiguous.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23066
llvm-svn: 277609
required by MSVC 2013.
This also makes the repeating pass wrapper assignable. Mildly
unfortunate as it means we can't use a const member for the int, but
that is a really minor invariant to try to preserve at the cost of loss
of regularity of the type. Yet another annoyance of the particular C++
object / move semantic model.
llvm-svn: 277582
manager.
While this has some utility for debugging and testing on its own, it is
primarily intended to demonstrate the technique for adding custom
wrappers that can provide more interesting interation behavior in
a nice, orthogonal, and composable layer.
Being able to write these kinds of very dynamic and customized controls
for running passes was one of the motivating use cases of the new pass
manager design, and this gives a hint at how they might look. The actual
logic is tiny here, and most of this is just wiring in the pipeline
parsing so that this can be widely used.
I'm adding this now to show the wiring without a lot of business logic.
This is a precursor patch for showing how a "iterate up to N times as
long as we devirtualize a call" utility can be added as a separable and
composable component along side the CGSCC pass management.
Differential Revision: https://reviews.llvm.org/D22405
llvm-svn: 277581
reason about and less error prone.
The core idea is to fully parse the text without trying to identify
passes or structure. This is done with a single state machine. There
were various bugs in the logic around this previously that were repeated
and scattered across the code. Having a single routine makes it much
easier to fix and get correct. For example, this routine doesn't suffer
from PR28577.
Then the actual pass construction is handled using *much* easier to read
code and simple loops, with particular pass manager construction sunk to
live with other pass construction. This is especially nice as the pass
managers *are* in fact passes.
Finally, the "implicit" pass manager synthesis is done much more simply
by forming "pre-parsed" structures rather than having to duplicate tons
of logic.
One of the bugs fixed by this was evident in the tests where we accepted
a pipeline that wasn't really well formed. Another bug is PR28577 for
which I have added a test case.
The code is less efficient than the previous code but I'm really hoping
that's not a priority. ;]
Thanks to Sean for the review!
Differential Revision: https://reviews.llvm.org/D22724
llvm-svn: 277561
Move those two options to llc:
The options in CommandFlags.h are shared by dsymutil, gold, llc,
llvm-dwp, llvm-lto, llvm-mc, lto, opt.
-stop-after/-start-after only affect codegen passes however only gold and llc
actually create codegen passes and I believe these flags to be only
useful for users of llc. For the other tools they are just highly
confusing: -stop-after claims to "Stop compilation after a specific
pass" which is not true in the context of the "opt" tool.
Differential Revision: https://reviews.llvm.org/D23050
llvm-svn: 277551
None of GlobalISel requires the property, but this lets us use the
verifier instead of rolling our own "all instructions selected" check.
llvm-svn: 277484
Selected: the InstructionSelect pass ran and all pre-isel generic
instructions have been eliminated; i.e., all instructions are now
target-specific or non-pre-isel generic instructions (e.g., COPY).
Since only pre-isel generic instructions can have generic virtual register
operands, this also means that all generic virtual registers have been
constrained to virtual registers (assigned to register classes) and that
all sizes attached to them have been eliminated.
This lets us enforce certain invariants across passes.
This property is GlobalISel-specific, but is always available.
llvm-svn: 277482
Fixes PR28670
Summary:
Rewrite the use optimizer to be less memory intensive and 50% faster.
Fixes PR28670
The new use optimizer works like a standard SSA renaming pass, storing
all possible versions a MemorySSA use could get in a stack, and just
tracking indexes into the stack.
This uses much less memory than caching N^2 alias query results.
It's also a lot faster.
The current version defers phi node walking to the normal walker.
Reviewers: george.burgess.iv
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23032
llvm-svn: 277480
The InstructionSelect pass assumes that RegBankSelect ran; set the
property on all tests (thereby verifying the test inputs) and require
it in the pass.
llvm-svn: 277477
RegBankSelected: the RegBankSelect pass ran and all generic virtual
registers have been assigned to a register bank.
This lets us enforce certain invariants across passes.
This property is GlobalISel-specific, but is always available.
llvm-svn: 277475
RegBankSelect and InstructionSelect run after the legalizer and
require a Legalized function: check that all instructions are legal.
Note that this should be in the MachineVerifier, but it can't use the
MachineLegalizer as it's currently in the separate GlobalISel library.
Note that the RegBankSelect verifier checks have the same layering
problem, but we only use inline methods so end up not needing to link
against the GlobalISel library.
llvm-svn: 277472
Legalized: The MachineLegalizer ran; all pre-isel generic instructions
have been legalized, i.e., all instructions are now one of:
- generic and always legal (e.g., COPY)
- target-specific
- legal pre-isel generic instructions.
This lets us enforce certain invariants across passes.
This property is GlobalISel-specific, but is always available.
llvm-svn: 277470
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.
Patch by River Riddle!
Differential Revision: https://reviews.llvm.org/D22744
llvm-svn: 277411
Summary: By generalize the interface, users are able to inject more flexible Node token into the algorithm, for example, a pair of vector<Node>* and index integer. Currently I only migrated SCCIterator to use NodeRef, but more is coming. It's a NFC.
Reviewers: dblaikie, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22937
llvm-svn: 277399
Common symbol support in ORC was broken in r270716 when the symbol resolution
rules in RuntimeDyld were changed. With the switch to lazily materialized
symbols in r277386, common symbols can be supported by having
RuntimeDyld::emitCommonSymbols search for (but not materialize!) definitions
elsewhere in the logical dylib.
This patch adds the 'Common' flag to JITSymbolFlags, and the necessary check
to RuntimeDyld::emitCommonSymbols.
llvm-svn: 277397
The FPM is split at regular intervals across the MSF file, as the MS code
suggests. It turns out that the value of the interval is precisely the
block size. If the block size is 4096, then there are two Fpm pages every
4096 blocks.
So here we teach the PDBFile class to parse a split FPM, and also add more
options when dumping the FPM to display some additional information such
as orphaned pages (pages which the FPM says are allocated, but which
nothing appears to use), use after free pages (pages which the FPM says
are not allocated, but which are referenced by a stream), and multiple use
pages (pages which the FPM says are allocated but are used more than
once).
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23022
llvm-svn: 277388
This patch replaces RuntimeDyld::SymbolInfo with JITSymbol: A symbol class
that is capable of lazy materialization (i.e. the symbol definition needn't be
emitted until the address is requested). This can be used to support common
and weak symbols in the JIT (though this is not implemented in this patch).
For consistency, RuntimeDyld::SymbolResolver is renamed to JITSymbolResolver.
For space efficiency a new class, JITEvaluatedSymbol, is introduced that
behaves like the old RuntimeDyld::SymbolInfo - i.e. it is just a pair of an
address and symbol flags. Instances of JITEvaluatedSymbol can be used in
symbol-tables to avoid paying the space cost of the materializer.
llvm-svn: 277386
We used to combine "sext(setcc x, y, cc) -> (select (setcc x, y, cc), -1, 0)"
Instead, we should combine to (select (setcc x, y, cc), T, 0) where the value
of T is 1 or -1, depending on the type of the setcc, and getBooleanContents()
for the type if it is not i1.
This fixes PR28504.
llvm-svn: 277371
As it turns out, modref queries are broken with CFLAA. Specifically,
the data source we were using for determining modref behaviors
explicitly ignores operations on non-pointer values. So, it wouldn't
note e.g. storing an i32 to an i32* (or loading an i64 from an i64*).
It also ignores external function calls, rather than acting
conservatively for them.
(N.B. These operations, where necessary, *are* tracked by CFLAA; we just
use a different mechanism to do so. Said mechanism is relatively
imprecise, so it's unlikely that we can provide reasonably good modref
answers with it as implemented.)
Patch by Jia Chen.
Differential Revision: https://reviews.llvm.org/D22978
llvm-svn: 277366