Again, I'd like to emphasize to everyone that this sort of markup change
is *not* what you should be concerned about when writing docs. Focus on
*content*.
I applaud Chandler for focusing on the fantastic content of this new
section!
llvm-svn: 227305
By Asaf Badouh and Elena Demikhovsky
Added special nodes for rounding: FMADD_RND, FMSUB_RND..
It will prevent merge between nodes with rounding and other standard nodes.
llvm-svn: 227303
tracing code.
Managed static was just insane overhead for this. We took memory fences
and external function calls in every path that pushed a pretty stack
frame. This includes a multitude of layers setting up and tearing down
passes, the parser in Clang, everywhere. For the regression test suite
or low-overhead JITs, this was contributing to really significant
overhead.
Even the LLVM ThreadLocal is really overkill here because it uses
pthread_{set,get}_specific logic, and has careful code to both allocate
and delete the thread local data. We don't actually want any of that,
and this code in particular has problems coping with deallocation. What
we want is a single TLS pointer that is valid to use during global
construction and during global destruction, any time we want. That is
exactly what every host compiler and OS we use has implemented for
a long time, and what was standardized in C++11. Even though not all of
our host compilers support the thread_local keyword, we can directly use
the platform-specific keywords to get the minimal functionality needed.
Provided this limited trial survives the build bots, I will move this to
Compiler.h so it is more widely available as a light weight if limited
alternative to the ThreadLocal class. Many thanks to David Majnemer for
helping me think through the implications across platforms and craft the
MSVC-compatible syntax.
The end result is *substantially* faster. When running llc in a tight
loop over a small IR file targeting the aarch64 backend, this improves
its performance by over 10% for me. It also seems likely to fix the
remaining regressions seen by JIT users with threading enabled.
This may actually have more impact on real-world compile times due to
the use of the pretty stack tracing utility throughout the rest of Clang
or LLVM, but I've not collected any detailed measurements.
llvm-svn: 227300
querying of the pass registry.
The pass manager relies on the static registry of PassInfo objects to
perform all manner of its functionality. I don't understand why it does
much of this. My very vague understanding is that this registry is
touched both during static initialization *and* while each pass is being
constructed. As a consequence it is hard to make accessing it not
require a acquiring some lock. This lock ends up in the hot path of
setting up, tearing down, and invaliditing analyses in the legacy pass
manager.
On most systems you can observe this as a non-trivial % of the time
spent in 'ninja check-llvm'. However, I haven't really seen it be more
than 1% in extreme cases of compiling more real-world software,
including LTO.
Unfortunately, some of the GPU JITs are seeing this taking essentially
all of their time because they have very small IR running through
a small pass pipeline very many times (at least, this is the vague
understanding I have of it).
This patch tries to minimize the cost of looking up PassInfo objects by
leveraging the fact that the objects themselves are immutable and they
are allocated separately on the heap and so don't have their address
change. It also requires a change I made the last time I tried to debug
this problem which removed the ability to de-register a pass from the
registry. This patch creates a single access path to these objects
inside the PMTopLevelManager which memoizes the result of querying the
registry. This is somewhat gross as I don't really know if
PMTopLevelManager is the *right* place to put it, and I dislike using
a mutable member to memoize things, but it seems to work.
For long-lived pass managers this should completely eliminate
the cost of acquiring locks to look into the pass registry once the
memoized cache is warm. For 'ninja check' I measured about 1.5%
reduction in CPU time and in total time on a machine with 32 hardware
threads. For normal compilation, I don't know how much this will help,
sadly. We will still pay the cost while we populate the memoized cache.
I don't think it will hurt though, and for LTO or compiles with many
small functions it should still be a win. However, for tight loops
around a pass manager with many passes and small modules, this will help
tremendously. On the AArch64 backend I saw nearly 50% reductions in time
to complete 2000 cycles of spinning up and tearing down the pipeline.
Measurements from Owen of an actual long-lived pass manager show more
along the lines of 10% improvements.
Differential Revision: http://reviews.llvm.org/D7213
llvm-svn: 227299
This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where:
- the predicate is olt or uge
- the first operand is provably not less than zero
- the second operand is zero
The motivation for handling these cases optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y)..
http://reviews.llvm.org/D6972
llvm-svn: 227298
abomination.
For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.
To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses which one might
wish to have be preserved. This means we do all the work only to say
"nope" with no error to the user.
String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/
I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]
llvm-svn: 227294
polymorphism, and virtual dispatch.
This is essentially trying to explain the emerging design techniques
being used in LLVM these days somewhere more accessible than the
comments on a particular piece of infrastructure. It covers the
"concepts-based polymorphism" that caused some confusion during initial
reviews of the new pass manager as well as the tagged-dispatch mechanism
used pervasively in LLVM and Clang.
Perhaps most notably, I've tried to provide some criteria to help
developers choose between these options when designing new pieces of
infrastructure.
Differential Revision: http://reviews.llvm.org/D7191
llvm-svn: 227292
The change was made so we could re-use a platform if one was already created instead of creating a new one, but it would fail in the above case. To fix this, if we have a selected platform, we verify that the platform matches the current platform before we try to re-use it. We do this by asking the OptionGroupPlatform if the platform matches. If so, it returns true and we don't create a new platform, else we do.
llvm-svn: 227288
This has wider implications than I expected when I reviewed the patch: It can
cause JIT crashes where clients have used the default value for AbortOnFailure
during symbol lookup. I'm currently investigating alternative approaches and I
hope to have this back in tree soon.
llvm-svn: 227287
This adds two command line options:
--symbols dumps a list of all symbols found in the PDB.
--symbol-details dumps the same list, but with detailed information
for every symbol such as type, attributes, etc.
llvm-svn: 227286
Summary:
Also add enum types for __C_specific_handler and _CxxFrameHandler3 for
which we know a few things.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7214
llvm-svn: 227284
And since enough of these are doing the right thing, add a test case to verify we are doing the right thing with freeze drying ObjC object types
Fixes rdar://18092770
llvm-svn: 227282
Otherwise, in the most important case and the only case where SS and
TempSS are different (which is when the CXXScopeSpec should be dropped,
and TempSS is NULL) the wrong SourceRange will be used in the fixit for
the typo correction. Fixes the remaining issue in PR20626.
llvm-svn: 227278
Namely, this commit provides an actual implementation of how to retrieve the byte size in a sane way for an ObjC class, by scanning ivar offsets and byte sizes, figuring out the farthest-from-base ivar, and adding its byte size to that
Still NFC
llvm-svn: 227277
This adds two command line options to llvm-pdbdump.
--source-files prints a flat list of all source files in the PDB.
--compilands prints a list of all compilands (e.g. object files)
that the PDB knows about, and for each one, a list of
source files that the compiland is composed of as well
as a hash of the original source file.
llvm-svn: 227276
This is necessary because the byte size of an ObjC class type is not reliably statically knowable (e.g. because superclasses sit deep in frameworks that we have no debug info for)
The lack of reliable size info is a problem when trying to freeze-dry an ObjC instance (not the pointer, the pointee)
This commit lays the foundation for having language runtimes help in figuring out byte sizes, and having ClangASTType ask for runtime help
No feature change as no runtime actually implements the logic, and nowhere is an ExecutionContext passed in yet
llvm-svn: 227274
This commit creates infinite loop in DAG combine for in the LLVM test-suite
for aarch64 with mcpu=cylcone (just having neon may be enough to expose this).
llvm-svn: 227272
This test tests the equivalent of:
breakpoint set --name count
breakpoint set --selector count
breakpoint set --name isEqual:
breakpoint set --selector isEqual:
breakpoint set --name "-[MyClass(MyCategory) myCategoryFunction]"
breakpoint set --name "-[MyClass myCategoryFunction]"
breakpoint set --name "[MyClass myCategoryFunction]"
llvm-svn: 227271
Use __clear_cache builtin instead of cacheflush() in
Unix Memory::InvalidateInstructionCache().
Differential Revision: http://reviews.llvm.org/D7198
llvm-svn: 227269
COMDATs must be identically named to the symbol. When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs. This corrects the behaviour and
adds test cases for this.
llvm-svn: 227261
That kind of reference was used only in ELFFile, and the use of
that reference there didn't seem to make sense. All test still
pass (after adjusting symbol names) without that code. LLD is
still be able to link LLD and Clang. Looks like we just don't
need this.
http://reviews.llvm.org/D7189
llvm-svn: 227259
Make sure "void *ctx" doesn't point to an object which already went out
of scope. This might also fix -Wuninitialized warnings GCC 4.7 produces
while building ASan runtime.
llvm-svn: 227258
PDB stores some of its data in streams and some in tables.
This patch teaches llvm-pdbdump to dump basic summary data
for the debug tables.
In support of this, this patch also adds some DIA helper
classes, such as a wrapper around an IDiaSymbol interface,
as well as helpers for outputting various enumerations to
a raw_ostream.
llvm-svn: 227257