actually finish wiring up the old call graph.
There were bugs in the old call graph that hadn't been caught because it
wasn't being tested. It wasn't being tested because it wasn't in the
pipeline system and we didn't have a printing pass to run in tests. This
fixes all of that.
As for why I'm still keeping the old call graph alive its so that I can
port GlobalsAA to the new pass manager with out forking it to work with
the lazy call graph. That's clearly the right eventual design, but it
seems pragmatic to defer that until its necessary. The old call graph
works just fine for GlobalsAA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263104 91177308-0d34-0410-b5e6-96231b3b80d8
location in the opt tool to live along side the analysis in LLVM's
libraries.
No functionality changed here, but this will allow me to port the
printer to the new pass manager as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263101 91177308-0d34-0410-b5e6-96231b3b80d8
There is another pass by the generic name 'CallGraphPrinter' which is
actually just a call graph printer tucked away inside the opt tool. I'd
like to bring it out and make it follow the same patterns as the rest of
the CallGraph code, but doing so would end up conflicting with the name
of the DOT printing pass. So this makes the DOT printing pass name be
more precise.
No functionality changed here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263100 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This provides a macro that expands to __builtin_debugtrap() for clang,
and __debugbreak() for MSVC.
It intentionally expands to nothing for compilers that do not support a
similar mechanism that halts the debugger without otherwise crashing the
process.
Differential Revision: http://reviews.llvm.org/D18002
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263095 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is intended to be a performance flag, on the same level as clang
cc1 option "--disable-free". LLVM will never initialize it by default,
it will be up to the client creating the LLVMContext to request this
behavior. Clang will do it by default in Release build (just like
--disable-free).
"opt" and "llc" can opt-in using -disable-named-value command line
option.
When performing LTO on llvm-tblgen, the initial merging of IR peaks
at 92MB without this patch, and 86MB after this patch,setNameImpl()
drops from 6.5MB to 0.5MB.
The total link time goes from ~29.5s to ~27.8s.
Compared to a compile-time flag (like the IRBuilder one), it performs
very close. I profiled on SROA and obtain these results:
420ms with IRBuilder that preserve name
372ms with IRBuilder that strip name
375ms with IRBuilder that preserve name, and a runtime flag to strip
Reviewers: chandlerc, dexonsmith, bogner
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D17946
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263086 91177308-0d34-0410-b5e6-96231b3b80d8
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.
There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.
Differential Revision: http://reviews.llvm.org/D17962
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263082 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provable thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us perform store promotion even in cases where the store is not guaranteed to execute in the loop.
Two key assumption worth drawing out is that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped.
In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem.
Differential Revision: http://reviews.llvm.org/D16783
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263072 91177308-0d34-0410-b5e6-96231b3b80d8
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263062 91177308-0d34-0410-b5e6-96231b3b80d8
I somehow missed this. The case in GCC (global_alloc) was similar to
the new testcase except it had an array of structs rather than a two
dimensional array.
Fixes RP26885.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263058 91177308-0d34-0410-b5e6-96231b3b80d8
As part of r251146 InstCombine was extended to call computeKnownBits on
every value in the function to determine whether it happens to be
constant. This increases typical compiletime by 1-3% (5% in irgen+opt
time) in my measurements. On the other hand this case did not trigger
once in the whole llvm-testsuite.
This patch introduces the notion of ExpensiveCombines which are only
enabled for OptLevel > 2. I removed the check in InstructionSimplify as
that is called from various places where the OptLevel is not known but
given the rarity of the situation I think a check in InstCombine is
enough.
Differential Revision: http://reviews.llvm.org/D16835
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263047 91177308-0d34-0410-b5e6-96231b3b80d8
We changed several functions in LoopAccessAnalysis to use PSE instead of
taking SE and a SCEV predicate as arguments, but didn't update the comments.
This also fixes a comment in ScalarEvolution, where we refered to Preds
when the argument name was A.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263009 91177308-0d34-0410-b5e6-96231b3b80d8
Supprot DPP syntax as used in SP3 (except several operands syntax).
Added dpp-specific operands in td-files.
Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter.
Support for VOP2 DPP instructions in td-files.
Some tests for DPP instructions.
ToDo:
- VOP2bInst:
- vcc is considered as operand
- AsmMatcher doesn't apply mnemonic aliases when parsing operands
- v_mac_f32
- v_nop
- disable instructions with 64-bit operands
- change dpp_ctrl assembler representation to conform sp3
Review: http://reviews.llvm.org/D17804
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263008 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to provide a parallel (threaded) ThinLTO scheme
for linker plugin use through the libLTO C API.
The intent of this patch is to provide a first implementation as a
proof-of-concept and allows linker to start supporting ThinLTO by
definiing the libLTO C API. Some part of the libLTO API are left
unimplemented yet. Following patches will add support for these.
The current implementation can link all clang/llvm binaries.
Differential Revision: http://reviews.llvm.org/D17066
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262977 91177308-0d34-0410-b5e6-96231b3b80d8
This re-applies r262886 with a fix for 32 bit platforms that have 8 byte
pointer alignment, effectively reverting r262892.
Original Message:
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.
Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.
It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).
This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262902 91177308-0d34-0410-b5e6-96231b3b80d8
Looks like the largest SDNode is different between 32 and 64 bit now,
so this is breaking 32 bit bots. Reverting while I figure out a fix.
This reverts r262886.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262892 91177308-0d34-0410-b5e6-96231b3b80d8
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.
Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.
It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).
This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262886 91177308-0d34-0410-b5e6-96231b3b80d8
Without actually parsing a type it is difficult to perdict where
the type definition ends. In other words, instead of expecting
the user of the parser API to hand over only the relevant bits
of the string being parsed, take the whole string, parse the type,
and get back the number of characters that have been read.
This will be used by the MIR testing infrastructure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262884 91177308-0d34-0410-b5e6-96231b3b80d8
Now the type API is always available, but when global-isel is not
built the implementation does nothing.
Note: The implementation free of ifdefs is WIP and tracked here in PR26576.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262873 91177308-0d34-0410-b5e6-96231b3b80d8
The mir infrastructure will need this for generic instructions and currently
this feature was only available through the anonymous TypePrinter class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262869 91177308-0d34-0410-b5e6-96231b3b80d8
This is useful for MIR serialization. Indeed generic machine instructions
must have a type and we don't want to duplicate the logic in the MIParser.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262868 91177308-0d34-0410-b5e6-96231b3b80d8
Until now curly braces could only be used in MS inline assembly to mark block start/end.
All curly braces were removed completely at a very early stage.
This approach caused bugs like:
"m{o}v eax, ebx" turned into "mov eax, ebx" without any error.
In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such.
Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such).
This patch fixes the bug described above and enables the use of AVX-512 special operands.
This commit is the the llvm part of the patch.
The clang part of the review is: http://reviews.llvm.org/D17766
The llvm part of the review is: http://reviews.llvm.org/D17767
Differential Revision: http://reviews.llvm.org/D17767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262843 91177308-0d34-0410-b5e6-96231b3b80d8
duplicated comments.
In several cases these had diverged making them especially nice to
canonicalize. I checked to make sure we weren't losing important
information of course.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262825 91177308-0d34-0410-b5e6-96231b3b80d8
Just cleaning this up, no functionality changed. Next up will be moving
it to use the sum type instead of arbitrary "pointer"-like enums.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262822 91177308-0d34-0410-b5e6-96231b3b80d8
the new pass manager.
The port will involve substantial edits here, and would likely introduce
bad formatting if formatted in isolation, so just get all the formatting
up to snuff. I'll also go through and try to freshen the doxygen here as
well as modernizing some of the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262821 91177308-0d34-0410-b5e6-96231b3b80d8