This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.
Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Differential Revision: http://reviews.llvm.org/D18559
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8
default constructor, instead of relying on the default constructor of
unique_ptr.
Second attempt at fixing the windows bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265584 91177308-0d34-0410-b5e6-96231b3b80d8
helper class.
The default constructor creates invalid (isValid() == false) instances
and may be used to communicate that a mapping was not found.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265581 91177308-0d34-0410-b5e6-96231b3b80d8
Use a DenseSet instead of a DenseMap for constants in LLVMContextImpl.
Last time I looked at this was some time before r223588, when
DenseSet<V> had no advantage over DenseMap<V,char>. After r223588,
there's a 50% memory savings.
This is all mechanical. There were little bits of missing API from
DenseSet so I added the trivial implementations:
- iterator::operator++(int)
- template <class LookupKeyT> insert_as(ValueTy, LookupKeyT)
There should be no functionality change, just reduced memory consumption
(this wasn't on a profile or anything; just a cleanup I stumbled on).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265577 91177308-0d34-0410-b5e6-96231b3b80d8
The MDStringCache doesn't need to be explicitly cleared before
destruction. The destructor handles it at least as efficiently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265576 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used by the register bank select pass to assign register banks
for generic virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265573 91177308-0d34-0410-b5e6-96231b3b80d8
when GISel is not built.
The positive side effects are:
- We do not have to define dummy implementation
- We do not have to do weird gymnastic to avoid like issues (like
missing constructor or vtable for the base classes)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265570 91177308-0d34-0410-b5e6-96231b3b80d8
Rework the access to GlobalISel APIs to contain how much of
the APIs we need to access for the final executable to build when
GlobalISel is not built.
This prevents massive usage of ifdefs in various places. Now, all the
GlobalISel ifdefs will be happing only in AArch64TargetMachine.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265567 91177308-0d34-0410-b5e6-96231b3b80d8
1. Add FullUnrollMaxCount option that works like MaxCount, but also limits
the unroll count for fully unrolled loops. So if a loop has an iteration
count over this, it won't fully unroll.
2. Add CLI options for MaxCount and the new option, so they can be tested
(plus a test).
3. Make partial unrolling obey MaxCount.
An example use-case (the out of tree one this is originally designed for) is
a target’s TTI can analyze a loop and decide on a max unroll count separate
from the size threshold, e.g. based on register pressure, then constrain
LoopUnroll to not exceed that, regardless of the size of the unrolled loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265562 91177308-0d34-0410-b5e6-96231b3b80d8
The method checks that the value is fully defined accross the different partial
mappings and that the partial mappings are compatible between each other.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265556 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r265550. There're problems with endianness on dumping instruction bytes. Need to find out how to use support::ulittle32_t type properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265554 91177308-0d34-0410-b5e6-96231b3b80d8
This will avoid link-time error as the defautl constructor of RegisterBankInfo is
the only one available when GlobalISel is not built.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265549 91177308-0d34-0410-b5e6-96231b3b80d8
when DenseMap growed and moved memory. I verified it fixed the bootstrap
problem on x86_64-linux-gnu but I cannot verify whether it fixes
the bootstrap error on clang-ppc64be-linux. I will watch the build-bot
result closely.
Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.
analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.
To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.
Differential Revision: http://reviews.llvm.org/D15302
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265547 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.
However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.
In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.
We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.
Reviewers: anemet, mzolotukhin, hfinkel, sanjoy
Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17201
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265535 91177308-0d34-0410-b5e6-96231b3b80d8
AArch64InstrInfo::optimizeCompareInstr has a bug which causes generation of incorrect code (PR#27158).
The patch refactors the function to simplify reviewing the fix of the bug.
1. Function name ‘modifiesConditionCode’ is changed to ‘areCFlagsAccessedBetweenInstrs’
to reflect that the function can check modifying accesses, reading accesses or both.
2. Function ‘AArch64InstrInfo::optimizeCompareInstr’
- Documented the function
- Cmp_NZCV is DeadNZCVIdx to reflect that it is an operand index of dead NZCV
- The code for the case of substituting CmpInstr is put into separate
functions the main of them is ‘substituteCmpInstr’.
Differential Revision: http://reviews.llvm.org/D18609
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265531 91177308-0d34-0410-b5e6-96231b3b80d8
To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has
undefined behavior for negative numbers other than -0.0 (which allows
for better optimization, because there is no need to worry about errno
being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."
This means that it's unsafe to replace sqrt with llvm.sqrt unless the
call is annotated with nnan.
Thanks to Hal Finkel for pointing this out!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265521 91177308-0d34-0410-b5e6-96231b3b80d8
We never delete any MDString until the context is destroyed. Might as
well throw them onto a BumpPtrAllocator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265520 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of copying arguments from the source function to the
destination, steal them. This has a few advantages.
- The ValueMap doesn't need to be seeded with (or cleared of)
Arguments.
- Often the destination function won't have created any arguments yet,
so this avoids malloc traffic.
- Argument names don't need to be copied.
Because argument lists are lazy, this required a new
Function::stealArgumentListFrom helper.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265519 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed to adapt a use of enterBasicBlock() in my last commit (because I
had follow on patches in my repository that change the code).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265513 91177308-0d34-0410-b5e6-96231b3b80d8
Make it obvious that the argument cannot be nullptr.
Remove an unnecessary nullptr check in initRegState.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265511 91177308-0d34-0410-b5e6-96231b3b80d8
We must remove all aliased registers which may be more than the all sub
and super registers combined.
Bug found while reading the code. The bug does not affect any existing
target as the only use of register aliases I could found were control
registers on ARM and Hexagon which are all reserved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265510 91177308-0d34-0410-b5e6-96231b3b80d8
r265273 added Mapper::mapBlockAddress, which delays mapping a
blockaddress value until the function has a body. The condition was
backwards, and should be checking Function::empty instead of
GlobalValue::isDeclaration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265508 91177308-0d34-0410-b5e6-96231b3b80d8