IndVarSimplify is willing to move divide instructions outside of their
loop bodies if they are loop-invariant. However, it may not be safe to
expand them if we do not know whether they can trap.
Instead, check whether the instruction is unsafe to expand and, if so,
skip the expansion.
This fixes PR16041.
Testcase by Rafael Ávila de Espíndola.
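For illustration, the kind of check described above can be sketched as
follows. This is a minimal standalone sketch, not the code in this commit:
DivideExpr and isSafeToHoist are hypothetical names, and it assumes 64-bit
operands whose values are either known constants or unknown.

```cpp
#include <cstdint>

// Hypothetical, simplified model of a divide. This is NOT LLVM's
// representation; it only records which operands are known constants.
struct DivideExpr {
  bool isSigned;
  bool dividendKnown;
  int64_t dividend;
  bool divisorKnown;
  int64_t divisor;
};

// Returns true only when the divide provably cannot trap, so that
// evaluating it outside the loop (where it may run even if the loop body
// never does) is safe.
bool isSafeToHoist(const DivideExpr &d) {
  // An unknown divisor might be zero at run time.
  if (!d.divisorKnown)
    return false;
  // Division by zero traps.
  if (d.divisor == 0)
    return false;
  // Signed INT64_MIN / -1 overflows, which also traps on common targets.
  if (d.isSigned && d.divisor == -1)
    return d.dividendKnown && d.dividend != INT64_MIN;
  return true;
}
```

A caller would skip the expansion whenever isSafeToHoist returns false,
leaving the divide inside the loop.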
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183239 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM backend did not expect LDRBi12 to hold a constant pool operand.
Allow LLVM to handle the instruction the same way it handles
LDRi12.
This fixes PR16215.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183238 91177308-0d34-0410-b5e6-96231b3b80d8
Specifying the load address for Darwin i386 dylibs was a performance
optimization for dyld that is not relevant for x86_64 or arm. We can just
remove this now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183230 91177308-0d34-0410-b5e6-96231b3b80d8
The problem this time seems to be a thinko. We were assuming that in the CFG
A
| \
| B
| /
C
speculating the basic block B would cause only the phi value for the B->C edge
to be speculated. That is not true: the phis are semantically in the edges, so
if the A->B->C path is taken, any code needed for A->C is not executed and we
have to consider it too when deciding to speculate B.
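To make the point concrete, here is a minimal standalone sketch of a
decision that looks at both incoming edges of every phi in C. It is not
the SimplifyCFG code: IncomingValue, PhiInC, canSpeculateB and the cost
budget are all hypothetical names and assumptions.

```cpp
#include <vector>

// Hypothetical model of one incoming phi value: whether evaluating it
// can trap, and a rough cost of evaluating it unconditionally.
struct IncomingValue {
  bool mayTrap;
  unsigned cost;
};

// Hypothetical model of a phi in block C with one value per incoming edge.
struct PhiInC {
  IncomingValue fromA; // value on the A->C edge
  IncomingValue fromB; // value on the B->C edge
};

// Decide whether B may be speculated. Once B is folded into A, the values
// for BOTH edges are evaluated on every path, so both must be checked.
bool canSpeculateB(const std::vector<PhiInC> &phis, unsigned budget) {
  unsigned total = 0;
  for (const PhiInC &phi : phis) {
    if (phi.fromA.mayTrap || phi.fromB.mayTrap)
      return false;
    total += phi.fromA.cost + phi.fromB.cost;
  }
  return total <= budget;
}
```

Checking only fromB would reproduce the mistaken assumption described
above.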
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183226 91177308-0d34-0410-b5e6-96231b3b80d8
Specifically the following work was done:
1. If the operation was not implemented, I implemented it.
2. If the operation was already implemented, I just moved its location
in the APFloat header into the IEEE-754R 5.7.2 section. If the name was
incorrect, I put in a comment giving the true IEEE-754R name.
Unit tests have also been added for all of the functions which did not
already have one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183179 91177308-0d34-0410-b5e6-96231b3b80d8
(4.58s vs 3.2s on an oldish Mac Tower).
The corresponding source is excerpted below. The loop accounts for about 90% of execution time.
--------------------
cat -n test-suite/MultiSource/Benchmarks/Olden/em3d/make_graph.c
90
91 for (k=0; k<j; k++)
92 if (other_node == cur_node->to_nodes[k]) break;
The defective layout is sketched below, where the two branches need to swap.
------------------------------------------------------------------------
L:
...
if (cond) goto out-of-loop
goto L
While this code sequence is defective, I don't understand why it costs 1/3 of
execution time. CPU event profiling indicates the poor layout does not increase
branch mispredictions; it doesn't increase stall cycles at all, and it doesn't
prevent the CPU from detecting the loop (i.e. the Loop-Stream-Detector seems to
be working fine as well)...
The root cause of the problem is that the layout pass calls AnalyzeBranch()
on a basic block which has not been updated to reflect its current layout.
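The branch swap sketched above can be illustrated with a small standalone
model. This is only a sketch under assumptions, not the layout pass
itself: Terminator, fixLayout and needsExplicitJump are hypothetical
names, and blocks are identified by plain strings.

```cpp
#include <string>
#include <utility>

// Hypothetical model of a block ending in
//   if (cond) goto condTarget;
//   goto uncondTarget;
struct Terminator {
  std::string condTarget;
  std::string uncondTarget;
  bool condInverted; // true once the condition has been negated
};

// If the conditional target is the block that follows in the CURRENT
// layout, swap the two targets and invert the condition so that the
// trailing goto points at the layout successor.
Terminator fixLayout(Terminator t, const std::string &layoutSuccessor) {
  if (t.condTarget == layoutSuccessor && t.uncondTarget != layoutSuccessor) {
    std::swap(t.condTarget, t.uncondTarget);
    t.condInverted = !t.condInverted;
  }
  return t;
}

// A trailing goto to the layout successor is just a fallthrough.
bool needsExplicitJump(const Terminator &t,
                       const std::string &layoutSuccessor) {
  return t.uncondTarget != layoutSuccessor;
}
```

The answer depends entirely on layoutSuccessor being up to date, which is
why analyzing a block that does not reflect its current layout gives the
wrong branch order.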
rdar://13966341
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183174 91177308-0d34-0410-b5e6-96231b3b80d8
PR16069 is an interesting case where an incoming value to a PHI is a
trap value while also being a 'ConstantExpr'.
We do not consider this case when performing the 'HoistThenElseCodeToIf'
optimization.
Instead, make our modifications more conservative if we detect that we
cannot transform the PHI to a select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183152 91177308-0d34-0410-b5e6-96231b3b80d8
This was missing from r182908. I didn't notice it at the time because the MCJIT tests were
disabled when building with cmake on ppc64 (which I fixed in r183143).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183147 91177308-0d34-0410-b5e6-96231b3b80d8
The CopyToReg nodes will sometimes try to copy a value from a VGPR to an
SGPR. This kind of copy is not possible, so we need to detect
VGPR->SGPR copies and do something else. The current strategy is to
replace these copies with VGPR->VGPR copies and hope that all the users
of CopyToReg can accept VGPRs as arguments.
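As a rough illustration of the strategy, the rewrite can be sketched on a
toy model. RegClass, Copy, isIllegalCopy and legalizeCopy are hypothetical
names for this sketch and do not correspond to the backend's real data
structures.

```cpp
// Hypothetical register classes: scalar (SGPR) and vector (VGPR).
enum class RegClass { SGPR, VGPR };

// Hypothetical model of a copy between two register classes.
struct Copy {
  RegClass src;
  RegClass dst;
};

// A VGPR->SGPR copy is the one combination that is not possible.
bool isIllegalCopy(const Copy &c) {
  return c.src == RegClass::VGPR && c.dst == RegClass::SGPR;
}

// Replace an illegal VGPR->SGPR copy with a VGPR->VGPR copy. This only
// works if every user of the destination can accept a VGPR, which this
// sketch does not check.
Copy legalizeCopy(Copy c) {
  if (isIllegalCopy(c))
    c.dst = RegClass::VGPR;
  return c;
}
```

All other combinations are left untouched.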
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183132 91177308-0d34-0410-b5e6-96231b3b80d8
The lowering of stores is now mostly handled in the tablegen files. No
more BUFFER_STORE nodes are generated during legalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183130 91177308-0d34-0410-b5e6-96231b3b80d8
This is needed in clang so one can check if the object needs the
destructor called after its memory was freed. This is useful when
creating many APInt/APFloat objects with placement new, where the
overhead of tracking the pointers for cleanup is significant.
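The usage pattern can be sketched with a small stand-in class. This is a
sketch only: BigOrSmallInt is a hypothetical type, not APInt or APFloat,
and it assumes values of up to 64 bits are stored inline while wider ones
own heap memory.

```cpp
#include <cstdint>
#include <new>

// Hypothetical stand-in for a value type that stores small values inline
// and allocates heap storage only for wide ones.
class BigOrSmallInt {
  unsigned bits_;
  union {
    uint64_t inlineVal_;  // used when bits_ <= 64
    uint64_t *heapWords_; // used when bits_ > 64
  };

public:
  BigOrSmallInt(unsigned bits, uint64_t value) : bits_(bits) {
    if (bits_ <= 64) {
      inlineVal_ = value;
    } else {
      heapWords_ = new uint64_t[(bits_ + 63) / 64]();
      heapWords_[0] = value;
    }
  }
  BigOrSmallInt(const BigOrSmallInt &) = delete;
  BigOrSmallInt &operator=(const BigOrSmallInt &) = delete;

  ~BigOrSmallInt() {
    if (needsCleanup())
      delete[] heapWords_;
  }

  // True only if the destructor has real work to do.
  bool needsCleanup() const { return bits_ > 64; }
};
```

An arena that creates such objects with placement new only has to record
the pointers for which needsCleanup() returns true and can drop the rest
together with the arena's memory.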
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183100 91177308-0d34-0410-b5e6-96231b3b80d8
index greater than the size of the vector is invalid. The shuffle may be
shrinking the size of the vector. Fixes a crash!
Also drop the maximum recursion depth of the safety check for this
optimization to five.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183080 91177308-0d34-0410-b5e6-96231b3b80d8