register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.
rdar://problem/13658587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180816 91177308-0d34-0410-b5e6-96231b3b80d8
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180802 91177308-0d34-0410-b5e6-96231b3b80d8
First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.
Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.
Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll
Jim has okayed this off-list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180799 91177308-0d34-0410-b5e6-96231b3b80d8
The actual storage was already using unsigned, but the interface was using
uint64_t. This is wasteful on 32 bits and looks to be the root causes of
a miscompilation on Windows where a value was being sign extended to 64bits
to compare with the result of getSlotIndex.
Patch by Pasi Parviainen!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180791 91177308-0d34-0410-b5e6-96231b3b80d8
Differences in bitwidth between X and Y could exist even if C1 and C2 have
the same Log2 representation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180779 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the optimization introduced in r179748 and reverted in r179750.
While the optimization was sound, it did not properly respect differences in
bit-width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180777 91177308-0d34-0410-b5e6-96231b3b80d8
1. VarArgStyleRegisters: functionality that emits "store" instructions for byval regs moved out into separated method "StoreByValRegs". Before this patch VarArgStyleRegisters had confused use-cases. It was used for both variadic functions and for regular functions with byval parameters. In last case it created new stack-frame and registered it as VarArg frame, that is wrong.
This patch replaces VarArgsStyleRegisters usage for byval parameters with StoreByValRegs method.
2. In ARMMachineFunctionInfo, "get/setVarArgsRegSaveSize" was renamed to "get/setArgRegsSaveSize". By the same reason. Sometimes it was used for variadic functions, and sometimes for byval parameters in regular functions. Actually, this property means the size of registers, that keeps arguments, and thats why it was renamed.
3. In ARMISelLowering.cpp, ARMTargetLowering class, in methods computeRegArea and StoreByValRegs, VARegXXXXXX was renamed to ArgRegsXXXXXX still by the same reasons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180774 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes 2013-04-04-RelocAddend.ll. We don't have a testcase for non external
relocs with an Addend. I will try to write one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180767 91177308-0d34-0410-b5e6-96231b3b80d8
- Revise previous patches of the same purpose by fixing
*) grep <PA> | not grep <PB> semantically is not the same as
CHECK: <PA>{{^<PB>.*$}} as the former will check all occurrences of <PA>
while the later only check the first match. As the result, CHECK needs
putting in all place where <PA> occurs.
*) grep <PA> | count <N> needs a final CHECK-NOT of the same pattern.
(As 'CHECK-<N>' is proposed for discussion, converting 'grep | count <N>'
where N > 1 is postponed.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180742 91177308-0d34-0410-b5e6-96231b3b80d8
The `llvm.tls_init_funcs' (created by the front-end) holds pointers to the TLS
initialization functions. These need to be placed into the correct section so
that they are run before `main()'.
<rdar://problem/13733006>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180737 91177308-0d34-0410-b5e6-96231b3b80d8
For regular object files this is only meaningful for common symbols. An object
file format with direct support for atoms should be able to provide alignment
information for all symbols.
This replaces getCommonSymbolAlignment and fixes
test-common-symbols-alignment.ll on darwin. This also includes a fix to
MachOObjectFile::getSymbolFlags. It was marking undefined symbols as common
(already tested by existing mcjit tests now that it is used).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180736 91177308-0d34-0410-b5e6-96231b3b80d8
The implemented RuntimeDyldImpl interface is public. Everything else is private.
Since these classes are not inherited from (yet), there is no need to have
protected members.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180733 91177308-0d34-0410-b5e6-96231b3b80d8
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180731 91177308-0d34-0410-b5e6-96231b3b80d8