The ARM backend has been using most of the MachO-related subtarget
checks almost interchangeably, and since the only target it has had to
run on has been IOS (which is all three of MachO, Darwin and IOS), it has
worked out OK so far.
But we'd like to support embedded targets under the "*-*-none-macho"
triple, which means everything starts falling apart and inconsistent
behaviours emerge.
This patch should pick a reasonably sensible set of behaviours for the
new triple (and any others that come along, with luck). Some choices
were debatable (notably FP == r7 or r11), but we can revisit those
later when deficiencies become apparent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198617 91177308-0d34-0410-b5e6-96231b3b80d8
This requires a knowledge of the stack size which is not known until
the frame is complete, hence the need for the XCoreFTAOElim pass
which lowers the XCoreISD::FRAME_TO_ARGS_OFFSET instruction into its
final form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198614 91177308-0d34-0410-b5e6-96231b3b80d8
We also narrow the liveness of FP & LR during the prologue to
reflect the actual usage of the registers.
I have been unable to construct a test to prove the previous live
range was too large.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198611 91177308-0d34-0410-b5e6-96231b3b80d8
Longer term, we want to move users to "*-*-*-macho" for embedded work, but for
now people are relying on the last thing we told them, which is unfortunately
"*-*-darwin-eabi".
rdar://problem/15703934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198602 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of crashing, raise an error when a subtraction expression
involves an undefined symbol.
This fixes PR18375.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198590 91177308-0d34-0410-b5e6-96231b3b80d8
The 0x66 prefix toggles between 16-bit and 32-bit operand size.
So in 32-bit mode it is used to switch to 16-bit operand size for the
following instruction, while in 16-bit mode it's the other way round: it's
used to switch to 32-bit operand size instead.
Thus, emit the 0x66 prefix byte for OpSize only in 32-bit (and 64-bit) mode,
and introduce a new OpSize16 bit which is used in 16-bit mode instead.
This is just the basic infrastructure for that change; a subsequent patch
will add the new OpSize16 bit to the 32-bit instructions that need it.
Patch from David Woodhouse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198586 91177308-0d34-0410-b5e6-96231b3b80d8
This is not really expected to work right yet. Mostly because we will
still emit the OpSize (0x66) prefix in all the wrong places, along with
a number of other corner cases. Those will all be fixed in the subsequent
commits.
Patch from David Woodhouse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198584 91177308-0d34-0410-b5e6-96231b3b80d8
There was a wrong assumption that the vector element type and the
type of each ConstantSDNode in the build_vector are the same.
However, when promoting the integer operand of a legally typed
build_vector, the operand type and the vector element type do not
need to be the same
(See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in
LegalizeIntegerTypes.cpp).
In the AArch64 backend, the following DAG sequence:
C0: i1 = Constant<0>
C1: i1 = Constant<-1>
V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0
is type-legalized into:
NewC0: i32 = Constant<0>
NewC1: i32 = Constant<1>
V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0
Force a getZeroExtend to VTBits to ensure that the new constant
is correctly zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198582 91177308-0d34-0410-b5e6-96231b3b80d8
This moves the check up into the parent class so that all targets can use it
without having to copy (and keep in sync) the same error message.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198579 91177308-0d34-0410-b5e6-96231b3b80d8
Move the ARM EHABI unwind opcode definitions from the ARM MCTargetDesc into LLVM
Support. This enables sharing of the definitions across the ARM target code as
well as llvm-readobj. This will allow implementation of the unwind decoding in
llvm-readobj.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198576 91177308-0d34-0410-b5e6-96231b3b80d8
Add some tests to validate correct register selection, including a fix
to an existing test which was requiring the *wrong* output.
Patch from David Woodhouse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198566 91177308-0d34-0410-b5e6-96231b3b80d8
Removed vzeroupper from AVX-512 mode - our optimization guide does not recommend inserting vzeroupper at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198557 91177308-0d34-0410-b5e6-96231b3b80d8
Missed this when adding the skeleton analysis. Caught by a build break
in the next patch I'm working on when trying to use the analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198556 91177308-0d34-0410-b5e6-96231b3b80d8
__builtin_return_address requires that the value passed in be a constant.
However, at -O0 even a constant expression may not be converted to a constant.
Emit an error message instead of crashing.
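For illustration only (not taken from the patch), the constraint being enforced looks roughly like this:

void *get_caller(void) {
  // The frame level argument must be an integer constant expression; when
  // it is not folded to a constant (as can happen at -O0), the backend now
  // reports an error instead of crashing.
  return __builtin_return_address(0);
}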
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198531 91177308-0d34-0410-b5e6-96231b3b80d8
All other uses of this macro in LLVM/clang have been moved to the function
definition so follow suit (and the usage advice) here too for consistency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198516 91177308-0d34-0410-b5e6-96231b3b80d8
This commit was the source of crasher PR18384:
While deleting: label %for.cond127
An asserting value handle still pointed to this value!
UNREACHABLE executed at llvm/lib/IR/Value.cpp:671!
Reverting to get the builders green, feel free to re-land after fixing up.
(Renato has a handy isolated repro if you need it.)
This reverts commit r198478.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198503 91177308-0d34-0410-b5e6-96231b3b80d8
getSCEV for an ashr instruction creates an intermediate zext
expression when it truncates its operand.
The operand is initially inside the loop, so the narrow zext
expression has a non-loop-invariant loop disposition.
LoopSimplify then runs on an outer loop, hoists the ashr operand, and
properly invalidates the SCEVs that are mapped to that value.
The SCEV expression for the ashr is now an AddRec with the hoisted
value as the now loop-invariant start value.
The LoopDisposition of this wide value was properly invalidated during
LoopSimplify.
However, if we later get the ashr SCEV again, we again try to create
the intermediate zext expression. We get the same SCEV that we did
earlier, and it is still cached because it was never mapped to a
Value. When we try to create a new AddRec we abort because we're using
the old non-loop-invariant LoopDisposition.
I don't have a solution for this other than to clear LoopDisposition
when LoopSimplify hoists things.
I think the long-term strategy should be to perform LoopSimplify on
all loops before computing SCEV and before running any loop opts on
individual loops. It's possible we may want to rerun LoopSimplify on
individual loops, but it should rarely do anything, so rarely require
invalidating SCEV.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198478 91177308-0d34-0410-b5e6-96231b3b80d8
The motivation is to mark dump methods as used in debug builds so that they can
be called from lldb, but to not do so in release builds so that they can be
dead-stripped.
There's lots of potential follow-up work suggested in the thread
"Should dump methods be LLVM_ATTRIBUTE_USED only in debug builds?" on cfe-dev,
but everyone seems to agree on this subset.
Macro name chosen by fair coin toss.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198456 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r198441.
This change doesn't build on Windows, and doesn't do the right thing on
Linux and other platforms that use a _Z prefix rather than __Z for
C++ names.
It also had no tests, so it wasn't clear how to fix it forward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198445 91177308-0d34-0410-b5e6-96231b3b80d8
symbol name, also put the human readable name in a comment.
Also fix a bug in LLVMDisasmInstruction() where the raw_svector_ostream
for the disassembled instruction string was not flushed before being
copied to the output buffer, which was causing truncation of the output.
rdar://10173828
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198441 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch any program that wanted to know the final symbol name of a
GlobalValue had to link with Target.
This patch implements a compromise solution where the mangler uses DataLayout.
This way, any tool that already links with Target (llc, clang) gets exactly
the same behavior as before, and new IR files can be mangled without linking with Target.
With this patch the mangler is constructed with just a DataLayout and DataLayout
is extended to include the information the Mangler needs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198438 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r198398, thus reapplying r198397.
I had accidentally introduced an endianness issue when applying the hash
to the type unit. Using support::ulittle64_t in the reinterpret_cast in
addDwarfTypeUnitType fixes this issue.
Original commit message:
Debug Info: Type Units: Simplify type hashing using IR-provided unique
names.
What's good for LTO metadata size problems ought to be good for non-LTO
debug info size too, so let's rely on the same uniqueness in both cases.
If it's insufficient for non-LTO for whatever reason (since we now won't
be uniquing CU-local types or any C types - but these are likely to not
be the most significant contributors to type bloat) we should consider a
frontend solution that'll help both LTO and non-LTO alike, rather than
using DWARF-level DIE-hashing that only helps non-LTO debug info size.
It's also much simpler this way and benefits C++ even more since we can
deduplicate lexically separate definitions of the same C++ type since
they have the same mangled name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198436 91177308-0d34-0410-b5e6-96231b3b80d8
The loop rerolling pass was failing with an assertion failure from a
failed cast on loops like this:
void foo(int *A, int *B, int m, int n) {
  for (int i = m; i < n; i += 4) {
    A[i+0] = B[i+0] * 4;
    A[i+1] = B[i+1] * 4;
    A[i+2] = B[i+2] * 4;
    A[i+3] = B[i+3] * 4;
  }
}
The code was casting the SCEV-expanded code for the new
induction variable to a phi-node. When the loop had a non-constant
lower bound, the SCEV expander would end the code expansion with an
add instead of a phi node and the cast would fail.
It looks like the cast to a phi node was only needed to get the
induction variable value coming from the backedge to compute the end
of loop condition. This patch changes the loop reroller to compare
the induction variable to the number of times the backedge is taken
instead of the iteration count of the loop. In other words, we stop
the loop when the current value of the induction variable ==
IterationCount-1. Previously, the comparison was comparing the
induction variable value from the next iteration == IterationCount.
This problem only seems to occur on 32-bit targets. For some reason,
the loop is not rerolled on 64-bit targets.
PR18290
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198425 91177308-0d34-0410-b5e6-96231b3b80d8
cycles
This allows the value equality check to work even if we don't have a dominator
tree. Also add some more comments.
I was worried about compile time impacts and did not implement reachability but
used the dominance check in the initial patch. The trade-off was that the
dominator tree was required.
The llvm utility function isPotentiallyReachable cuts off the recursive search
after 32 visits. Testing did not show any compile time regressions, showing my
worries were unjustified.
No compile time or performance regressions at O3 -flto -mavx on test-suite +
externals.
Addresses review comments from r198290.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198400 91177308-0d34-0410-b5e6-96231b3b80d8
Reverting due to bot failure I won't have time to investigate until
tomorrow.
This reverts commit r198397.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198398 91177308-0d34-0410-b5e6-96231b3b80d8
What's good for LTO metadata size problems ought to be good for non-LTO
debug info size too, so let's rely on the same uniqueness in both cases.
If it's insufficient for non-LTO for whatever reason (since we now won't
be uniquing CU-local types or any C types - but these are likely to not
be the most significant contributors to type bloat) we should consider a
frontend solution that'll help both LTO and non-LTO alike, rather than
using DWARF-level DIE-hashing that only helps non-LTO debug info size.
It's also much simpler this way and benefits C++ even more since we can
deduplicate lexically separate definitions of the same C++ type since
they have the same mangled name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198397 91177308-0d34-0410-b5e6-96231b3b80d8
The cgo problem was that it wants dwarf2 which doesn't support direct
constant encoding of the location. So let's add support for dwarf2
encoding (using a location expression) of data member locations.
This reverts commit r198385.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198389 91177308-0d34-0410-b5e6-96231b3b80d8
Apologies for the noise - we're seeing some Go failures with cgo
interacting with Clang's debug info due to this change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198385 91177308-0d34-0410-b5e6-96231b3b80d8
The greedy register allocator tries to split a live-range around each
instruction where it is used or defined to relax the constraints on the entire
live-range (this is a last chance split before falling back to spill).
The goal is to have a big live-range that is unconstrained (i.e., that can use
the largest legal register class) and several small local live-ranges that carry
the constraints implied by each instruction.
E.g.,
Let csti be the constraints on operation i.
V1=
op1 V1(cst1)
op2 V1(cst2)
V1's live-range is constrained by the intersection of cst1 and cst2.
tryInstructionSplit relaxes those constraints by aggressively splitting each
def/use point:
V1=
V2 = V1
V3 = V2
op1 V3(cst1)
V4 = V2
op2 V4(cst2)
Because of how the coalescer infrastructure works, each new variable (V3, V4)
that is live at the same time as V1 (or its copy, here V2) interferes with V1.
Thus, we end up with an uncoalescable copy for each split point.
To make tryInstructionSplit less aggressive, we check if the split point
actually relaxes the constraints on the whole live-range. If it does not, we do
not insert it.
Indeed, it will not help the global allocation problem:
- V1 will have the same constraints.
- V1 will have the same interference + possibly the newly added split variable
VS.
- VS will produce an uncoalesceable copy if alive at the same time as V1.
<rdar://problem/15570057>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198369 91177308-0d34-0410-b5e6-96231b3b80d8
CR logicals (crand, crxor, etc.) on the P7 need to be in the first slot of each
dispatch group. The old itinerary entry was just wrong (but has not mattered
because we don't generate these instructions).
This will matter when, in an upcoming commit, we start generating these
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198359 91177308-0d34-0410-b5e6-96231b3b80d8
Several of the 64-bit fixed-point instructions with immediate operands were
using the 32-bit (i32) operand nodes instead of the corresponding 64-bit (i64)
operand definitions (u16imm instead of u16imm64, for example).
This error has had no effect so far, but would have caused type-checking
violations with an upcoming change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198356 91177308-0d34-0410-b5e6-96231b3b80d8
As noted in the comment above CodeGenPrepare::OptimizeInst, which aggressively
sinks compares to reduce pressure on the condition register(s), for targets
such as PowerPC with multiple condition registers, this may not be the right
thing to do. This adds an HasMultipleConditionRegisters boolean to TLI, and
CodeGenPrepare::OptimizeInst is skipped when HasMultipleConditionRegisters is
true.
This functionality will be used by the PowerPC backend in an upcoming commit.
Especially when the PowerPC backend starts tracking individual condition
register bits as separate allocatable entities (which will happen in this
upcoming commit), this sinking from CodeGenPrepare::OptimizeInst is
significantly suboptimal.
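As a minimal sketch of the new guard (the accessor name is an assumption based on the description above, not quoted from the patch):

// In CodeGenPrepare, before sinking a compare down to its users:
if (TLI && TLI->hasMultipleConditionRegisters())
  return false; // sinking would only increase condition-register pressure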
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198354 91177308-0d34-0410-b5e6-96231b3b80d8
Use an if statement instead of a pair of ternary operators checking
the same condition.
Use a cheap method call rather than returning the local symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198351 91177308-0d34-0410-b5e6-96231b3b80d8
Even within a multiclass, we had been generating concrete implicit anonymous
defs when parsing values (generally in value lists). This behavior was
incorrect, and led to errors when multiclass parameters were used in the
parameter list of the implicit anonymous def.
If we had some multiclass:
multiclass mc<string n> {
  ... : SomeClass<SomeOtherClass<n> >
The capture of the multiclass parameter 'n' would not work correctly, and
depending on how the implicit SomeOtherClass was used, either TableGen would
ignore something it shouldn't, or would crash.
To fix this problem, when inside a multiclass, we generate prototype anonymous
defs for implicit anonymous defs (just as we do for explicit anonymous defs).
Within the multiclass, the current record prototype is populated with a node
that is essentially: !cast<SomeOtherClass>(!strconcat(NAME, anon_value_name)).
This is then resolved to the correct concrete anonymous def, in the usual way,
when NAME is resolved during multiclass instantiation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198348 91177308-0d34-0410-b5e6-96231b3b80d8
TableGen had been generating a different name for an anonymous multiclass's
NAME for every def in the multiclass. This had an unfortunate side effect: it
was impossible to reference one def within the multiclass from another (in the
parameter list, for example). By making sure we only generate an anonymous name
once per multiclass (which, as it turns out, requires only changing the name
parameter to a reference type), we can now concatenate NAME within the multiclass
with a def name in order to generate a reference to that def.
This does not matter so much, in and of itself, but is necessary for a
follow-up commit that will fix variable capturing in implicit anonymous
multiclass defs (and that is important).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198340 91177308-0d34-0410-b5e6-96231b3b80d8
When widening an IV to remove s/zext, we generally try to eliminate
the original narrow IV. However, LCSSA phi nodes outside the loop were
still using the original IV. Clean this up more aggressively to avoid
redundancy in generated code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198338 91177308-0d34-0410-b5e6-96231b3b80d8
This patch makes it possible to select the ABI with -mattr. It will be used to
forward clang's -target-abi option to llvm's CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198304 91177308-0d34-0410-b5e6-96231b3b80d8
When there are cycles in the value graph we have to be careful interpreting
"Value*" identity as "value" equivalence. We interpret the value of a phi node
as the value of its operands.
When we check for value equivalence now we make sure that the "Value*" dominates
all cycles (phis).
%0 = phi [%noaliasval, %addr2]
%l = load %ptr
%addr1 = gep @a, 0, %l
%addr2 = gep @a, 0, (%l + 1)
store %ptr ...
Before this patch we would return NoAlias for (%0, %addr1) which is wrong
because the value of the load is from different iterations of the loop.
Tested on x86_64 -mavx at O3 and O3 -flto with no performance or compile time
regressions.
PR18068
radar://15653794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198290 91177308-0d34-0410-b5e6-96231b3b80d8
Over the years there have been some attempts at figuring out how to
align byval arguments. A look at the commit log suggests that they
were:
* Use the ABI alignment.
* When that was not sufficient for x86-64, I added the 's' specification to
DataLayout.
* When that was not sufficient Evan added the virtual getByValTypeAlignment.
* When even that was not sufficient, we just got the FE to add the alignment
to the byval.
This patch is just a simple cleanup that removes my first attempt at fixing the
problem. I also added an AArch64 implementation of getByValTypeAlignment to
make sure this patch is a nop. I also left the 's' parsing for backward
compatibility.
I will send a short email to llvmdev about the change for anyone maintaining
an out of tree target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198287 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Support/ThreadLocal.cpp:53:15: error: typedef 'SIZE_TOO_BIG' locally defined but not used [-Werror=unused-local-typedefs]
typedef int SIZE_TOO_BIG[sizeof(pthread_key_t) <= sizeof(data) ? 1 : -1];
Fixed the C++11 way, switching to LLVM_STATIC_ASSERT() instead of LLVM_ATTRIBUTE_UNUSED.
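A sketch of the change, assuming LLVM_STATIC_ASSERT takes the usual (condition, message) form of a C++11 static assertion:

// Before: a typedef whose negative array size flags the error, which
// trips -Wunused-local-typedefs.
typedef int SIZE_TOO_BIG[sizeof(pthread_key_t) <= sizeof(data) ? 1 : -1];

// After: the same compile-time check expressed as a static assertion.
LLVM_STATIC_ASSERT(sizeof(pthread_key_t) <= sizeof(data),
                   "pthread_key_t must fit in the data buffer");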
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198255 91177308-0d34-0410-b5e6-96231b3b80d8
Checking the trailing letter of the mnemonic is insufficient. Be more thorough
in the scanning of the instruction to ensure that we correctly work with the
predicated mnemonics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198235 91177308-0d34-0410-b5e6-96231b3b80d8
r198196: Use a pointer to keep track of the skeleton unit for each normal unit and construct it up front.
r198199: Reapply r198196 with a fix to zero initialize the skeleton pointer.
r198202: Fix aranges and split dwarf by ensuring that the symbol and relocation back to the compile unit from the aranges section is to the skeleton unit and not the one in the dwo.
with a fix to use integer 0 for DW_AT_low_pc since the relocation to the text section symbol was causing issues with COFF. Accordingly, remove addLocalLabelAddress and its machinery since we're not currently using it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198222 91177308-0d34-0410-b5e6-96231b3b80d8
r198196: Use a pointer to keep track of the skeleton unit for each normal unit and construct it up front.
r198199: Reapply r198196 with a fix to zero initialize the skeleton pointer.
r198202: Fix aranges and split dwarf by ensuring that the symbol and relocation back to the compile unit from the aranges section is to the skeleton unit and not the one in the dwo.
The failures can be reproduced with an explicit target:
llvm/lib/MC/WinCOFFObjectWriter.cpp:224: bool {anonymous}::COFFSymbol::should_keep() const: Assertion `Section->Number != -1 && "Sections with relocations must be real!"' failed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198208 91177308-0d34-0410-b5e6-96231b3b80d8
back to the compile unit from the aranges section is to the skeleton
unit and not the one in the dwo.
Do this by adding a method to grab a forwarded-on local symbol and local
section, querying the skeleton if one exists and using that. Add
a few tests to verify the relocations are back to the correct section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198202 91177308-0d34-0410-b5e6-96231b3b80d8
and construct it up front. Add address ranges at the end and a helper
routine so that we're not needlessly using an indirection in the case
of split dwarf.
Update testcases according to the new ordering of attributes on
the compile unit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198196 91177308-0d34-0410-b5e6-96231b3b80d8
For the AArch64 backend, if DAGCombiner sees "sext(setcc)", it will
combine them together into a single setcc with an extended value type.
Then if it sees "zext(setcc)", it assumes the setcc is Vxi1 and tries to
create "(and (vsetcc), (1, 1, ...))". When the setcc isn't Vxi1,
DAGCombiner will create the wrong node and wrong code will be emitted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198190 91177308-0d34-0410-b5e6-96231b3b80d8
(unittests/ExecutionEngine/JIT/CMakeLists.txt is still missing for now, since
it handles export files in a strange way: It generates a .exports file from a
.def file instead of the other way round.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198183 91177308-0d34-0410-b5e6-96231b3b80d8
The DPR and SPR register lists are also register lists. Furthermore, the
registers need not be checked individually since the register type can be
checked via the list kind. Use that to simplify the logic and fix the incorrect
assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198174 91177308-0d34-0410-b5e6-96231b3b80d8
In order to provide compatibility with the GNU assembler, provide aliases for
pre-UAL mnemonics for floating point operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198172 91177308-0d34-0410-b5e6-96231b3b80d8
The vstm family of VFP instructions belongs to the VFP store itinerary class, not
the VFP load itinerary class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198170 91177308-0d34-0410-b5e6-96231b3b80d8
Directive parsers must return false if the target assembler is interested in
handling the directive. The Error member function always returns true. Using
the 'return Error()' pattern would incorrectly indicate to the general parser
that the target was not interested in the directive, when in reality it simply
encountered a badly formed directive or some other error. This corrects the
behaviour to ensure that the parser behaves appropriately.
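A minimal sketch of the corrected pattern (the directive and its body are hypothetical):

bool ARMAsmParser::parseDirectiveExample(SMLoc L) {
  if (getLexer().isNot(AsmToken::EndOfStatement)) {
    Error(L, "unexpected token in directive"); // Error() always returns true,
    return false; // but return false: the target did claim this directive
  }
  // ... actual handling of the directive ...
  return false;
}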
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198132 91177308-0d34-0410-b5e6-96231b3b80d8
Schedule more conservatively to account for stalls on floating point
resources and latency. Use the AGU resource to model latency stalls
since it's shared between FP and LD/ST instructions. This might not be
completely accurate but should work well in practice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198125 91177308-0d34-0410-b5e6-96231b3b80d8
Many vector operations never had itineraries. Since the new machine
model was a mapping from existing itinerary classes, we don't have a
model for these. We still want to migrate A9 even though no one has
invested in a complete model, so mark it incomplete to avoid the
scheduler asserting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198123 91177308-0d34-0410-b5e6-96231b3b80d8
PostGenericScheduler uses either the new machine model or the hazard
checker for top-down scheduling. Most of the infrastructure for PreRA
machine scheduling is reused.
With some tuning, this should allow MachineScheduler to be default
for all ARM targets, including cortex-A9, using the new machine
model. Likewise, with additional tuning, it should be able to replace
PostRAScheduler for all targets.
The PostMachineScheduler pass does not currently run the
AntiDepBreaker. There is less need for it on targets that are already
running preRA MachineScheduler. I want to prove it's necessary before
committing to the maintenance burden.
The PostMachineScheduler also currently removes kill flags and adds
them all back later. This is a bit ridiculous. I'd prefer passes to
directly use a liveness utility than rely on flags.
A test case that enables this scheduler will be included in a
subsequent checkin that updates the A9 model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198122 91177308-0d34-0410-b5e6-96231b3b80d8
Factor the MachineFunctionPass into MachineSchedulerBase.
Split the DAG class into ScheduleDAGMI and SchedulerDAGMILive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198119 91177308-0d34-0410-b5e6-96231b3b80d8
vector shift by immediate count (VSHLI/VSRLI/VSRAI) into a build_vector when
the vector in input to the shift is a build_vector of all constants or UNDEFs.
Target specific nodes for packed shifts by immediate count are in
general introduced by function 'getTargetVShiftByConstNode' (in
X86ISelLowering.cpp) when lowering shift operations, SSE/AVX immediate
shift intrinsics and (only in very few cases) SIGN_EXTEND_INREG dag
nodes.
This patch adds extra rules for simplifying vector shifts inside
function 'getTargetVShiftByConstNode'.
Added file test/CodeGen/X86/vec_shift5.ll to verify that packed
shifts by immediate are correctly folded into a build_vector when the
input vector to the shift dag node is a vector of constants or undefs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198113 91177308-0d34-0410-b5e6-96231b3b80d8
The GNU assembler supports .rep as an alias for .rept. This simply creates the
alias for it and introduces a test for both .rept and .rep.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198097 91177308-0d34-0410-b5e6-96231b3b80d8
widespread glibc bugs.
The glibc implementation of exp10 has a very serious precision bug in
version 2.15 (and older versions). This is still very widely used (the
current Ubuntu LTS for example uses it) and so it isn't reasonable to
make transforms that produce these functions. This fixes many
miscompiles introduced when we started transforming pow(10.0, ...) into
exp10, and it may have fixed other latent miscompiles where exp10
provided sufficient precision but exp10f did not.
This is all really horrible. The primary bug has been fixed for over
a year and glibc 2.18 works correctly for the test cases I have, but it
will be 2017 before the LTS using 2.15 is no longer supported by Ubuntu
(and thus reasonable for folks to be relying on). =[ We're either going
to need to live without these optimizations, or find a way to switch
behavior more dynamically than using simply the fact that the OS is
"Linux".
To make matters worse, there appears to be significant testing and
fixing of numerous other bugs in the exp10 family of functions right now
in glibc. While those haven't been causing problems I've seen in the
wild, it gives me concerns that we may need to wait until an even later
release of glibc before we can reliably transform code into exp10.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198093 91177308-0d34-0410-b5e6-96231b3b80d8
just calling into MAI and is only abstracting for a single interface that
we actually need to check in multiple places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198092 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR.
For example, given the following sequence of dag nodes:
i32 C = Constant<1>
v4i32 V = BUILD_VECTOR C, C, C, C
v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1
The SIGN_EXTEND_INREG node can be folded into a build_vector since
the vector in input is a BUILD_VECTOR of constants.
The optimized sequence is:
i32 C = Constant<-1>
v4i32 Result = BUILD_VECTOR C, C, C, C
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198084 91177308-0d34-0410-b5e6-96231b3b80d8
It's no longer necessary to lazily add members to the DICompositeType
member list. Instead any lazy members (special member functions and
member template instantiations) are added to the parent late based on
their context link, the same way that nested types have always been
handled (never being in the member list - just added to the parent DIE
lazily based on context).
Clang's been updated not to use this function anymore as it improves
type unit consistency by never emitting lazy members in type units.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198079 91177308-0d34-0410-b5e6-96231b3b80d8
much more clear to me. I meant to make this change before committing the
original patch, but forgot to merge it in. Sorry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198069 91177308-0d34-0410-b5e6-96231b3b80d8
This is an iterator which you can build around a MemoryBuffer. It will
iterate through the non-empty, non-comment lines of the buffer as
a forward iterator. It should be small and reasonably fast (although it
could be made much faster if anyone cares, I don't really...).
This will be used to more simply support the text-based sample
profile file format, and is largely based on the original patch by
Diego. I've re-worked the style of it and separated it from the work of
producing a MemoryBuffer from a file which both simplifies the interface
and makes it easier to test.
The style of the API follows the C++ standard naming conventions to fit
in better with iterators in general, much like the Path and FileSystem
interfaces follow standard-based naming conventions.
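A minimal usage sketch (the constructor arguments are an assumption about this version of the header, not a quoted signature):

#include "llvm/Support/LineIterator.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/raw_ostream.h"

// Print each non-empty, non-comment ('#'-prefixed) line with its number.
void dumpLines(const llvm::MemoryBuffer &Buffer) {
  for (llvm::line_iterator I(Buffer, '#'), E; I != E; ++I)
    llvm::errs() << I.line_number() << ": " << *I << "\n";
}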
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198068 91177308-0d34-0410-b5e6-96231b3b80d8
The .even directive aligns content to an even-numbered address. This is an ARM
specific directive applicable to any section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198031 91177308-0d34-0410-b5e6-96231b3b80d8
E.g. the codegen result is
fmls v1.2s, v0.2s, v2.s[3]
which is expected to be
fmls v0.2s, v1.2s, v2.s[3]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198001 91177308-0d34-0410-b5e6-96231b3b80d8
...namely LOAD AND ADD, LOAD AND AND, LOAD AND OR and LOAD AND EXCLUSIVE OR.
LOAD AND ADD LOGICAL isn't really separately useful for LLVM.
I'll look at reusing the CC results in the new year.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197985 91177308-0d34-0410-b5e6-96231b3b80d8
DAG.getVectorShuffle() doesn't always return a vector_shuffle node.
If the mask is the identity sequence for one of its operands (for example,
operand_0 is v8i8 and the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly
return that operand. So a check is added here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197967 91177308-0d34-0410-b5e6-96231b3b80d8
This failure was caused by an improper condition when lowering shuffle_vector
to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not
be generated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197966 91177308-0d34-0410-b5e6-96231b3b80d8
Check for single use of fmul node in fused multiply patterns
to allow generation of fused multiply add/sub instructions.
Otherwise the fmul operation ends up being repeated more than
once, which does not help performance on targets with
only one MAC unit, such as cortex-a53.
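For illustration only (not from the patch), the situation being avoided looks like:

float f(float a, float b, float c, float d) {
  float t = a * b; // fmul with more than one use
  float x = t + c; // would become an fmadd
  float y = t - d; // would become an fmsub, duplicating the multiply
  return x * y;
}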
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197929 91177308-0d34-0410-b5e6-96231b3b80d8
The correct pattern matching should be:
- fnmadd is (-Ra) + (-Rn)*Rm which should be matched as:
fma (fneg node:$Rn), node:$Rm, (fneg node:$Ra) and as
(f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm))))
- fnmsub is (-Ra) + Rn*Rm which should be matched as
fma node:$Rn, node:$Rm, (fneg node:$Ra) and as
(f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra))
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197928 91177308-0d34-0410-b5e6-96231b3b80d8
Split sadd.with.overflow into add + sadd.with.overflow to allow
analysis and optimization. This should ideally be done after
InstCombine, which can perform code motion (eventually indvars should
run after all canonical instcombines). We want ISEL to recombine the
add and the check, at least on x86.
This is currently under an option for reducing live induction
variables: -liv-reduce. The next step is reducing liveness of IVs that
are live out of the overflow check paths. Once the related
optimizations are fully developed, reviewed and tested, I do expect
this to become default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197926 91177308-0d34-0410-b5e6-96231b3b80d8
(optional) DWARF sections, so compiling with -g does not result in
different code being generated.
rdar://problem/15623193
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197922 91177308-0d34-0410-b5e6-96231b3b80d8
The bkpt mnemonic has an implicit immediate constant of 0 unless otherwise
specified. Add an instruction alias for the unvalued breakpoint mnemonic to
treat it as a 0. This improves compatibility with GNU AS.
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197913 91177308-0d34-0410-b5e6-96231b3b80d8
If the Scalarizer scalarized a vector PHI but could not scalarize
all uses of it, it would insert a series of insertelements to reconstruct
the vector PHI value from the scalar ones. The problem was that it would
emit these insertelements immediately after the PHI, even if there were
other PHIs after it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197909 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Before this change the instrumented code before Ret instructions looked like:
<Unpoison Frame Redzones>
if (Frame != OriginalFrame) // I.e. Frame is fake
  <Poison Complete Frame>
Now the instrumented code looks like:
if (Frame != OriginalFrame) // I.e. Frame is fake
  <Poison Complete Frame>
else
  <Unpoison Frame Redzones>
Reviewers: eugenis
Reviewed By: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2458
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197907 91177308-0d34-0410-b5e6-96231b3b80d8
Backends like OptParserEmitter assume that record names can be used as valid
identifiers.
The period '.' in generated anonymous names broke that assumption, causing a
build-time error and in practice forcing all records to be named.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197869 91177308-0d34-0410-b5e6-96231b3b80d8
If the extension of a loaded value is compared against zero and used in
other arithmetic, InstCombine will change the comparison to use the
unextended load. It's also possible that the comparison could be against
the unextended load from the outset.
In DAG form this becomes a truncation of an extending load. We want to
strip the truncation if possible so that we can use load-and-test instructions.
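As a hedged illustration (not taken from the patch), the kind of source pattern involved:

long f(const int *p, long acc) {
  long v = *p;  // extending load
  if (v == 0)   // InstCombine rewrites this to compare the unextended i32 load
    return acc;
  return acc + v; // the extended value is still needed here
}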
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197804 91177308-0d34-0410-b5e6-96231b3b80d8
The handling of ANY_EXTEND and ZERO_EXTEND was too strict. In this context
we can treat ZERO_EXTEND in much the same way as an AND and then also handle
outermost ZERO_EXTENDs.
I couldn't find a test that benefited from the ANY_EXTEND change, but it's
more obvious to write it this way once SIGN_EXTEND and ZERO_EXTEND are
handled differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197802 91177308-0d34-0410-b5e6-96231b3b80d8