methods are no longer needed now that LinearScan has gone away.
(Contains tweaks trivialSpillEverywhere to enable the removal of getNewVRegs).
llvm-svn: 151658
debug info for assembly files. We were already doing the right thing when
producing debug info for C/C++.
ELF linkers don't know dwarf, so they depend on these relocations to produce
valid dwarf output.
llvm-svn: 151655
To avoid problems with zero shifts when getting the bits that move between words
we use a trick: first shift the by amount-1, then do another shift by one. When
amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32.
Also fix a latent bug that emitted the low and high words in the wrong order
when shifting right.
Fixes PR12113.
llvm-svn: 151637
When the GEP index is a vector of pointers, the code that calculated the size
of the element started from the vector type, and not the contained pointer type.
As a result, instead of looking at the data element pointed by the vector, this
code used the size of the vector. This works for 32bit members (on 32bit
systems), but not for other types. Added code to peel the vector type and
added a test.
llvm-svn: 151626
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.
Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.
rdar://8979299
llvm-svn: 151623
When an outgoing call takes more than 2k of arguments on the stack, we
don't allocate that call frame in the prolog, but adjust the stack
pointer immediately before the call instead.
This causes problems with the emergency spill slot because PEI can't
track stack pointer adjustments on the second pass, and if the outgoing
arguments are too big, SP can't be used to reach the emergency spill
slot at all.
Work around these problems by ensuring there is a base or frame pointer
that can be used to access the emergency spill slot.
<rdar://problem/10917166>
llvm-svn: 151604
We on the linker to resolve calls to the appropriate BL/BLX instruction
to make interworking function correctly. It uses the symbol in the
relocation to do that, so we need to be careful about being too clever.
To enable this for ARM mode, split the BL/BLX fixup kind off from the
unconditional-branch fixups.
rdar://10927209
llvm-svn: 151571
After the SlotIndex slot names were updated, it is possible to apply
stricter checks to live intervals.
Also treat bundles as bags of operands when checking live intervals.
llvm-svn: 151531
value numbers to be assigned when calculating any particular value number.
Enhance the logic that detects new value numbers to take this into account,
for a tiny compile time speedup. Fix a comment typo while there.
llvm-svn: 151522
%cmp (eg: A==B) we already replace %cmp with "true" under the true edge, and
with "false" under the false edge. This change enhances this to replace the
negated compare (A!=B) with "false" under the true edge and "true" under the
false edge. Reported to improve perlbench results by 1%.
llvm-svn: 151517
verifier does. This correctly handles invoke.
Thanks to Duncan, Andrew and Chris for the comments.
Thanks to Joerg for the early testing.
llvm-svn: 151469
Reverting this because it breaks static linking on ppc64. Specifically, it may be linkonce_odr functions that are the problem.
With this patch, if you link statically, calls to some functions end up calling their descriptor addresses instead
of calling to their entry points. This causes the execution to fail with SIGILL (b/c the descriptor address just
has some pointers, not code).
llvm-svn: 151433
[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.
llvm-svn: 151432
are optimization hints, but at -O0 we're not optimizing. This becomes a problem
when the alwaysinline attribute is abused.
rdar://10921594
llvm-svn: 151429
uses of the vreg, since the old kills may no longer be valid. This was causing
-verify-machineinstrs to complain about uses after kills, and could potentially
have been causing subtle register allocation issues, but I haven't come across a
test case yet.
llvm-svn: 151425
reserving a physical register ($gp or $28) for that purpose.
This will completely eliminate loads that restore the value of $gp after every
function call, if the register allocator assigns a callee-saved register, or
eliminate unnecessary loads if it assigns a temporary register.
example:
.cpload $25 // set $gp.
...
.cprestore 16 // store $gp to stack slot 16($sp).
...
jalr $25 // function call. clobbers $gp.
lw $gp, 16($sp) // not emitted if callee-saved reg is chosen.
...
lw $2, 4($gp)
...
jalr $25 // function call.
lw $gp, 16($sp) // not emitted if $gp is not live after this instruction.
...
llvm-svn: 151402
Add support for a missed case when the symbols in a difference
expression are in the same section but not the same fragment.
rdar://10924681
llvm-svn: 151345
variable declaration as an argument because we want that address
anyhow for our debug information.
This seems to fix rdar://9965111, at least we have more debug
information than before and from reading the assembly it appears
to be the correct location.
llvm-svn: 151335
I'll let the buildbots determine the compile time improvements from this
change, but 464.h264ref has 5% faster codegen at -O2.
This patch does cause some assembly changes. Branch folding can make
different decisions about calls with dead return values.
CriticalAntiDepBreaker may choose different registers because its
liveness tracking is affected. MachineCopyPropagation may sometimes
leave a dead copy behind.
llvm-svn: 151331
The tied source operand of tMUL is the second source operand, not the
first like every other two-address thumb instruction. Special case it
in the size reduction pass to make sure we create the tMUL instruction
properly.
llvm-svn: 151315
In historical reason, Interpreter's external entries had prefix "lle_X_" as C linkage, even for well-known entries in EE/Interpreter.
Now, at least on ToT, they are resolved via FuncNames[] mapper.
We will not need their symbols are expected to be exported any more.
Clang r150128 has introduced the warning <"%0 has C-linkage specified, but returns user-defined type %1 which is incompatible with C">.
llvm-svn: 151312
Assuming that a single std::set node adds 3 control words, a bitvector
can store (3*8+4)*8=224 registers in the allocated memory of a single
element in the std::set (x86_64). Also we don't have to call malloc
for every register added.
llvm-svn: 151269
rdar://10873652
As part of this I updated the llvm-mc disassembler C API to always call the
SymbolLookUp call back even if there is no getOpInfo call back. If there is a
getOpInfo call back that is tried first and then if that gets no information
then the SymbolLookUp is called. I also made the code more robust by
memset(3)'ing to zero the LLVMOpInfo1 struct before then setting
SymbolicOp.Value before for the call to getOpInfo. And also don't use any
values from the LLVMOpInfo1 struct if getOpInfo returns 0. And also don't
use any of the ReferenceType or ReferenceName values from SymbolLookUp if it
returns NULL. rdar://10873563 and rdar://10873683
For the X86 target also fixed bugs so the annotations get printed.
Also fixed a few places in the ARM target that was not producing symbolic
operands for some instructions. rdar://10878166
llvm-svn: 151267
Before register allocation, instructions can be moved across calls in
order to reduce register pressure. After register allocation, we don't
gain a lot by moving callee-saved defs across calls. In fact, since the
scheduler doesn't have a good idea how registers are used in the callee,
it can't really make good scheduling decisions.
This changes the schedule in two ways: 1. Latencies to call uses and
defs are no longer accounted for, causing some random shuffling around
calls. This isn't really a problem since those uses and defs are
inaccurate proxies for what happens inside the callee. They don't
represent registers used by the call instruction itself.
2. Instructions are no longer moved across calls. This didn't happen
very often, and the scheduling decision was made on dubious information
anyway.
As with any scheduling change, benchmark numbers shift around a bit,
but there is no positive or negative trend from this change.
This makes the post-ra scheduler 5% faster for ARM targets.
The secret motivation for this patch is the introduction of register
mask operands representing call clobbers. The most efficient way of
handling regmasks in ScheduleDAGInstrs is to model them as barriers for
physreg live ranges, but not for virtreg live ranges. That's fine
pre-ra, but post-ra it would have the same effect as this patch.
llvm-svn: 151265
it with memcpy. This also fixes a problem on big-endian hosts, where
addUnaligned would return different results depending on the alignment
of the data.
llvm-svn: 151247
value is zero. Instead of a cmov + op, issue an conditional op instead. e.g.
cmp r9, r4
mov r4, #0
moveq r4, #1
orr lr, lr, r4
should be:
cmp r9, r4
orreq lr, lr, #1
That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend
this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y).
It's possible to extend this to ADD and SUB but I don't think they are common.
rdar://8659097
llvm-svn: 151224
The standard function epilog includes a .size directive, but ppc64 uses
an alternate local symbol to tag the actual start of each function.
Until recently, binutils accepted the .size directive as:
.size test1, .Ltmp0-test1
however, using this directive with recent binutils will result in the error:
.size expression for XXX does not evaluate to a constant
so we must use the label which actually tags the start of the function.
llvm-svn: 151200
Add some data structures to represent for loops. These will be
referenced during object processing to do any needed iteration and
instantiation.
Add foreach keyword support to the lexer.
Add a mode to indicate that we're parsing a foreach loop. This allows
the value parser to early-out when processing the foreach value list.
Add a routine to parse foreach iteration declarations. This is
separate from ParseDeclaration because the type of the named value
(the iterator) doesn't match the type of the initializer value (the
value list). It also needs to add two values to the foreach record:
the iterator and the value list.
Add parsing support for foreach.
Add the code to process foreach loops and create defs based
on iterator values.
Allow foreach loops to be matched at the top level.
When parsing an IDValue check if it is a foreach loop iterator for one
of the active loops. If so, return a VarInit for it.
Add Emacs keyword support for foreach.
Add VIM keyword support for foreach.
Add tests to check foreach operation.
Add TableGen documentation for foreach.
Support foreach with multiple objects.
Support non-braced foreach body with one object.
Do not require types for the foreach declaration. Assume the iterator
type from the iteration list element type.
llvm-svn: 151164
chip in r139383, and the PSP components of the triple are really
annoying to parse. Let's leave this chapter behind. There is no reason
to expect LLVM to see a PSP-related triple these days, and so no
reasonable motivation to support them.
It might be reasonable to prune a few of the older MIPS triple forms in
general, but as those at least cause no burden on parsing (they aren't
both a chip and an OS!), I'm happy to leave them in for now.
llvm-svn: 151156
Affect on SD scheduling and postRA scheduling:
Printing the DAG will display the nodes in top-down topological order.
This matches the order within the MBB and makes my life much easier in general.
Affect on misched:
We don't need to track virtual register uses at all. This is awesome.
I also intend to rely on the SUnit ID as a topo-sort index. So if A < B then we cannot have an edge B -> A.
llvm-svn: 151135
This makes RAFast 4% faster, and it gets rid of the dodgy DenseMap
iteration.
This also revealed that RAFast would sometimes dereference DenseMap
iterators after erasing other elements from the map. That does seem to
work in the current DenseMap implementation, but SparseSet doesn't allow
it.
llvm-svn: 151111
bundles. This method takes a bundle start and an MI being bundled, and makes
the intervals for the MI's operands appear to start/end on the bundle start.
Also fixes some minor cosmetic issues (whitespace, naming convention) in the
HMEditor code.
llvm-svn: 151099
they'll be simple enough to simulate, and to reduce the chance we'll encounter
equal but different simple pointer constants.
This removes the symptoms from PR11352 but is not a full fix. A proper fix would
either require a guarantee that two constant objects we simulate are folded
when equal, or a different way of handling equal pointers (ie., trying a
constantexpr icmp on them to see whether we know they're equal or non-equal or
unsure).
llvm-svn: 151093
This transformation is not safe in some pathological cases (signed icmp of pointers should be an
extremely rare thing, but it's valid IR!). Add an explanatory comment.
Kudos to Duncan for pointing out this edge case (and not giving up explaining it until I finally got it).
llvm-svn: 151055
They're private static methods but we can just make them static
functions in the implementation. It makes the implementations a touch
more wordy, but takes another chunk out of the header file.
Also, take the opportunity to switch the names to the new coding
conventions.
No functionality changed here.
llvm-svn: 151047
Passes after RegAlloc should be able to rely on MRI->getNumVirtRegs() == 0.
This makes sharing code for pre/postRA passes more robust.
Now, to check if a pass is running before the RA pipeline begins, use MRI->isSSA().
To check if a pass is running after the RA pipeline ends, use !MRI->getNumVirtRegs().
PEI resets virtual regs when it's done scavenging.
PTX will either have to provide its own PEI pass or assign physregs.
llvm-svn: 151032
construction. Simplify its interface, implementation, and users
accordingly as there is no longer an 'uninitialized' state to check for.
Also, fixes a bug lurking in the interface as there was one method that
didn't correctly check for initialization.
llvm-svn: 151024
know where users will be added. Because of this, it cannot use
Builder.GetInsertPoint at all.
This patch
* removes the FIXME about adding the assert.
* adds a comment explaining hy we don't have one.
* removes a broken logic that only works for some callers and is not needed
since r150884.
* adds an assert to caller that would have caught the bug fixed by r150884.
llvm-svn: 151015
ecx = mov eax
al = mov ch
The second copy is not a nop because the sub-indices of ecx,ch is not the
same of that of eax/al.
Re-enabled machine-cp.
PR11940
llvm-svn: 151002
- Ignore pointer casts.
- Also expand GEPs that aren't constantexprs when they have one use or only constant indices.
- We now compile "&foo[i] - &foo[j]" into "i - j".
llvm-svn: 150961