Now that there can be multiple hint registers from targets, it doesn't
make sense to have a function that returns 'the' preferred register.
llvm-svn: 169190
Targets can provide multiple hints now, so getRegAllocPref() doesn't
make sense any longer because it only returns one preferred register.
Replace it with getSimpleHint() in the remaining heuristics. This
function only
llvm-svn: 169188
This change tries to simplify E1 = "X >> C1 << C2" into:
- E2 = "X << (C2 - C1)" if C2 > C1, or
- E2 = "X >> (C1 - C2)" if C1 > C2, or
- E2 = X if C1 == C2.
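The message as given does not say when the rewrite is legal. A minimal sketch, assuming 32-bit unsigned values and logical shifts, shows that E1 and E2 agree on every bit except the low C2 bits, which E1 always clears. The helper names (shiftPair, singleShift, agreeMask) are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Original expression E1 = (X >> C1) << C2, using logical (unsigned) shifts.
uint32_t shiftPair(uint32_t X, unsigned C1, unsigned C2) {
  assert(C1 < 32 && C2 < 32 && "shift amounts must be in range");
  return (X >> C1) << C2;
}

// Candidate replacement E2 from the list above: a single shift.
uint32_t singleShift(uint32_t X, unsigned C1, unsigned C2) {
  assert(C1 < 32 && C2 < 32 && "shift amounts must be in range");
  if (C2 > C1)
    return X << (C2 - C1);
  if (C1 > C2)
    return X >> (C1 - C2);
  return X;
}

// Mask of the bits on which E1 and E2 agree: everything except the low
// C2 bits, which E1 always clears.
uint32_t agreeMask(unsigned C2) {
  assert(C2 < 32 && "shift amount must be in range");
  return ~uint32_t(0) << C2;
}
```

Under these assumptions, shiftPair(X, C1, C2) equals singleShift(X, C1, C2) & agreeMask(C2), so the single shift looks like a safe substitute only where the low C2 bits of the result are not needed.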
Reviewed by Nadav. Thanks!
llvm-svn: 169182
Virtual registers with a known preferred register are prioritized by
RAGreedy. This function makes the condition explicit without depending
on getRegAllocPref().
llvm-svn: 169179
This small change adds support for that. It will make all MCJIT tests pass
in make-check on big-endian platforms.
Patch by Petar Jovanovic.
llvm-svn: 169178
This change adds endian-awareness to MipsJITInfo, and emitWordLE in
MipsCodeEmitter has become emitWord so that it supports both endiannesses.
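For illustration only, an endian-aware word emitter can be sketched as below. The signature and the std::vector byte buffer are assumptions made for this example; this is not the MipsCodeEmitter interface.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a code emitter's byte buffer. Appends the four
// bytes of Word in the requested byte order.
void emitWord(std::vector<uint8_t> &Out, uint32_t Word, bool IsLittleEndian) {
  for (unsigned I = 0; I != 4; ++I) {
    // Little-endian emits the least significant byte first; big-endian
    // emits the most significant byte first.
    unsigned Shift = IsLittleEndian ? I * 8 : (3 - I) * 8;
    Out.push_back(static_cast<uint8_t>(Word >> Shift));
  }
}
```

Emitting 0x11223344 this way gives the bytes 44 33 22 11 for little-endian and 11 22 33 44 for big-endian.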
Patch by Petar Jovanovic.
llvm-svn: 169177
This provides the same functionality as getRawAllocationOrder() for the
even/odd hints, but without the many constant register arrays.
llvm-svn: 169169
"Windows.h" includes <Windows.h> which defines a bunch of stuff it shouldn't
(even with all the restriction macros). We have no control over this file, so
make its scope as small as possible.
llvm-svn: 169165
The TargetRegisterInfo::getRegAllocationHints() function is going to
replace the existing mechanisms for providing target-dependent hints to
the register allocator: ResolveRegAllocHint() and
getRawAllocationOrder().
The new hook is more flexible because it allows the target to provide
multiple preferred candidate registers for each virtual register, and it
is easier to use because targets are not required to return a reference
to a constant array like getRawAllocationOrder().
An optional VirtRegMap argument can be used to provide target-dependent
hints that depend on the provisional assignments of other virtual
registers.
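As a rough illustration of what multiple hints could mean for an allocator, the toy model below places the hinted registers ahead of the normal allocation order. Registers are plain integers here and orderWithHints is a hypothetical name; this is not the TargetRegisterInfo interface, and the message does not say how the allocator actually consumes the hints.

```cpp
#include <algorithm>
#include <vector>

// Toy model only: returns the allocation order with the hinted registers
// moved to the front.
std::vector<unsigned> orderWithHints(const std::vector<unsigned> &Order,
                                     const std::vector<unsigned> &Hints) {
  std::vector<unsigned> Result;
  // Hints come first, in the order given, skipping duplicates and any
  // register that is not in the allocation order at all.
  for (unsigned Reg : Hints) {
    bool Allocatable =
        std::find(Order.begin(), Order.end(), Reg) != Order.end();
    bool Seen = std::find(Result.begin(), Result.end(), Reg) != Result.end();
    if (Allocatable && !Seen)
      Result.push_back(Reg);
  }
  // Then the remaining registers in their normal order.
  for (unsigned Reg : Order)
    if (std::find(Result.begin(), Result.end(), Reg) == Result.end())
      Result.push_back(Reg);
  return Result;
}
```

In this model no constant array is needed per hint combination, since the order is built on demand from whatever hints are supplied.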
llvm-svn: 169154
which is the legality of the if-conversion transformation. The next step is to
implement the cost-model for the if-converted code as well as the
vectorization itself.
llvm-svn: 169152
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
The partitioning logic attempted to handle uses of an alloca with an
offset starting before the alloca so long as the use had some overlap
with the alloca itself. However, there was a bug where we tested
'(uint64_t)Offset >= AllocSize' without first checking whether 'Offset'
was positive. As a consequence, essentially every negative offset (that
is, starting *before* the alloca does) would be thrown out, even if it
was overlapping. The subsequent code to throw out negative offsets which
were actually non-overlapping was essentially dead. The code to *handle*
overlapping negative offsets was actually dead!
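The effect of the cast can be seen in a small standalone sketch. The function names are hypothetical and this is not SROA's code. It contrasts the quoted comparison with a check that tests the sign first.

```cpp
#include <cstdint>

// Shape of the buggy test quoted above: the cast turns a negative offset
// into a value of at least 2^63, so the comparison is true for any
// realistic AllocSize.
bool rejectedByUnsignedTest(int64_t Offset, uint64_t AllocSize) {
  return (uint64_t)Offset >= AllocSize;
}

// Hypothetical overlap check: does a use of Size bytes starting at Offset
// touch any byte in [0, AllocSize)?
bool overlapsAlloca(int64_t Offset, uint64_t Size, uint64_t AllocSize) {
  if (Offset >= 0)
    return (uint64_t)Offset < AllocSize && Size != 0;
  // Negative start: the use overlaps only if it extends past byte 0.
  // Unsigned negation gives |Offset| without signed overflow.
  uint64_t Before = (uint64_t)0 - (uint64_t)Offset;
  return Size > Before && AllocSize != 0;
}
```

With Offset = -4, Size = 8 and AllocSize = 16, the use overlaps bytes 0 through 3 of the alloca, yet the unsigned test alone already rejects it.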
I've just removed all of this, and taught SROA to discard any uses which
start prior to the alloca from the beginning. It has the lovely property
of simplifying the code. =] All the tests still pass, and in fact no new
tests are needed as this is already covered by our testsuite. Fixing the
code so that negative offsets work the way the comments indicate they
were supposed to work causes regressions. That's how I found this.
Anyways, this is all progress in the correct direction -- tightening up
SROA to be maximally aggressive. Some day, I really hope to turn
out-of-bounds accesses to an alloca into 'unreachable'.
llvm-svn: 169120
Also check in a test case that reproduces the issue, on which 'opt -globalopt' consumes 1.6GB of memory.
The large memory footprint arises because GlobalOpt currently hoists and stores the leaf element constants into the global array one at a time. In each iteration it recreates the global array initializer constant and leaves the old initializer alone, which may leave many obsolete constants behind.
For example, suppose we have the global array @rom = global [16 x i32] zeroinitializer
After the first element value is hoisted and installed: @rom = global [16 x i32] [ 1, 0, 0, ... ]
After the second element value is installed: @rom = global [16 x i32] [ 1, 2, 0, 0, ... ] // here the previous initializer is obsolete
...
When the transform is done, 15 obsolete initializers are left behind, unused.
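The count can be reproduced with a toy model in which every distinct initializer ever created stays alive in a pool. Pool and countObsoleteAfterInstall are hypothetical names; this only mimics the behaviour described above and is not LLVM's constant uniquing.

```cpp
#include <array>
#include <cstddef>
#include <set>

using Init = std::array<int, 16>;

// Toy model: once created, an initializer is never freed, like an immutable
// uniqued constant that nothing deletes.
struct Pool {
  std::set<Init> Constants;
  const Init &get(const Init &I) { return *Constants.insert(I).first; }
};

// Installs one element at a time, creating a fresh initializer per step.
// Returns how many of the initializers created here are obsolete at the end.
std::size_t countObsoleteAfterInstall(Pool &P) {
  Init Cur{}; // models zeroinitializer
  P.get(Cur);
  std::size_t Before = P.Constants.size();
  for (std::size_t I = 0; I != Cur.size(); ++I) {
    Cur[I] = static_cast<int>(I) + 1; // install element I
    P.get(Cur);                       // a new whole-array initializer
  }
  std::size_t Created = P.Constants.size() - Before;
  return Created - 1; // all but the final initializer are dead
}
```

For the 16-element array this creates 16 initializers, of which only the last is still referenced. Each one is a full copy of the array, so the total storage held by dead initializers grows quadratically with the array length in this model.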
llvm-svn: 169079
- Each macro instantiation introduces a new buffer, and FindBufferForLoc() is
linear, so previously macro instantiation could be N^2 for some pathological
inputs.
llvm-svn: 169073
The TwoAddressInstructionPass takes the machine code out of SSA form by
expanding REG_SEQUENCE instructions into copies. It is no longer
necessary to rewrite the registers used by a REG_SEQUENCE instruction
because the new coalescer algorithm can do it now.
REG_SEQUENCE is just converted to a sequence of sub-register copies now.
llvm-svn: 169067