basic-block segments bottom-up instead of top down. This
is the first step in a general restructuring of the way
register liveness is tracked in the post-RA scheduler.
llvm-svn: 63643
is given, override the subtarget settings and enable 64-bit support.
This restores the earlier behavior, and fixes regressions on
Non-64-bit-capable x86-32 hosts.
This isn't necessarily the best approach, but the most obvious
alternative is to require -mcpu=x86-64 or -mattr=+64bit to be used
with -march=x86-64 when the host doesn't have 64-bit support. This
makes things little more consistent, but it's less convenient, and
it has the practical drawback of requiring lots of test changes, so
I opted for the above approach for now.
llvm-svn: 63642
accessed at least once as a vector. This prevents it from
compiling the example in not-a-vector into:
define double @test(double %A, double %B) {
%tmp4 = insertelement <7 x double> undef, double %A, i32 0
%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
%tmp2 = extractelement <7 x double> %tmp, i32 4
ret double %tmp2
}
instead, producing the integer code. Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.
llvm-svn: 63638
in any old order. Since analyzing a node analyzes its
operands also, this can mean that when we pop a node
off the list of nodes to be analyzed, it may already
have been analyzed.
llvm-svn: 63632
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type
itself. This allows us to un-xfail
test/CodeGen/X86/vec_ins_extract.ll.
llvm-svn: 63590
they are useful to analyses other than BasicAliasAnalysis.cpp. Include
the full comment for isIdentifiedObject in the header file. Thanks to
Chris for suggeseting this.
llvm-svn: 63589
information. This eliminates the need for the Flags field in MemSDNode,
so this makes LoadSDNode and StoreSDNode smaller. Also, it makes
FoldingSetNodeIDs for loads and stores two AddIntegers smaller.
llvm-svn: 63577
SSE2, however it's possible to disable SSE2, and the subtarget support
code thinks that if 64-bit implies SSE2 and SSE2 is disabled then
64-bit should also be disabled. Instead, just mark all the 64-bit
subtargets as explicitly supporting SSE2.
Also, move the code that makes -march=x86-64 enable 64-bit support by
default to only apply when there is no explicit subtarget. If you
need to specify a subtarget and you want 64-bit code, you'll need to
select a subtarget that supports 64-bit code.
llvm-svn: 63575
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.
llvm-svn: 63494
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.
llvm-svn: 63487
improvements to the EvaluateInDifferentType code. This code works
by just inserted a bunch of new code and then seeing if it is
useful. Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point. Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.
llvm-svn: 63483
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization. Thanks to Dan for writing the
original patch (which I shamelessly pillaged).
llvm-svn: 63482
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it. This allows
it to simplify the code in multi-use-or.ll into a single 'add
double'.
This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch. When working on fixing those bugs,
this should be disabled.
llvm-svn: 63481
Now, if it detects that "V" is the same as some other value,
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
uses and can be simplified in one context, but not all.
#2 isn't implemented yet, this patch should have no functionality change.
llvm-svn: 63479
be able to handle *ANY* alloca that is poked by loads and stores of
bitcasts and GEPs with constant offsets. Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc. This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.
One case that is pretty great is that we compile
2006-11-07-InvalidArrayPromote.ll into:
define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
%tmp105 = bitcast <4 x i32> %tmp10 to i128
%tmp1056 = zext i128 %tmp105 to i256
%tmp.upgrd.43 = lshr i256 %tmp1056, 96
%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32
ret i32 %tmp.upgrd.44
}
which turns into:
_func:
subl $28, %esp
cvttps2dq %xmm1, %xmm0
movaps %xmm0, (%esp)
movl 12(%esp), %eax
addl $28, %esp
ret
Which is pretty good code all things considering :).
One effect of this is that SROA will start generating arbitrary bitwidth
integers that are a multiple of 8 bits. In the case above, we got a
256 bit integer, but the codegen guys assure me that it can handle the
simple and/or/shift/zext stuff that we're doing on these operations.
This addresses rdar://6532315
llvm-svn: 63469
information output. However, many target specific tool chains prefer to encode
only one compile unit in an object file. In this situation, the LLVM code
generator will include debugging information entities in the compile unit
that is marked as main compile unit. The code generator accepts maximum one main
compile unit per module. If a module does not contain any main compile unit
then the code generator will emit multiple compile units in the output object
file.
[Part 1]
Update DebugInfo APIs to accept optional boolean value while creating DICompileUnit to mark the unit as "main" unit. By defaults all units are considered non-main. Update SourceLevelDebugging.html to document "main" compile unit.
Update DebugInfo APIs to not accept and encode separate source file/directory entries while creating various llvm.dbg.* entities. There was a recent, yet to be documented, change to include this additional information so no documentation changes are required here.
Update DwarfDebug to handle "main" compile unit. If "main" compile unit is seen then all DIEs are inserted into "main" compile unit. All other compile units are used to find source location for llvm.dbg.* values. If there is not any "main" compile unit then create unique compile unit DIEs for each llvm.dbg.compile_unit.
[Part 2]
Create separate llvm.dbg.compile_unit for each input file. Mark compile unit create for main_input_filename as "main" compile unit. Use appropriate compile unit, based on source location information collected from the tree node, while creating llvm.dbg.* values using DebugInfo APIs.
---
This is Part 1.
llvm-svn: 63400
the LowerPartSet(). It didn't handle the situation correctly when
the low, high argument values are in reverse order (low > high)
with 'Val' type i32 (a corner case).
llvm-svn: 63388
If a MachineInstr doesn't have a memoperand but has an opcode that
is known to load or store, assume its memory reference may alias
*anything*, including stack slots which the compiler completely
controls.
To partially compensate for this, teach the ScheduleDAG building
code to do basic getUnderlyingValue analysis. This greatly
reduces the number of instructions that require restrictive
dependencies. This code will need to be revisited when we start
doing real alias analysis, but it should suffice for now.
llvm-svn: 63370
- Modify TableGen to add the DebugLoc when calling getTargetNode.
(The light-weight wrappers are only temporary. The non-DebugLoc version will be
removed once the whole debug info stuff is finished with.)
llvm-svn: 63273
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.
llvm-svn: 63266