Previously, generation of stack protectors was done exclusively in the
pre-SelectionDAG Codegen LLVM IR Pass "Stack Protector". This necessitated
splitting basic blocks at the IR level to create the success/failure basic
blocks in the tail of the basic block in question. As a result of this,
calls that would have qualified for the sibling call optimization were no
longer eligible for optimization since said calls were no longer right in
the "tail position" (i.e. the immediate predecessor of a ReturnInst
instruction).
Then it was noticed that since the sibling call optimization causes the
callee to reuse the caller's stack, if we could delay the generation of
the stack protector check until later in CodeGen after the sibling call
decision was made, we get both the tail call optimization and the stack
protector check!
A few goals in solving this problem were:
1. Preserve the architecture independence of stack protector generation.
2. Preserve the normal IR level stack protector check for platforms like
OpenBSD for which we support platform specific stack protector
generation.
The main problem that guided the present solution is that one can not
solve this problem in an architecture independent manner at the IR level
only. This is because:
1. The decision on whether or not to perform a sibling call on certain
platforms (for instance i386) requires lower level information
related to available registers that can not be known at the IR level.
2. Even if the previous point were not true, the decision on whether to
perform a tail call is done in LowerCallTo in SelectionDAG which
occurs after the Stack Protector Pass. As a result, one would need to
put the relevant callinst into the stack protector check success
basic block (where the return inst is placed) and then move it back
later at SelectionDAG/MI time before the stack protector check if the
tail call optimization failed. The MI level option was nixed
immediately since it would require platform specific pattern
matching. The SelectionDAG level option was nixed because
SelectionDAG only processes one IR level basic block at a time
implying one could not create a DAG Combine to move the callinst.
To get around this problem a few things were realized:
1. While one can not handle multiple IR level basic blocks at the
SelectionDAG Level, one can generate multiple machine basic blocks
for one IR level basic block. This is how we handle bit tests and
switches.
2. At the MI level, tail calls are represented via a special return
MIInst called "tcreturn". Thus if we know the basic block in which we
wish to insert the stack protector check, we get the correct behavior
by always inserting the stack protector check right before the return
statement. This is a "magical transformation" since no matter where
the stack protector check intrinsic is, we always insert the stack
protector check code at the end of the BB.
Given the aforementioned constraints, the following solution was devised:
1. On platforms that do not support SelectionDAG stack protector check
generation, allow for the normal IR level stack protector check
generation to continue.
2. On platforms that do support SelectionDAG stack protector check
generation:
a. Use the IR level stack protector pass to decide if a stack
protector is required/which BB we insert the stack protector check
in by reusing the logic already therein. If we wish to generate a
stack protector check in a basic block, we place a special IR
intrinsic called llvm.stackprotectorcheck right before the BB's
returninst or if there is a callinst that could potentially be
sibling call optimized, before the call inst.
b. Then when a BB with said intrinsic is processed, we codegen the BB
normally via SelectBasicBlock. In said process, when we visit the
stack protector check, we do not actually emit anything into the
BB. Instead, we just initialize the stack protector descriptor
class (which involves stashing information/creating the success
mbbb and the failure mbb if we have not created one for this
function yet) and export the guard variable that we are going to
compare.
c. After we finish selecting the basic block, in FinishBasicBlock if
the StackProtectorDescriptor attached to the SelectionDAGBuilder is
initialized, we first find a splice point in the parent basic block
before the terminator and then splice the terminator of said basic
block into the success basic block. Then we code-gen a new tail for
the parent basic block consisting of the two loads, the comparison,
and finally two branches to the success/failure basic blocks. We
conclude by code-gening the failure basic block if we have not
code-gened it already (all stack protector checks we generate in
the same function, use the same failure basic block).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188755 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a llvm.copysign intrinsic; We already have Libfunc recognition for
copysign (which is turned into the FCOPYSIGN SDAG node). In order to
autovectorize calls to copysign in the loop vectorizer, we need a corresponding
intrinsic as well.
In addition to the expected changes to the language reference, the loop
vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into
an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a
few lists in LegalizeVector{Ops,Types} so that vector copysigns can be
expanded.
In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN
be Expand for vector types. This seems correct for all in-tree targets, and I
think is the right thing to do because, previously, there was no way to generate
vector-values FCOPYSIGN nodes (and most targets don't specify an action for
vector-typed FCOPYSIGN).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188728 91177308-0d34-0410-b5e6-96231b3b80d8
Until gdb supports the new accelerator tables we should add the
pubnames section so that gdb_index can be generated from gold
at link time. On darwin we already emit the accelerator tables
and so don't need to worry about pubnames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188708 91177308-0d34-0410-b5e6-96231b3b80d8
- split WidenVecRes_Binary into WidenVecRes_Binary and WidenVecRes_BinaryCanTrap
- WidenVecRes_BinaryCanTrap preserves the original behaviour for operations
that can trap
- WidenVecRes_Binary simply widens the operation and improves codegen for
3-element vectors by allowing widening and promotion on x86 (matches the
behaviour of unary and ternary operation widening)
- use WidenVecRes_Binary for operations on integers.
Reviewed by: nrotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188699 91177308-0d34-0410-b5e6-96231b3b80d8
We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node
because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the
sum of two doubles, and the first double must have the larger magnitude, we
can take the sign from the first double. As a result, in addition to fixing the
crash, this is also an optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188655 91177308-0d34-0410-b5e6-96231b3b80d8
We check this in many/all other cases, just missed this one it seems.
Perhaps it'd be worth unifying this so we never emit zero-length
DW_AT_names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188649 91177308-0d34-0410-b5e6-96231b3b80d8
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.
rdar://12594152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188593 91177308-0d34-0410-b5e6-96231b3b80d8
Generalize r188163 to cope with return types other than MVT::i32, just
as the existing visitMemCmpCall code did. I've split this out into a
subroutine so that it can be used for other upcoming patches.
I also noticed that I'd used the wrong API to record the out chain.
It's a load that uses DAG.getRoot() rather than getRoot(), so the out
chain should go on PendingLoads. I don't have a testcase for that because
we don't do any interesting scheduling on z yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188540 91177308-0d34-0410-b5e6-96231b3b80d8
When new virtual registers are created during splitting/spilling, defer
creation of the live interval until we need to use the live interval.
Along with the recent commits to notify LiveRangeEdit when new virtual
registers are created, this makes it possible for functions like
TargetInstrInfo::loadRegFromStackSlot() and
TargetInstrInfo::storeRegToStackSlot() to create multiple virtual
registers as part of the process of generating loads/stores for
different register classes, and then have the live intervals for those
new registers computed when they are needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188437 91177308-0d34-0410-b5e6-96231b3b80d8
Add a delegate class to MachineRegisterInfo with a single virtual
function, MRI_NoteNewVirtualRegister(). Update LiveRangeEdit to inherit
from this delegate class and override the definition of the callback
with an implementation that tracks the newly created virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188435 91177308-0d34-0410-b5e6-96231b3b80d8
Track new virtual registers by register number, rather than by the live
interval created for them. This is the first step in separating the
creation of new virtual registers and new live intervals. Eventually
live intervals will be created and populated on demand after the virtual
registers have been created and used in instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188434 91177308-0d34-0410-b5e6-96231b3b80d8
A common idiom is to use zero and all-ones as sentinal values and to
check for both in a single conditional ("x != 0 && x != (unsigned)-1").
That generates code, for i32, like:
testl %edi, %edi
setne %al
cmpl $-1, %edi
setne %cl
andb %al, %cl
With this transform, we generate the simpler:
incl %edi
cmpl $1, %edi
seta %al
Similar improvements for other integer sizes and on other platforms. In
general, combining the two setcc instructions into one is better.
rdar://14689217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188315 91177308-0d34-0410-b5e6-96231b3b80d8
LowerCallTo returns a pair with the return value of the call as the first
element and the chain associated with the return value as the second element. If
we lower a call that has a void return value, LowerCallTo returns an SDValue
with a NULL SDNode and the chain for the call. Thus makeLibCall by just
returning the first value makes it impossible for you to set up the chain so
that the call is not eliminated as dead code.
I also updated all references to makeLibCall to reflect the new return type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188300 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to do two things:
- Initialize BSSSection in MCObjectFileInfo::InitCOFFMCObjectFileInfo
- Teach TargetLoweringObjectFileCOFF::SelectSectionForGlobal what to do
with it
This fixes PR16861.
Reviewers: rnk
Reviewed By: rnk
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1361
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188244 91177308-0d34-0410-b5e6-96231b3b80d8
CUs.
Currently only hashes the name of CUs and the names of any children,
but it's an obvious first step to show the framework. The testcase
should continue to be correct, however, as it's an empty TU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188243 91177308-0d34-0410-b5e6-96231b3b80d8
For now this is restricted to fixed-length comparisons with a length
in the range [1, 256], as for memcpy() and MVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188163 91177308-0d34-0410-b5e6-96231b3b80d8
If the tail-callee and caller give the same bits via the same signext/zeroext
attribute then a tail-call should be allowed, since the extension has already
been done by the callee.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188159 91177308-0d34-0410-b5e6-96231b3b80d8
This patch decouples the stack protector pass so that we can support stack
protector implementations that do not use the IR level generated stack protector
fail basic block.
No codesize increase is caused by this change since the MI level tail merge pass
properly merges together the fail condition blocks (see the updated test).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188105 91177308-0d34-0410-b5e6-96231b3b80d8
Previously the asserts were only checking that RHS and LHS were the same type and had the same element type as the result. All downstream code for ISD::VECTOR_SHUFFLE requires the types to be the same.
Also removed one unnecessary check of matched element counts that was present in the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188051 91177308-0d34-0410-b5e6-96231b3b80d8
For most libm ISD nodes, TargetLoweringBase::initActions sets the default
scalar-type action to Expand, and leaves the vector-type action default as
Legal. This is not appropriate for the new ISD::FROUND node (which no backend
but PowerPC handles explicitly).
Fixes PR16842.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188048 91177308-0d34-0410-b5e6-96231b3b80d8