[PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)

This change aims to unify and correct our logic for when we need to allow for
the possibility of the linker adding a TOC restoration instruction after a
call. This comes up in two contexts:

 1. When determining tail-call eligibility. If we make a tail call (i.e.
    directly branch to a function) then there is no place for the linker to add
    a TOC restoration.
 2. When determining when we need to add a nop instruction after a call.
    Likewise, if there is a possibility that the linker might need to add a
    TOC restoration after a call, then we need to put a nop after the call
    (the bl instruction).

First problem: We were using similar, but different, logic to decide (1) and
(2). This is just wrong. Both the resideInSameModule function (used when
determining tail-call eligibility) and the isLocalCall function (used when
deciding if the post-call nop is needed) were supposed to be determining the
same underlying fact (i.e. might a TOC restoration be needed after the call).
The same logic should be used in both places.
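
As a rough illustration of that unification, both decisions can be derived from one predicate. This is a standalone model, not the LLVM code: `CallSite`, `mayNeedTOCRestore`, `tailCallAllowedByTOC`, and `needsNopAfterCall` are hypothetical names, and the single boolean stands in for everything the real check inspects.

```cpp
// Standalone model; none of these names exist in LLVM.
struct CallSite {
  // True only when caller and callee are known to share a TOC base.
  bool knownSameTOC;
};

// The single underlying fact: might the linker need to restore the TOC
// pointer after this call returns?
inline bool mayNeedTOCRestore(const CallSite &CS) { return !CS.knownSameTOC; }

// (1) A tail call leaves no instruction after the branch for the linker to
// patch, so it is only acceptable when no restoration can be needed. This is
// one necessary condition, not the full eligibility test.
inline bool tailCallAllowedByTOC(const CallSite &CS) {
  return !mayNeedTOCRestore(CS);
}

// (2) A normal call needs a nop after the bl whenever a restoration might be
// needed, so the linker has a slot to rewrite.
inline bool needsNopAfterCall(const CallSite &CS) {
  return mayNeedTOCRestore(CS);
}
```

Because both helpers are defined in terms of the same predicate, they cannot drift apart the way the two earlier functions did.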

Second problem: The logic in both places was wrong. We only know that two
functions will share the same TOC when both functions come from the same
section of the same object. Otherwise the linker might cause the functions to
use different TOC base addresses (unless the multi-TOC linker option is
disabled, in which case only shared-library boundaries are relevant). There are
a number of factors that can cause functions to be placed in different sections
or come from different objects (-ffunction-sections, explicitly-specified
section names, COMDAT, weak linkage, etc.). All of these need to be checked.
The existing logic only checked properties of the callee, but the properties of
the caller must also be checked (for example, calling from a function in a
COMDAT section means calling between sections).
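
The section-related part of the check can be sketched as follows. This is a simplified standalone model, assuming a hypothetical `FuncInfo` record and a hypothetical `knownSameSection` helper; it leaves out the linkage and interposition checks that the real function also performs.

```cpp
#include <string>

// Hypothetical stand-in for the function properties that matter here;
// this is not an LLVM type.
struct FuncInfo {
  std::string section;       // explicitly specified section name, "" if none
  std::string sectionPrefix; // section prefix, "" if none
  bool inComdat;             // true if the function is in a COMDAT
};

// Returns true only when caller and callee are known to land in the same
// section. Properties of BOTH functions are examined.
inline bool knownSameSection(const FuncInfo &caller, const FuncInfo &callee,
                             bool functionSections) {
  // With -ffunction-sections every function gets its own section.
  if (functionSections)
    return false;
  // A COMDAT function is treated as being in a different section, whether it
  // is the caller or the callee.
  if (caller.inComdat || callee.inComdat)
    return false;
  // Explicit section names and section prefixes must match.
  if (caller.section != callee.section)
    return false;
  if (caller.sectionPrefix != callee.sectionPrefix)
    return false;
  return true;
}
```

Under this model, a caller placed in section "different" calling a callee with no explicit section is rejected just like the reverse arrangement, which is the symmetry the callee-only logic missed.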

There was a conceptual error in the resideInSameModule function in that it
allowed tail calls to functions with weak linkage and protected/hidden
visibility. While protected/hidden visibility does prevent the function
implementation from being replaced at runtime (via interposition), it does not
prevent the linker from using an alternate implementation at link time (i.e.
using some strong definition to replace the provided weak one during linking).
If this happens, a TOC restoration may still be required upon return.
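
The difference between the old and new treatment of weak definitions can be modeled in a few lines. The enums and the `oldRuleAllows`/`newRuleAllows` helpers below are hypothetical simplifications for illustration; they cover only the linkage and visibility part of the decision, not the declaration, section, or PIC checks.

```cpp
// Simplified stand-ins; these are not the LLVM enums.
enum class Linkage { External, Internal, Private,
                     WeakAny, WeakODR, LinkOnceAny, LinkOnceODR };
enum class Visibility { Default, Hidden, Protected };

// True for linkages where the linker may pick a different definition.
inline bool isWeak(Linkage L) {
  return L == Linkage::WeakAny || L == Linkage::WeakODR ||
         L == Linkage::LinkOnceAny || L == Linkage::LinkOnceODR;
}

// Old rule: a weak callee was accepted as long as its visibility was hidden
// or protected.
inline bool oldRuleAllows(Linkage L, Visibility V) {
  return !isWeak(L) || V != Visibility::Default;
}

// New rule: visibility does not help, because a strong definition can still
// replace the weak one at link time.
inline bool newRuleAllows(Linkage L, Visibility /*unused*/) {
  return !isWeak(L);
}
```

For a `weak_odr hidden` callee the old rule answers yes and the new rule answers no, which corresponds to the `b` to `bl` changes in the updated test expectations.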

Otherwise, in general, the post-call nop is needed wherever ELF interposition
needs to be supported. We don't currently support ELF interposition at the IR
level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
for more information), and I don't think we should try to make it appear to
work in the backend in spite of that fact. Unfortunately, because of the way
that the ABI works, we need to generate code as if we supported interposition
whenever the linker might insert stubs for the purpose of supporting it.

Differential Revision: https://reviews.llvm.org/D27231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291003 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel 2017-01-04 21:05:13 +00:00
parent 76894fb000
commit 68c84942ec
3 changed files with 174 additions and 46 deletions


@@ -3981,40 +3981,46 @@ static int CalculateTailCallSPDiff(SelectionDAG& DAG, bool isTailCall,
 static bool isFunctionGlobalAddress(SDValue Callee);
 static bool
-resideInSameModule(SDValue Callee, Reloc::Model RelMod) {
+resideInSameSection(const Function *Caller, SDValue Callee,
+                    const TargetMachine &TM) {
   // If !G, Callee can be an external symbol.
   GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee);
-  if (!G) return false;
-  const GlobalValue *GV = G->getGlobal();
-  if (GV->isDeclaration()) return false;
-  switch(GV->getLinkage()) {
-  default: llvm_unreachable("unknow linkage type");
-  case GlobalValue::AvailableExternallyLinkage:
-  case GlobalValue::ExternalWeakLinkage:
+  if (!G)
     return false;
-  // Callee with weak linkage is allowed if it has hidden or protected
-  // visibility
-  case GlobalValue::LinkOnceAnyLinkage:
-  case GlobalValue::LinkOnceODRLinkage: // e.g. c++ inline functions
-  case GlobalValue::WeakAnyLinkage:
-  case GlobalValue::WeakODRLinkage: // e.g. c++ template instantiation
-    if (GV->hasDefaultVisibility())
-      return false;
+  const GlobalValue *GV = G->getGlobal();
+  if (!GV->isStrongDefinitionForLinker())
+    return false;
-  case GlobalValue::ExternalLinkage:
-  case GlobalValue::InternalLinkage:
-  case GlobalValue::PrivateLinkage:
-    break;
+  // Any explicitly-specified sections and section prefixes must also match.
+  // Also, if we're using -ffunction-sections, then each function is always in
+  // a different section (the same is true for COMDAT functions).
+  if (TM.getFunctionSections() || GV->hasComdat() || Caller->hasComdat() ||
+      GV->getSection() != Caller->getSection())
+    return false;
+  if (const auto *F = dyn_cast<Function>(GV)) {
+    if (F->getSectionPrefix() != Caller->getSectionPrefix())
+      return false;
   }
-  // With '-fPIC', calling default visiblity function need insert 'nop' after
-  // function call, no matter that function resides in same module or not, so
-  // we treat it as in different module.
-  if (RelMod == Reloc::PIC_ && GV->hasDefaultVisibility())
+  // If the callee might be interposed, then we can't assume the ultimate call
+  // target will be in the same section. Even in cases where we can assume that
+  // interposition won't happen, in any case where the linker might insert a
+  // stub to allow for interposition, we must generate code as though
+  // interposition might occur. To understand why this matters, consider a
+  // situation where: a -> b -> c where the arrows indicate calls. b and c are
+  // in the same section, but a is in a different module (i.e. has a different
+  // TOC base pointer). If the linker allows for interposition between b and c,
+  // then it will generate a stub for the call edge between b and c which will
+  // save the TOC pointer into the designated stack slot allocated by b. If we
+  // return true here, and therefore allow a tail call between b and c, that
+  // stack slot won't exist and the b -> c stub will end up saving b's TOC base
+  // pointer into the stack slot allocated by a (where the a -> b stub saved
+  // a's TOC base pointer). If we're not considering a tail call, but rather,
+  // whether a nop is needed after the call instruction in b, because the linker
+  // will insert a stub, it might complain about a missing nop if we omit it
+  // (although many don't complain in this case).
+  if (!TM.shouldAssumeDSOLocal(*Caller->getParent(), GV))
     return false;
   return true;
@@ -4130,11 +4136,11 @@ PPCTargetLowering::IsEligibleForTailCallOptimization_64SVR4(
       !isa<ExternalSymbolSDNode>(Callee))
     return false;
-  // Check if Callee resides in the same module, because for now, PPC64 SVR4 ABI
-  // (ELFv1/ELFv2) doesn't allow tail calls to a symbol resides in another
-  // module.
+  // Check if Callee resides in the same section, because for now, PPC64 SVR4
+  // ABI (ELFv1/ELFv2) doesn't allow tail calls to a symbol resides in another
+  // section.
   // ref: https://bugzilla.mozilla.org/show_bug.cgi?id=973977
-  if (!resideInSameModule(Callee, getTargetMachine().getRelocationModel()))
+  if (!resideInSameSection(MF.getFunction(), Callee, getTargetMachine()))
     return false;
   // TCO allows altering callee ABI, so we don't have to check further.
@@ -4592,14 +4598,6 @@ PrepareCall(SelectionDAG &DAG, SDValue &Callee, SDValue &InFlag, SDValue &Chain,
   return CallOpc;
 }
-static
-bool isLocalCall(const SDValue &Callee)
-{
-  if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee))
-    return G->getGlobal()->isStrongDefinitionForLinker();
-  return false;
-}
 SDValue PPCTargetLowering::LowerCallResult(
     SDValue Chain, SDValue InFlag, CallingConv::ID CallConv, bool isVarArg,
     const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &dl,
@@ -4701,6 +4699,7 @@ SDValue PPCTargetLowering::FinishCall(
   // stack frame. If caller and callee belong to the same module (and have the
   // same TOC), the NOP will remain unchanged.
+  MachineFunction &MF = DAG.getMachineFunction();
   if (!isTailCall && Subtarget.isSVR4ABI()&& Subtarget.isPPC64() &&
       !isPatchPoint) {
     if (CallOpc == PPCISD::BCTRL) {
@@ -4724,11 +4723,11 @@ SDValue PPCTargetLowering::FinishCall(
       // The address needs to go after the chain input but before the flag (or
       // any other variadic arguments).
       Ops.insert(std::next(Ops.begin()), AddTOC);
-    } else if ((CallOpc == PPCISD::CALL) &&
-               (!isLocalCall(Callee) ||
-                DAG.getTarget().getRelocationModel() == Reloc::PIC_))
+    } else if (CallOpc == PPCISD::CALL &&
+      !resideInSameSection(MF.getFunction(), Callee, DAG.getTarget())) {
       // Otherwise insert NOP for non-local calls.
       CallOpc = PPCISD::CALL_NOP;
+    }
   }
   Chain = DAG.getNode(CallOpc, dl, NodeTys, Ops);


@@ -0,0 +1,129 @@
+; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu | FileCheck %s
+; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr8 | FileCheck %s
+; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 | FileCheck %s
+; RUN: llc < %s -relocation-model=pic -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu | FileCheck %s
+; RUN: llc < %s -function-sections -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu | FileCheck %s -check-prefix=CHECK-FS
+; RUN: llc < %s -relocation-model=pic -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu | FileCheck %s
+; RUN: llc < %s -function-sections -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu | FileCheck %s -check-prefix=CHECK-FS
+%class.T = type { [2 x i8] }
+define void @e_callee(%class.T* %this, i8* %c) { ret void }
+define void @e_caller(%class.T* %this, i8* %c) {
+  call void @e_callee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: e_caller:
+; CHECK: bl e_callee
+; CHECK-NEXT: nop
+; CHECK-FS-LABEL: e_caller:
+; CHECK-FS: bl e_callee
+; CHECK-FS-NEXT: nop
+}
+define void @e_scallee(%class.T* %this, i8* %c) section "different" { ret void }
+define void @e_scaller(%class.T* %this, i8* %c) {
+  call void @e_scallee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: e_scaller:
+; CHECK: bl e_scallee
+; CHECK-NEXT: nop
+}
+define void @e_s2callee(%class.T* %this, i8* %c) { ret void }
+define void @e_s2caller(%class.T* %this, i8* %c) section "different" {
+  call void @e_s2callee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: e_s2caller:
+; CHECK: bl e_s2callee
+; CHECK-NEXT: nop
+}
+$cd1 = comdat any
+$cd2 = comdat any
+define void @e_ccallee(%class.T* %this, i8* %c) comdat($cd1) { ret void }
+define void @e_ccaller(%class.T* %this, i8* %c) comdat($cd2) {
+  call void @e_ccallee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: e_ccaller:
+; CHECK: bl e_ccallee
+; CHECK-NEXT: nop
+}
+$cd = comdat any
+define void @e_c1callee(%class.T* %this, i8* %c) comdat($cd) { ret void }
+define void @e_c1caller(%class.T* %this, i8* %c) comdat($cd) {
+  call void @e_c1callee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: e_c1caller:
+; CHECK: bl e_c1callee
+; CHECK-NEXT: nop
+}
+define weak_odr hidden void @wo_hcallee(%class.T* %this, i8* %c) { ret void }
+define void @wo_hcaller(%class.T* %this, i8* %c) {
+  call void @wo_hcallee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: wo_hcaller:
+; CHECK: bl wo_hcallee
+; CHECK-NEXT: nop
+}
+define weak_odr protected void @wo_pcallee(%class.T* %this, i8* %c) { ret void }
+define void @wo_pcaller(%class.T* %this, i8* %c) {
+  call void @wo_pcallee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: wo_pcaller:
+; CHECK: bl wo_pcallee
+; CHECK-NEXT: nop
+}
+define weak_odr void @wo_callee(%class.T* %this, i8* %c) { ret void }
+define void @wo_caller(%class.T* %this, i8* %c) {
+  call void @wo_callee(%class.T* %this, i8* %c)
+  ret void
+; CHECK-LABEL: wo_caller:
+; CHECK: bl wo_callee
+; CHECK-NEXT: nop
+}
+define weak protected void @w_pcallee(i8* %ptr) { ret void }
+define void @w_pcaller(i8* %ptr) {
+  call void @w_pcallee(i8* %ptr)
+  ret void
+; CHECK-LABEL: w_pcaller:
+; CHECK: bl w_pcallee
+; CHECK-NEXT: nop
+}
+define weak hidden void @w_hcallee(i8* %ptr) { ret void }
+define void @w_hcaller(i8* %ptr) {
+  call void @w_hcallee(i8* %ptr)
+  ret void
+; CHECK-LABEL: w_hcaller:
+; CHECK: bl w_hcallee
+; CHECK-NEXT: nop
+}
+define weak void @w_callee(i8* %ptr) { ret void }
+define void @w_caller(i8* %ptr) {
+  call void @w_callee(i8* %ptr)
+  ret void
+; CHECK-LABEL: w_caller:
+; CHECK: bl w_callee
+; CHECK-NEXT: nop
+}


@@ -142,7 +142,7 @@ define void @wo_hcaller(%class.T* %this, i8* %c) {
   ret void
 ; CHECK-SCO-LABEL: wo_hcaller:
-; CHECK-SCO: b wo_hcallee
+; CHECK-SCO: bl wo_hcallee
 }
 define weak_odr protected void @wo_pcallee(%class.T* %this, i8* %c) { ret void }
@@ -151,7 +151,7 @@ define void @wo_pcaller(%class.T* %this, i8* %c) {
   ret void
 ; CHECK-SCO-LABEL: wo_pcaller:
-; CHECK-SCO: b wo_pcallee
+; CHECK-SCO: bl wo_pcallee
 }
 define weak_odr void @wo_callee(%class.T* %this, i8* %c) { ret void }
@@ -169,7 +169,7 @@ define void @w_pcaller(i8* %ptr) {
   ret void
 ; CHECK-SCO-LABEL: w_pcaller:
-; CHECK-SCO: b w_pcallee
+; CHECK-SCO: bl w_pcallee
 }
 define weak hidden void @w_hcallee(i8* %ptr) { ret void }
@@ -178,7 +178,7 @@ define void @w_hcaller(i8* %ptr) {
   ret void
 ; CHECK-SCO-LABEL: w_hcaller:
-; CHECK-SCO: b w_hcallee
+; CHECK-SCO: bl w_hcallee
 }
 define weak void @w_callee(i8* %ptr) { ret void }