llvm-capstone/clang/test/CodeGenCUDA/constexpr-variables.cu
Yaxun (Sam) Liu 45f2a56856 [CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc
nvcc supports accessing file-scope static device variables in host code through host APIs
such as cudaMemcpyToSymbol.

CUDA/HIP let users access device variables in host code through shadow variables. In host compilation,
clang emits a shadow variable for each device variable and calls __*RegisterVariable to
register it in an init function. The address of the shadow variable and the device-side mangled
name of the device variable are passed to __*RegisterVariable. The runtime looks up the symbol
by name in the device binary to find the address of the device variable.

The problem with static device variables is that they have internal linkage, so their
names may be changed by the linker if there are multiple symbols with the same name. They also
end up as local symbols in the ELF file, whereas the runtime only looks up global symbols.

Another reason for giving static device variables external linkage is that they may be
initialized externally by host code, and their final values may be read by host code
after kernel execution, so in effect they have external linkage. Giving them internal
linkage will cause incorrect optimizations on them.

To support accessing static device variables in host code in -fno-gpu-rdc mode, change the internal
linkage to external linkage. The name does not need to change, since there is only one TU in
-fno-gpu-rdc mode. The externalization is done only if the static device variable is referenced
by host code.
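
As an illustration only (not part of this patch or its tests), the kind of program this
change is meant to support might look like the sketch below. The variable name `counter`
and the kernel `bump` are made up for the example; cudaMemcpyToSymbol and
cudaMemcpyFromSymbol are the standard CUDA runtime calls. Because host code references
`counter`, this is the case in which the static device variable would be externalized.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// File-scope static device variable (hypothetical name). It has internal
// linkage in the source, yet host code below accesses it by symbol.
static __device__ int counter;

// Hypothetical kernel that modifies the variable on the device.
__global__ void bump() { counter += 1; }

// Small helper: report a CUDA error and signal failure to the caller.
static bool ok(cudaError_t err, const char *what) {
  if (err != cudaSuccess) {
    std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    return false;
  }
  return true;
}

int main() {
  int init = 41;
  // Host initializes the device variable through its shadow symbol.
  if (!ok(cudaMemcpyToSymbol(counter, &init, sizeof(init)), "cudaMemcpyToSymbol"))
    return 1;

  bump<<<1, 1>>>();
  if (!ok(cudaGetLastError(), "kernel launch"))
    return 1;

  int result = 0;
  // Host reads the final value back after the kernel has run.
  if (!ok(cudaMemcpyFromSymbol(&result, counter, sizeof(result)), "cudaMemcpyFromSymbol"))
    return 1;

  std::printf("counter = %d\n", result);
  return 0;
}
```

Both symbol copies depend on the runtime finding `counter` by name in the device binary,
which is why a local (internal-linkage) symbol is not sufficient.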

Differential Revision: https://reviews.llvm.org/D80858
2020-08-05 07:57:38 -04:00

// RUN: %clang_cc1 -std=c++14 %s -emit-llvm -o - -triple nvptx \
// RUN: -fcuda-is-device | FileCheck --check-prefixes=CXX14 %s
// RUN: %clang_cc1 -std=c++17 %s -emit-llvm -o - -triple nvptx \
// RUN: -fcuda-is-device | FileCheck --check-prefixes=CXX17 %s
#include "Inputs/cuda.h"
// COM: @_ZL1a = internal {{.*}}constant i32 7
constexpr int a = 7;
__constant__ const int &use_a = a;
namespace B {
// COM: @_ZN1BL1bE = internal {{.*}}constant i32 9
constexpr int b = 9;
}
__constant__ const int &use_B_b = B::b;
struct Q {
// CXX14: @_ZN1Q2k2E = {{.*}}externally_initialized constant i32 6
// CXX17: @_ZN1Q2k2E = internal {{.*}}constant i32 6
// CXX14: @_ZN1Q2k1E = available_externally {{.*}}constant i32 5
// CXX17: @_ZN1Q2k1E = {{.*}} externally_initialized constant i32 5
static constexpr int k1 = 5;
static constexpr int k2 = 6;
};
constexpr int Q::k2;
__constant__ const int &use_Q_k1 = Q::k1;
__constant__ const int &use_Q_k2 = Q::k2;
template<typename T> struct X {
// CXX14: @_ZN1XIiE1aE = available_externally {{.*}}constant i32 123
// CXX17: @_ZN1XIiE1aE = {{.*}}externally_initialized constant i32 123
static constexpr int a = 123;
};
__constant__ const int &use_X_a = X<int>::a;
template <typename T, T a, T b> struct A {
// CXX14: @_ZN1AIiLi1ELi2EE1xE = available_externally {{.*}}constant i32 2
// CXX17: @_ZN1AIiLi1ELi2EE1xE = {{.*}}externally_initialized constant i32 2
constexpr static T x = a * b;
};
__constant__ const int &y = A<int, 1, 2>::x;