mirror of
https://github.com/capstone-engine/llvm-capstone.git
synced 2024-12-14 03:29:57 +00:00
45f2a56856
nvcc supports accessing file-scope static device variables in host code through host APIs such as cudaMemcpyToSymbol.

CUDA/HIP let users access device variables in host code through shadow variables. In host compilation, clang emits a shadow variable for each device variable and calls __*RegisterVariable in the init function to register it. The address of the shadow variable and the device-side mangled name of the device variable are passed to __*RegisterVariable. The runtime looks up the symbol by name in the device binary to find the address of the device variable.

The problem with static device variables is that they have internal linkage, so the linker may change their names if there are multiple symbols with the same name. They also end up as local symbols in the ELF file, whereas the runtime only looks up global symbols.

Another reason to give static device variables external linkage is that they may be initialized externally by host code, and their final value may be read by host code after kernel execution, so they effectively have external linkage. Giving them internal linkage causes incorrect optimizations on them.

To support accessing static device variables in host code in -fno-gpu-rdc mode, change the internal linkage to external linkage. The name does not need to change, since there is only one TU in -fno-gpu-rdc mode. The externalization is done only if the static device variable is referenced by host code.

Differential Revision: https://reviews.llvm.org/D80858
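The registration flow described in the commit message can be sketched as a toy host-only model in plain C++. This is not the CUDA or HIP runtime and not clang's generated code: `DeviceSymbol`, `ToyRuntime`, `register_variable`, `copy_to_symbol`, and `host_write_reaches_device` are hypothetical names invented for illustration, and the mangled name `_ZL1x` is just an example. The sketch only assumes what the message states: registration pairs a shadow variable's address with a device-side name, and the lookup considers global symbols only.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for one entry in a device binary's symbol table (hypothetical).
struct DeviceSymbol {
  std::string name;     // device-side mangled name
  bool is_global;       // false models a local symbol (internal linkage)
  int *device_address;  // stand-in for the variable's storage on the device
};

// Toy stand-in for the runtime (hypothetical, not a real CUDA/HIP API).
class ToyRuntime {
public:
  explicit ToyRuntime(std::vector<DeviceSymbol> symbols)
      : symbols_(std::move(symbols)) {}

  // Models the role of __*RegisterVariable: associate the shadow variable's
  // address with the device variable found by name. Only global symbols are
  // searched, so a local symbol is never resolved.
  bool register_variable(const void *shadow, const std::string &device_name) {
    for (const DeviceSymbol &s : symbols_) {
      if (s.is_global && s.name == device_name) {
        shadow_to_device_[shadow] = s.device_address;
        return true;
      }
    }
    return false;
  }

  // Models a cudaMemcpyToSymbol-style host call: write through the shadow
  // variable's address to the registered device variable.
  bool copy_to_symbol(const void *shadow, int value) {
    auto it = shadow_to_device_.find(shadow);
    if (it == shadow_to_device_.end())
      return false;  // the shadow variable was never resolved
    *it->second = value;
    return true;
  }

private:
  std::vector<DeviceSymbol> symbols_;
  std::unordered_map<const void *, int *> shadow_to_device_;
};

// Returns whether a host write reaches a static device variable in this model.
// `externalized` models whether its linkage was changed to external.
bool host_write_reaches_device(bool externalized) {
  int device_storage = 0;
  std::vector<DeviceSymbol> table{
      DeviceSymbol{"_ZL1x", externalized, &device_storage}};
  ToyRuntime rt(table);
  int shadow = 0;  // host-side shadow variable
  rt.register_variable(&shadow, "_ZL1x");
  return rt.copy_to_symbol(&shadow, 42) && device_storage == 42;
}
```

In this model, `host_write_reaches_device(false)` fails because the symbol is local and the lookup skips it, while `host_write_reaches_device(true)` succeeds. This mirrors the reason the message gives for externalizing static device variables that host code references.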
44 lines
1.5 KiB
Plaintext
// RUN: %clang_cc1 -std=c++14 %s -emit-llvm -o - -triple nvptx \
// RUN: -fcuda-is-device | FileCheck --check-prefixes=CXX14 %s
// RUN: %clang_cc1 -std=c++17 %s -emit-llvm -o - -triple nvptx \
// RUN: -fcuda-is-device | FileCheck --check-prefixes=CXX17 %s

#include "Inputs/cuda.h"

// COM: @_ZL1a = internal {{.*}}constant i32 7
constexpr int a = 7;
__constant__ const int &use_a = a;

namespace B {
  // COM: @_ZN1BL1bE = internal {{.*}}constant i32 9
  constexpr int b = 9;
}
__constant__ const int &use_B_b = B::b;

struct Q {
  // CXX14: @_ZN1Q2k2E = {{.*}}externally_initialized constant i32 6
  // CXX17: @_ZN1Q2k2E = internal {{.*}}constant i32 6
  // CXX14: @_ZN1Q2k1E = available_externally {{.*}}constant i32 5
  // CXX17: @_ZN1Q2k1E = {{.*}} externally_initialized constant i32 5
  static constexpr int k1 = 5;
  static constexpr int k2 = 6;
};
constexpr int Q::k2;

__constant__ const int &use_Q_k1 = Q::k1;
__constant__ const int &use_Q_k2 = Q::k2;

template<typename T> struct X {
  // CXX14: @_ZN1XIiE1aE = available_externally {{.*}}constant i32 123
  // CXX17: @_ZN1XIiE1aE = {{.*}}externally_initialized constant i32 123
  static constexpr int a = 123;
};
__constant__ const int &use_X_a = X<int>::a;

template <typename T, T a, T b> struct A {
  // CXX14: @_ZN1AIiLi1ELi2EE1xE = available_externally {{.*}}constant i32 2
  // CXX17: @_ZN1AIiLi1ELi2EE1xE = {{.*}}externally_initialized constant i32 2
  constexpr static T x = a * b;
};
__constant__ const int &y = A<int, 1, 2>::x;