Fixes: https://github.com/llvm/llvm-project/issues/65806
Currently clang put extern shared var ODR-used by host device functions
in global var __clang_gpu_used_external. This behavior was due to
https://reviews.llvm.org/D123441. However, clang should not do that for
extern shared vars since their addresses are per warp, therefore cannot
be accessed by host code.
After commit 610ec954e1f8 ("[clang] allow const structs/unions/arrays to
be constant expressions for C"), attempts to evaluate
structs/unions/arrays as constants are also performed for C++98 and
C++03.
An assertion was getting tripped up since the potentially-partially
evaluated value was not being reset for those 2 language modes. Make
sure to reset it now for all C++ modes.
Fixes: #65784
The upstream commit: https://reviews.llvm.org/D151590
added a new flag to mark target specific compiler options.
The side effect of it was that in cases when -### or -v is used without any
input file, clang started emitting an error.
It happened like that becasue there is no compilation actions created
which could consume/verify these target specific options.
This patch changes that error to a warning about unused option in situations
when there is no actions and still generates error when there are actions.
Fix for https://github.com/llvm/llvm-project/issues/64958
Differential Revision: https://reviews.llvm.org/D159361
This follows the RISC-V work done in
4b40ced4e5ba10b841516b3970e7699ba8ded572.
This uses AArch64's target parser instead. We just list the names,
without the "+" on them, which matches RISC-V's format.
```
$ ./bin/clang -target aarch64-linux-gnu --print-supported-extensions
clang version 18.0.0 (https://github.com/llvm/llvm-project.git 154da8aec20719c82235a6957aa6e461f5a5e030)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: <...>
All available -march extensions for AArch64
aes
b16b16
bf16
brbe
crc
crypto
cssc
<...>
```
Since our extensions don't have versions in the same way there's just
one column with the name in.
Any extension without a feature name (including the special "none") is
not listed as those cannot be passed to -march, they're just for the
backend. For example the MTE extension can be added with "+memtag" but
MTE2 and MTE3 do not have feature names so they cannot be added to
-march.
This does not attempt to tackle the fact that clang allows invalid
combinations of AArch64 extensions, it simply lists the possible
options. It's still up to the user to ask for something sensible.
Equally, this has no context of what CPU is being selected. Neither does
the RISC-V option, the user has to be aware of that.
I've added a target parser test, and a high level clang test that checks
RISC-V and AArch64 work and that Intel, that doesn't support this, shows
the correct error.
Before:
```
array.cpp:319:10: note: read of uninitialized object is not allowed in a constant expression
319 | return aaa;
| ^
```
After:
```
array.cpp:319:10: note: read of uninitialized object is not allowed in a constant expression
319 | return aaa;
| ^~~
```
This extends 985dacea by XFAILing more tests that started failing after these commits:
* 14498a477ee9e00dc462779cee8cbc5846ca6d3a
* e644f5973b0b71baadc6d7b64596527a1dc49d17
* 89bacc0bb9f6aa66ca1951ec5c0e4c38cc661160
These failures need to be investigated. Possible cause is D151938.
Continuing the discussion in
https://discourse.llvm.org/t/codegen-layout-of-si-class-type-info-doesnt-match-the-actual-size/73274
Before we had this code:
@_ZTVN10__cxxabiv117__class_type_infoE = external global ptr
now we'll produce:
@_ZTVN10__cxxabiv117__class_type_infoE = external global [0 x ptr]
This is because we may not know the exact size of this data, and clang
issues gep inbounds with idx=2. Before, that gep would always result in
poison.
Some fixes for the header / library paths..
- Use concat macro for all paths
- Correct the C++ header paths
- Add library paths
Differential Revision: https://reviews.llvm.org/D159414
There is a long-standing FIXME in `HeaderSearch.cpp` to use the path separator preferred by the platform instead of forward slash. There was an attempt to fix that (1cf6c28a) which got reverted (cf385dc8). I couldn't find an explanation, but my guess is that some tests assuming forward slash started failing.
This commit fixes tests with that assumption.
This is intended to be NFC, but there are two exceptions to that:
* Some diagnostic messages might now contain backslash instead of forward slash.
* Arguments to the "-remap-file" option that use forward slash might stop kicking in. Separators between potential includer path and header name need to be replaced by backslash in that case.
When building the LoongArch Linux kernel without
`CONFIG_DYNAMIC_FTRACE`, the build fails to link because the mcount
symbol is `mcount`, not `_mcount` like GCC generates and the kernel
expects:
```
ld.lld: error: undefined symbol: mcount
>>> referenced by version.c
>>> init/version.o:(early_hostname) in archive vmlinux.a
>>> referenced by do_mounts.c
>>> init/do_mounts.o:(rootfs_init_fs_context) in archive vmlinux.a
>>> referenced by main.c
>>> init/main.o:(__traceiter_initcall_level) in archive vmlinux.a
>>> referenced 97011 more times
>>> did you mean: _mcount
>>> defined in: vmlinux.a(arch/loongarch/kernel/mcount.o)
```
Set `MCountName` in `LoongArchTargetInfo` to `_mcount`, which resolves
the build failure.
Like concepts checking, a trailing return type of a lambda
in a dependent context may refer to captures in which case
they may need to be rebuilt, so the map of local decl
should include captures.
This patch reveal a pre-existing issue.
`this` is always recomputed by TreeTransform.
`*this` (like all captures) only become `const`
after the parameter list.
However, if try to recompute the value of `this` (in a parameter)
during template instantiation while determining the type of the call operator,
we will determine it to be const (unless the lambda is mutable).
There is no good way to know at that point that we are in a parameter
or not, the easiest/best solution is to transform the type of this.
Note that doing so break a handful of HLSL tests.
So this is a prototype at this point.
Fixes#65067Fixes#63675
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D159126
This is an alternative of D157485 and a pre-feature to support AVX10.
AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661
Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.
There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.
Reviewed By: RKSimon, skan
Differential Revision: https://reviews.llvm.org/D159250
Buitlins from AMD's device-libs are compiled without specifying a
target-cpu, which results in builtins without the target-features
attribute set.
Before this patch, when linking this builtins with -mlink-builtin-bitcode
the target-features were not propagated in the incoming builtins.
With this patch, the default target features are propagated
if they are compatible with the target-features in the incoming builtin.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D159206
C++20 modules
Previously, we banned the check for input files from C++20 modules since
we thought the BMI from C++20 modules should be a standalone artifact.
However, during the recent experiment with clangd for modules, I find
it is necessary to tell whether or not a BMI is out-of-date by checking the
input files especially for language servers.
So this patch brings a header search option
ForceCheckCXX20ModulesInputFiles to allow the tools (concretly, clangd)
to check the input files from BMI.
https://reviews.llvm.org/D158247 caused regressions for HIP on Windows
and was reverted.
A reduced test case is:
```
typedef void (__stdcall* funcTy)();
void invoke(funcTy f);
static void __stdcall callee() noexcept {
}
void foo() {
invoke(callee);
}
```
It is due to clang missing handling host/device attributes for calling
convention at a few places
This patch fixes that.
This patch enables some fp16 vector type builtins that don't use fp arithmetic instruction for zvfhmin without zvfh.
Include following builtins:
vector load/store,
vector reinterpret,
vmerge_vvm,
vmv_v.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151869
This patch adds the clang portion of an AIX-specific option to inform
the compiler that it can use a faster access sequence for the local-exec
TLS model (formally named aix-small-local-exec-tls).
This patch only adds the frontend portion of the option, building upon:
Backend portion of the option (D156203)
Backend patch that utilizes this option to actually produce the faster access sequence (D155600)
Differential Revision: https://reviews.llvm.org/D155544
Previously clang AST prints the following declaration:
int fun_var_unused() {
int x __attribute__((unused)) = 0;
return x;
}
and
int __declspec(thread) x = 0;
as:
int fun_var_unused() {
int x = 0 __attribute__((unused));
return x;
}
and
int x = __declspec(thread) 0;
which is rejected by C/C++ parser. This patch modifies the logic to
print old C attributes for variables as:
int __attribute__((unused)) x = 0;
and the __declspec case as:
int __declspec(thread) x = 0;
Fixes: https://github.com/llvm/llvm-project/issues/59973
Previous version: D141714.
Differential Revision:https://reviews.llvm.org/D141714
As noticed in D158846, the Solaris driver deviates from other targets in
that it links every executable with `-lm`, but doesn't for shared
objects. For C code, this is unnecessary, while for C++ `libm` is always
needed, even for shared objects.
This patch fixes this by following the `Gnu.cpp` precedent. It adjusts
the `solaris-ld.c` test accordingly, adding some more tests.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.