This check implements the [rule
17.5.1](https://www.perforce.com/resources/qac/high-integrity-cpp-coding-standard/standard-library)
of the HICPP standard which states:
- Do not ignore the result of std::remove, std::remove_if or std::unique
The mutating algorithms std::remove, std::remove_if and both overloads
of std::unique operate by swapping or moving elements of the range they
are operating over. On completion, they return an iterator to the last
valid element. In the majority of cases the correct behavior is to use
this result as the first operand in a call to std::erase.
This check is based on `bugprone-unused-return-value` but with a fixed
set of functions.
Suppressing issues by casting to `void` is enabled by default, but can
be disabled by setting `AllowCastToVoid` option to `false`.
Summary:
The standard GNU atomic operations are a very common way to target
hardware atomics on the device. With more heterogenous devices being
introduced, the concept of memory scopes has been in the LLVM language
for awhile via the `syncscope` modifier. For targets, such as the GPU,
this can change code generation depending on whether or not we only need
to be consistent with the memory ordering with the entire system, the
single GPU device, or lower.
Previously these scopes were only exported via the `opencl` and `hip`
variants of these functions. However, this made it difficult to use
outside of those languages and the semantics were different from the
standard GNU versions. This patch introduces a `__scoped_atomic` variant
for the common functions. There was some discussion over whether or not
these should be overloads of the existing ones, or simply new variants.
I leant towards new variants to be less disruptive.
The scope here can be one of the following
```
__MEMORY_SCOPE_SYSTEM // All devices and systems
__MEMORY_SCOPE_DEVICE // Just this device
__MEMORY_SCOPE_WRKGRP // A 'work-group' AKA CUDA block
__MEMORY_SCOPE_WVFRNT // A 'wavefront' AKA CUDA warp
__MEMORY_SCOPE_SINGLE // A single thread.
```
Naming consistency was attempted, but it is difficult to capture to full
spectrum with no many names. Suggestions appreciated.
…ONS=OFF
Jose reported an issue ([1]) where for the below illegal asm code
```
r0 = 1 + w3 ll
```
clang actually supports it and generates the object code.
Further investigation finds that clang actually intends to reject the
above code as well but only when the clang is built with
LLVM_ENABLE_ASSERTIONS=ON.
I later found that clang16 (built by redhat and centos) in fedora system
has the same issue since they also have LLVM_ENABLE_ASSERTIONS=OFF
([2]).
So let BPF backend report an error for the above case regardless of the
LLVM_ENABLE_ASSERTIONS setting.
[1] https://lore.kernel.org/bpf/87leahx2xh.fsf@oracle.com/#t
[2]
https://lore.kernel.org/bpf/840e33ec-ea4c-4b55-bda1-0be8d1e0324f@linux.dev/
Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
User::getOperandList has to check the HasHungOffUses bit in Value to
determine how the operand list is stored. If we're using
HungoffOperandTraits we can assume how it is stored without checking the
flag.
Noticed that the for loop in matchSimpleRecurrence was triggering loop
unswitch when built with clang due to specializing based on how the
operand list of the PHINode was stored.
This reduces the size of llc on my local Release+Asserts build by around
41K.
Add 32-bit test for -z dead-reloc-in-nonalloc= and add tests for a
non-x86 64-bit (x86-64 is unique in discerning signed/unsigned 32-bit
absolute relocations (R_X86_64_32/R_X86_64_32S).
AArch64/PPC64/RISC-V/etc don't have the distinction). Having a test will
improve coverage for #74686
This patch adds the majority of the missing extended mnemonics that were
introduced in Power 10.
The only extended mnemonics that were not added are related to the plq
and pstq instructions. These will be added in a separate patch as the
instructions themselves would also have to be added.
Operator allows the phi operand to be a ConstantExpr. A ConstantExpr is
a valid operand to a phi, but is never going to be a recurrence.
We can only match a BinaryOperator so use that instead.
When set to non-zero, the HWASan runtime will map the shadow base at the
specified constant address.
This is particularly useful in conjunction with the existing compiler
option
'hwasan-mapping-offset', which bakes a hardcoded constant address into
the instrumentation.
---------
Co-authored-by: Thurston Dang <thurston@google.com>
WG14 N2939 (Identifier Syntax Fixes) corrects a grammar issue in the C
standard but does not otherwise change intended behavior. This change
updates the C23 status to note this paper as implemented as of Clang 15;
the release in which support for N2836 (Identifier Syntax using Unicode
Standard Annex 31) was implemented.
Currently we are missing set up-boundary address for FinalArraySection
as highests elements in partial struct data.
Currently for:
\#pragma omp target map(D.a) map(D.b[:2])
The size is:
%a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
%b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
%2 = getelementptr float, ptr %arrayidx, i32 1
%3 = ptrtoint ptr %2 to i64
%4 = ptrtoint ptr %a to i64
%5 = sub i64 %3, %4
%6 = sdiv exact i64 %5, ptrtoint (ptr getelementptr (i8, ptr null, i32
1) to i64)
Where %2 is wrong for (D.b[:2]) is pointer to first element of array
section. It should pointe to last element of array section.
The fix is to emit the pointer to the last element of array section and
use this pointer as the highest element in partial struct data.
After change IR:
%a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
%b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
%b1 = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx2 = getelementptr inbounds [2 x float], ptr %b1, i64 0, i64 1
%1 = getelementptr float, ptr %arrayidx2, i32 1
%2 = ptrtoint ptr %1 to i64
%3 = ptrtoint ptr %a to i64
%4 = sub i64 %2, %3
%5 = sdiv exact i64 %4, ptrtoint (ptr getelementptr (i8, ptr null, i32
1) to i64)
AlignmentFromAssumptions uses SCEV to update the load/store alignment.
It tracks down the use-def chains for the pointer which it takes from
the assumption cache until it reaches the load or store instruction. It
mistakenly adds to the worklist the users of the load result
irrespective of the fact that the load result has no connection with the
original pointer, moreover, it is not a pointer at all in most cases.
Thus the def-use chain contains irrelevant load users. When it is a
store instruction the algorithm attempts to adjust its alignment to the
alignment of the original pointer. The problem appears when the load and
store memory operand pointers belong to different address spaces and
possibly have different sizes.
The 4bf015c035e4e5b63c7222dfb15ff274a5ed905c was an attempt to address a
similar problem. The truncation or zero extension was added to make
pointers the same size. That looks strange to me because the zero
extension of the pointer is not legal. The test in the
4bf015c035e4e5b63c7222dfb15ff274a5ed905c does not work any longer as for
the explicit address spaces conversion the addrspacecast is generated.
Summarize:
1. For the alloca to global address spaces conversion addrspacecasts are
used, so the code added by the 4bf015c035e4e5b63c7222dfb15ff274a5ed905c
is no longer functional.
2. The AlignmentFromAssumptions algorithm should not add the load users
to the worklist as they have nothing to do with the original pointer.
3. Instead we only track users that are: GetelementPtrIns, PHINodes.
Not that I'm very good at SFINAE, but it seems that conversion operators
are
perhaps difficult to compose with SFINAE. I saw an example that used one
layer
of indirection to have an explicit return type that could then be used
with
enable_if_t.
Link: https://stackoverflow.com/a/7604580Fixes: #74623
The specific terminator operation depends on what operation it is inside
of. The function `genOpenMPTerminator` performs these checks and selects
the appropriate type of terminator.
Remove partial duplication of that code, and replace it with a function
call. This makes `genOpenMPTerminator` be the sole source of OpenMP
terminators.
Add a pre-commit CI workflow for the experimental SPIR-V backend. This
action should only run when SPIR-V target or test files are modified.
The `codegen-spirv` tests don't run as part of `check-all` because the
SPIR-V backend is still experimental.
Depends on #73371 (for a green tree)
Add proper explanation for cin.sh.cpp fail.
The stdin-is-piped.sh.cpp used to fail with old qemu (4.2.0), but should
pass now, as the qemu is updated now to 8.1.3 in CI.
Summary:
This was added in a previous patch to update how we export the static
library used for OpenMP offloading. By mistake this if-else was using
the output incorrectly.
Fixes https://github.com/llvm/llvm-project/issues/74079
'isctype' fails in arm64-big-endian because the __regex_word involved
in mask operation is not changed based on the platform endianness, while
the character mask does change.
When I originally added this fold, it did not actually fix my
motivation case, where the add was represented as an or. Now that
we have the disjoint flag this can finally be cleanly supported.
Replace explicit calls to
```
op = builder.create<SectionOp>(...)
createBodyOfOp<SectionOp>(op, ...)
```
with a single call to
```
createOpWithBody<SectionOp>(...)
```
This is NFC, that's what the `createOpWithBody` function does.
This code was added 17 years ago but never enabled or tested. GCC warns
that -I- is deprecated for them, and Clang gives an error when passed
-I-, so we may as well remove this code rather than hook it up to the
driver and maintain it.
I'm not sure whether it's possible to cause a miscompile due to
the missing check right now, as the affected values mechanism
effectively protects us against this. This becomes a problem for
an upcoming patch though.
Test was failing due to a different transform sequence declaration (transform sequence were used, while now it should be named transform sequence). Test is now fixed.
Whenever linking with -nodefaultlibs for a MinGW target, we manually
need to specify a bunch of libraries - listed in ${MINGW_LIBRARIES}; the
same is already done for sanitizers and libunwind/libcxxabi/libcxx.
Practically speaking, linking with -nodefaultlibs but manually passing
the libraries in ${MINGW_LIBRARIES} restores most of the libraries that
are linked by default, except for the potential compiler builtins and
unwind library; i.e. it has essentially the same effect as linking with
"--unwindlib=none -rtlib=none", except that -rtlib doesn't accept such a
value.
When building only compiler-rt/lib/builtins, not all of compiler-rt,
${MINGW_LIBRARIES} is unset - set it manually here for that case. This
matches what is set in
compiler-rt/cmake/config-ix.cmake, except that the builtins (libgcc or
compiler-rt builtins) is omitted; the only use within lib/buitlins is
for the standalone libatomic, which explicitly already links against the
just-built builtins.
`llvm/test/CodeGen/RISCV/llvm.frexp.ll` and
`llvm/test/CodeGen/X86/llvm.frexp.ll` contain a number of disabled tests
for unimplemented functionality. This implements one missing part of it.