The word "attribute" has a specific meaning in LLVM. Avoid using it
here to mean something different.
This addresses feedback from D153180.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153444
This reverts commit bba2b656110209a3d9863b92c060082479b06ab1.
Reason: Broke the HWASan buildbot. See https://reviews.llvm.org/D153250
for more information.
It hasn't been written to since https://reviews.llvm.org/D46652, so it
was always empty. I don't have enough context on that diff to know if
the removal of the write to ShowIncludesPretendHeader in that diff was
intentional, but no one's complained about it for five years, so I
assume we're okay to just get rid of it entirely.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D153176
`-H` displays a tree of included header files, but that tree is supposed
to omit two categories of header files:
1. Any header files pulled in via `-include`, which the code refers to
as the "predefines".
2. Any header files whose inclusion was skipped because they'd already
been included (assuming header guards or `#pragma once`).
`-fshow-skipped-includes` was intended to make `-H` display the second
category of files. It wasn't checking for the first category, however,
so you could end up with only the middle of the `-include` hierarchy
displayed, e.g. the added test would previously output:
```
... /data/users/smeenai/llvm-project/clang/test/Frontend/Inputs/test2.h
. /data/users/smeenai/llvm-project/clang/test/Frontend/Inputs/test.h
```
This diff adds a check to prevent that and correctly omit headers from
`-include` even when `-fshow-skipped-includes` is passed. While I'm
here, add tests for the interaction between `-fshow-skipped-includes`
and `-sys-header-deps` as well.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D153175
Add support for couple of FIR types such as fir.ptr, fir.heap,
fir.box, fir.class
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D153461
The ScopedString class has two functions named append. One takes
a va_list, but on some platforms va_list is typedef'd to char*.
That means that this call:
std::string value;
Str.append("print this string %s", value.c_str());
The compiler can incorrectly think this is the va_list function,
leading to crashes when calling this. To fix this, change the name
of the va_list function to be vappend to avoid this.
Fix https://github.com/llvm/llvm-project/issues/62893
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D153389
Previously, it wouldn't take into account files in ignore_format.txt
(at least not on OSX) because the `find` command would return file names
like `libcxx/src//new_handler.cpp`, which never matched the file names
in `ignore_format.txt`.
Differential Revision: https://reviews.llvm.org/D153416
Two new entitlements, com.apple.private.thread-set-state and
com.apple.private.set-exception-port should be included in the
debugserver entitlement plist when built internally at Apple.
The desktop builds of debugserver don't need these entitlements;
don't add them to debugserver-macosx-entitlements.plist.
rdar://108912676
rdar://103032208
This doesn't change the selection, but it expands the conditions to
add comments and make it clearer what's happening. It also removes a
-Wundef instance when we checked __ARM_ARCH_7K__ >= 2 without checking
that it is defined in the first place.
Differential Revision: https://reviews.llvm.org/D153413
Commit ec77747fbdca901e0fded58f940dae62e0f6b726 regenerated the check lines
without being very careful about which lines were updated. This attempts to fix
them to make sure the V7 and V8 lines are emitted as needed.
For all other compute and data constructs, the data operands list
is named `dataClauseOperands`. Update `acc.host_data` to be
consistent with this naming.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D153425
Lower private and firstprivate operands through their corresponding
data entry operation to support array section.
Depends on D152972
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D152974
acc.firstprivate operation will be used as data entry operation
for the firstprivate operands.
Depends on D152970
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D152972
acc.private operation will be used as data entry operation
for the private operands.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D152970
The test shows a mis-compile where @test gets incorrectly simplified to
unreachable. The test case is reduced from a ThinLTO build of Clang,
with only the relevant pass sequence included.
This patch tries to reduce the size of the debug_loclist section by
replacing the DW_LLE_start_length opcodes currently emitted by dsymutil
in favor of using DW_LLE_base_address + DW_LLE_offset_pair instead.
The DW_LLE_start_length is one AddressSize followed by a ULEB per entry,
whereas, the DW_LLE_base_address + DW_LLE_offset_pair will use one
AddressSize for the base address, and then the DW_LLE_offset_pair is a
pair of ULEBs. This will be more efficient where a loclist fragment has
many entries.
Differential Revision: https://reviews.llvm.org/D153080
Sometimes LLVM generates branch to return instruction, like PR63227.
It is because in function MachineBlockPlacement::canTailDuplicateUnplacedPreds
we avoid duplicating a BB into another already placed BB to prevent destroying
computed layout. But if the successor BB is a return block, duplicating it will
only reduce taken branches without hurt to any other branches.
Differential Revision: https://reviews.llvm.org/D153093
PTX does not have a notion of `unreachable`, which results in emitted basic
blocks having an edge to the next block:
```
block1:
call @does_not_return();
// unreachable
block2:
// ptxas will create a CFG edge from block1 to block2
```
This may result in significant changes to the control flow graph, e.g., when
LLVM moves unreachable blocks to the end of the function. That's a problem
in the context of divergent control flow, as `ptxas` uses the CFG to determine
divergent regions, while some intructions may not be executed divergently.
For example, `bar.sync` is not allowed to be executed divergently on Pascal
or earlier. If we start with the following:
```
entry:
// start of divergent region
@%p0 bra cont;
@%p1 bra unlikely;
...
bra.uni cont;
unlikely:
...
// unreachable
cont:
// end of divergent region
bar.sync 0;
bra.uni exit;
exit:
ret;
```
it is transformed by the branch-folder and block-placement passes to:
```
entry:
// start of divergent region
@%p0 bra cont;
@%p1 bra unlikely;
...
bra.uni cont;
cont:
bar.sync 0;
bra.uni exit;
unlikely:
...
// unreachable
exit:
// end of divergent region
ret;
```
After moving the `unlikely` block to the end of the function, it has an edge
to the `exit` block, which widens the divergent region and makes the `bar.sync`
instruction happen divergently. That causes wrong computations, as we've been
running into for years with Julia code (which emits a lot of `trap` +
`unreachable` code all over the place).
To work around this, add an `exit` instruction before every `unreachable`,
as `ptxas` understands that exit terminates the CFG. Note that `trap` is not
equivalent, and only future versions of `ptxas` will model it like `exit`.
Another alternative would be to emit a branch to the block itself, but emitting
`exit` seems like a cleaner solution to represent `unreachable` to me.
Also note that this may not be sufficient, as it's possible that the block
with unreachable control flow is branched to from different divergent regions,
e.g. after block merging, in which case it may still be the case that `ptxas`
could reconstruct a CFG where divergent regions are merged (I haven't confirmed
this, but also haven't encountered this pattern in the wild yet):
```
entry:
// start of divergent region 1
@%p0 bra cont1;
@%p1 bra unlikely;
bra.uni cont1;
cont1:
// intended end of divergent region 1
bar.sync 0;
// start of divergent region 2
@%p2 bra cont2;
@%p3 bra unlikely;
bra.uni cont2;
cont2:
// intended end of divergent region 2
bra.uni exit;
unlikely:
...
exit;
exit:
// possible end of merged divergent region?
```
I originally tried to avoid the above by cloning paths towards `unreachable` and
splitting the outgoing edges, but that quickly became too complicated. I propose
we go with the simple solution first, also because modern GPUs with more flexible
hardware thread schedulers don't even suffer from this issue.
Finally, although I expect this to fix most of
https://bugs.llvm.org/show_bug.cgi?id=27738, I do still encounter
miscompilations with Julia's unreachable-heavy code when targeting these
older GPUs using an older `ptxas` version (specifically, from CUDA 11.4 or
below). This is likely due to related bugs in `ptxas` which have been fixed
since, as I have filed several reproducers with NVIDIA over the past couple of
years. I'm not inclined to look into fixing those issues over here, and will
instead be recommending our users to upgrade CUDA to 11.5+ when using these GPUs.
Also see:
- https://github.com/JuliaGPU/CUDAnative.jl/issues/4
- https://github.com/JuliaGPU/CUDA.jl/issues/1746
- https://discourse.llvm.org/t/llvm-reordering-blocks-breaks-ptxas-divergence-analysis/71126
Reviewed By: jdoerfert, tra
Differential Revision: https://reviews.llvm.org/D152789
This patch should address the failure of TestStackCoreScriptedProcess
that is happening specifically on x86_64.
It turns out that in 1370a1cb5b97, I changed the way we extract integers
from a `StructuredData::Dictionary` and in order to get a stop info from
the scripted process, we call a method that returns a `SBStructuredData`
containing the stop reason data.
TestStackCoreScriptedProcess` was failing specifically on x86_64 because
the stop info dictionary contains the signal number, that the `Scripted
Thread` was trying to extract as a signed integer where it was actually
parsed as an unsigned integer. That caused `GetValueForKeyAsInteger` to
return the default value parameter, `LLDB_INVALID_SIGNAL_NUMBER`.
This patch address the issue by extracting the signal number with the
appropriate type and re-enables the test.
Differential Revision: https://reviews.llvm.org/D152848
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
The COFFAsmParser (to my surprise) didn't support the .pushsection and
.popsection directives. These directives aren't directly useful, however for
frontends that have inline asm support this is really useful. Rust in
particular, has support for inline asm, which can be used together with these
directives to "emulate" features like static generics. This patch adds support
for the two mentioned directives.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D152085
When LLDB needs to access a debug section, it generally calls
SectionList::FindSectionByType with the corresponding type (we have one type for
each DWARF section). However, the missing entries made some sections be
classified as "eSectionTypeOther", which makes all calls to `FindSectionByType`
fail.
With this patch, a check-lldb build with
`-DLLDB_TEST_USER_ARGS=--dwarf-version=5` reports a much lower number of
failures:
Unsupported : 327
Passed : 2423
Expectedly Failed: 16
Unresolved : 2
Failed : 52
This is down from previously 400~ failures.
Differential Revision: https://reviews.llvm.org/D153433
This reverts commit f55fd19b6b565827af5fbf504952dcc35b8b7360.
As noted on the original thread, other uses of LLVM_LIBRARY_OUTPUT_INTDIR are optional. Will make a separate patch that makes this use optional as well.
It seems just replacing the operation was not replacing all of the uses
when the types of the expression before and after this pass differ (due
to differing shape information). Now the shape information is always
kept the same.
This fixes https://github.com/llvm/llvm-project/issues/63399
Differential Revision: https://reviews.llvm.org/D153333
The bytecode writer config was heap-allocated, but was never freed, causing ASAN errors.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D153440
This patch prepares the RPC interface to be installed. We place this in
the existing `llvm-gpu-none` directory as it will also give us access to
the generated `libc` headers for the opcodes.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D153040
This avoids undefs from being expanded to a build vector of zeroes.
As noted by @craig.topper in D153399
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153411
posix_compat.h uses struct timeval which is defined in <sys/time.h>
but it doesn't include it. On most POSIX platforms like Linux or macOS,
that headers is transitively included by other headers like <sys/stat.h>,
but there are other platforms where this is not the case.
Differential Revision: https://reviews.llvm.org/D153384