Commit Graph

486862 Commits

Author SHA1 Message Date
Jie Fu
904cf66ec1 [lldb] Fix build error in lldb-dap.cpp (NFC)
llvm-project/lldb/tools/lldb-dap/lldb-dap.cpp:679:5:
 error: unknown type name 'kkkk'
    kkkk response["success"] = false;
    ^
2024-01-20 08:09:20 +08:00
Walter Erquinigo
8bef2f27a0
[lldb-dap] Add a CMake variable for defining a welcome message (#78811)
lldb-dap instances managed by other extensions benefit from having a
welcome message with, for example, a basic user guide or a
troubleshooting message.
This PR adds a cmake variable for defining such message in a simple way.
This message appears upon initialization but before initCommands are
executed, as they might cause a failure and prevent the message from
being displayed.
2024-01-19 18:55:40 -05:00
Xiangxi Guo (Ryan)
c17aa14f4c
[mlir][index] Fold cmp(x, x) when x isn't a constant (#78812)
Such cases show up in the middle of optimizations passes, e.g., after
some rewrites and then CSE. The current folder can fold such cases when
the inputs are constant; this patch improves it to fold even if the
inputs are non-constant.
2024-01-19 15:54:33 -08:00
Petr Hosek
b86d02375e
[libc] Redo the install targets (#78795)
Prior to this change, we wouldn't build headers that aren't referenced
by other parts of the libc which would result in a build error during
installation. To address this, we make the header target a dependency of
the libc archive. Additionally, we also redo the install targets, moving
the install targets closer to build targets and simplifying the
hierarchy and generally matching what we do for other runtimes.
2024-01-19 15:45:22 -08:00
erman-gurses
b7360fbe8c
[mlir][amdgpu] Shared memory access optimization pass (#75627)
It implements transformation to optimize accesses to shared memory.

Reference: https://reviews.llvm.org/D127457

_This change adds a transformation and pass to the NvGPU dialect that
attempts to optimize reads/writes from a memref representing GPU shared
memory in order to avoid bank conflicts. Given a value representing a
shared memory memref, it traverses all reads/writes within the parent op
and, subject to suitable conditions, rewrites all last dimension index
values such that element locations in the final (col) dimension are
given by newColIdx = col % vecSize + perm[row](col / vecSize, row)
where perm is a permutation function indexed by row and vecSize
is the vector access size in elements (currently assumes 128bit
vectorized accesses, but this can be made a parameter). This specific
transformation can help optimize typical distributed & vectorized
accesses
common to loading matrix multiplication operands to/from shared memory._
2024-01-19 15:44:45 -08:00
spupyrev
30aa9fb4c1 Revert "[InstrProf] Adding utility weights to BalancedPartitioning (#72717)"
This reverts commit 5954b9dca2
due to broken Windows build
2024-01-19 15:13:47 -08:00
Jonas Devlieghere
593395f0da
[dsymutil] Fix spurious warnings in MachODebugMapParser (#78794)
When the MachODebugMapParser encounters an object file that cannot be
found on disk, it currently leaves the parser in an incoherent state,
resulting in spurious warnings that can in turn slow down dsymutil.

This fixes #78411.

rdar://117515153
2024-01-19 15:10:57 -08:00
Craig Topper
9396891271 [RISCV] Don't look for sext in RISCVCodeGenPrepare::visitAnd.
We want to know the upper 33 bits of the And Input are zero. SExt
only guarantees they are the same.

We originally checked for SExt or ZExt when we were using
isImpliedByDomCondition because a ZExt may have been changed to SExt
before we visited the And.

We are no longer using isImpliedByDomCondition so we can only look
for zext with the nneg flag.

While here, switch to PatternMatch to simplify the code.

Fixes #78783
2024-01-19 14:44:47 -08:00
Craig Topper
66cea7143a [RISCV] Add test case for #78783. NFC 2024-01-19 14:44:47 -08:00
Sam Clegg
39e024d9e2
[lld][WebAssembly] Use the archive offset with --whole-archive (#78791)
This essentially ports 0b1413a8 from the ELF linker.
2024-01-19 14:42:03 -08:00
Aiden Grossman
f9bc1ee3fc
[llvm-objdump] Add support for symbolizing PGOBBAddrMap Info (#76386)
This patch adds in support for symbolizing PGO information contained
within the SHT_LLVM_BB_ADDR_MAP section in llvm-objdump. The outputs are
simply the raw values contained within the section.
2024-01-19 14:28:31 -08:00
Arthur Eubanks
86eaf6083b
[X86] Refine X86DAGToDAGISel::isSExtAbsoluteSymbolRef() (#76191)
We just need to check if the global is large or not.

In the kernel code model, globals are in the negative 2GB of the address
space, so globals can be a sign extended 32-bit immediate.

In other code models, small globals are in the low 2GB of the address
space, so sign extending them is equivalent to zero extending them.
2024-01-19 14:11:18 -08:00
bd1976bris
cd05ade13a
Add a "don't override" mapping for -fvisibility-from-dllstorageclass (#74629)
`-fvisibility-from-dllstorageclass` allows for overriding the visibility
of globals from their DLL storage class. The visibility to apply can be
customised for the different classes of globals via a set of dependent
options that specify the mapping values:
- `-fvisibility-dllexport=<value>`
- `-fvisibility-nodllstorageclass=<value>`
- `-fvisibility-externs-dllimport=<value>`
- `-fvisibility-externs-nodllstorageclass=<value>` 

Currently, one of the existing LLVM visibilities, `hidden`, `protected`,
`default`, can be used as a mapping value. This change adds a new
mapping value: `keep`, which specifies that the visibility should not be
overridden for that class of globals. The behaviour of
`-fvisibility-from-dllstorageclass` is otherwise unchanged and existing
uses of this set of options will be unaffected.

The background to this change is that currently the PS4 and PS5
compilers effectively ignore visibility - dllimport/export is the
supported method for export control in C/C++ source code. Now, we would
like to support visibility attributes and options in our frontend, in
addition to dllimport/export. To support this, we will override the
visibility of globals with explicit dllimport/export annotations but use
the `keep` setting for globals which do not have an explicit
dllimport/export.

There are also some minor improvements to the existing options:
- Make the `LANGOPS` `BENIGN` as they don't involve the AST.
- Correct/clarify the help text for the options.
2024-01-19 21:57:40 +00:00
Sam Clegg
2bfa5ca927
[lld][WebAssembly] Reset context object after each link (#78770)
This mirrors how the ELF linker works. I wasn't able to find anywhere
where this is currently tested.

Followup to #78640, which triggered a regression.
2024-01-19 13:51:35 -08:00
Konstantin Varlamov
58780b811c
[libc++][hardening] In production hardening modes, trap rather than abort (#78561)
In the hardening modes that can be used in production (`fast` and
`extensive`), make a failed assertion invoke a trap instruction rather
than calling verbose abort. In the debug mode, still keep calling
verbose abort to provide a better user experience and to allow us to
keep our existing testing infrastructure for verifying assertion
messages. Since the debug mode by definition enables all assertions, we
can be sure that we still check all the assertion messages in the
library when running the test suite in the debug mode.

The main motivation to use trapping in production is to achieve better
code generation and reduce the binary size penalty. This way, the
assertion handler can compile to a single instruction, whereas the
existing mechanism with verbose abort results in generating a function
call that in general cannot be optimized away (made worse by the fact
that it's a variadic function, imposing an additional penalty). See the
[RFC](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925) for more
details. Note that this mechanism can now be completely [overridden at
CMake configuration
time](https://github.com/llvm/llvm-project/pull/77883).

This patch also significantly refactors `check_assertion.h` and expands
its test coverage. The main changes:
- when overriding `verbose_abort`, don't do matching inside the function
-- just print the error message to `stderr`. This removes the need to
set a global matcher and allows to do matching in the parent process
after the child finishes;
- remove unused logic for matching source locations and for using
wildcards;
- make matchers simple functors;
- introduce `DeathTestResult` that keeps data about the test run,
primarily to make it easier to test.

In addition to the refactoring, `check_assertion.h` can now recognize
when a process exits due to a trap.
2024-01-19 13:48:13 -08:00
Danila Malyutin
0388ab3e29
[Statepoint][NFC] Use uint16_t and add an assert (#78717)
Use a fixed width integer type and assert that DwarRegNum fits the 16
bits.

This is a follow up to review comments on #78600.
2024-01-20 00:44:00 +03:00
spupyrev
5954b9dca2
[InstrProf] Adding utility weights to BalancedPartitioning (#72717)
Adding weights to utility nodes in BP so that we can give more
importance to
certain utilities. This is useful when we optimize several objectives
jointly.
2024-01-19 13:36:59 -08:00
Eric Miotto
9175dd9cbc
[CMake] Detect properly new linker introduced in Xcode 15 (#77806)
As explained in [1], this linker is functionally equivalent to the
classic one (`ld64`) for build system purposes -- in particular to 
enable the use of order files to link `clang`. For this reason, in 
addition to fixing the detection rename `LLVM_LINKER_IS_LD64` to 
`LLVM_LINKER_IS_APPLE` to make the result of such detection more 
clear -- this should not cause any issue to downstream users, from 
a quick search in SourceGraph [2], only Swift uses the value of
this variable (which I will take care of updating in due time).

[1]: https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Linking
[2]: https://sourcegraph.com/search?q=context:global+LLVM_LINKER_IS_LD64+lang:cmake+fork:no+-file:AddLLVM.cmake+-file:clang/tools/driver/CMakeLists.txt&patternType=standard&sm=1&groupBy=repo
rdar://120740222
2024-01-19 16:32:32 -05:00
Pranav Kant
4482fd846a Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)"
This reverts commit 4d11f04b20.

This breaks some programs as mentioned in #78636
2024-01-19 21:02:20 +00:00
Sam Clegg
f5e58a0380
[lld][ELF] Simplify handleLibcall. NFC (#78659)
I noticed this while working on #78658
2024-01-19 12:39:35 -08:00
Nick Desaulniers
2c0d20668a
[libc] remove extra -Werror (#78761)
-Werror is now a global default as of
commit c52b467875 ("Reapply "[libc] build with -Werror (#73966)"
(#74506)")
2024-01-19 12:28:06 -08:00
Aaron Ballman
89592061a4 Remove an unused API; NFC
Not only is this unused, it's really confusing having
getAPValueResult() and getResultAsAPValue() as sibling APIs
2024-01-19 15:15:24 -05:00
Craig Topper
9ae28fb9d3
[RISCV] Prevent RISCVMergeBaseOffsetOpt from calling getVRegDef on a physical register. (#78762)
Fixes #78679.
2024-01-19 12:15:08 -08:00
Mital Ashok
924701311a
[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)
Closes #77638, #24186

Rebased from <https://reviews.llvm.org/D156032>, see there for more
information.

Implements wording change in [CWG2137](https://wg21.link/CWG2137) in the
first commit.

This also implements an approach to [CWG2311](https://wg21.link/CWG2311)
in the second commit, because too much code that relies on `T{ T_prvalue}` 
being an elision would break. Because that issue is still open and
the CWG issue doesn't provide wording to fix the issue, there may be
different behaviours on other compilers.
2024-01-19 21:10:51 +01:00
Aiden Grossman
2b31a673de
[llvm-exegesis] Make duplicate snippet repetitor produce whole snippets (#77224)
Currently, the duplicate snippet repetitor will truncate snippets that
do not exactly divide the minimum number of instructions. This patch
corrects that behavior by making the duplicate snippet repetitor
duplicate the snippet in its entirety until the minimum number of
instructions has been reached.

This makes the behavior consistent with the loop snippet repetitor,
which will execute at least `--num-repetitions` (soon to be renamed
`--min-instructions`) instructions.
2024-01-19 11:34:16 -08:00
Aiden Grossman
c067524852
[SHT_LLVM_BB_ADDR_MAP] Add assertion and clarify docstring (#77374)
This patch adds an assertion to readBBAddrMapImpl to confirm that
PGOAnalyses and BBAddrMaps are of the same size when PGO information is
requested (part of the API contract). This patch also updates the
docstring for readBBAddrMap to better clarify what is guaranteed.
2024-01-19 11:34:00 -08:00
Jeremy Kun
76ffa8f63a
[mlir][transform]: fix broken bazel build for TensorTransformOps (#78766) 2024-01-19 20:33:36 +01:00
Min-Yih Hsu
5330daad41
[RISCV] Add support for Smepmp 1.0 (#78489)
Smepmp is a supervisor extension that prevents privileged processes from
accessing unprivileged program and data.

Spec: https://github.com/riscv/riscv-tee/blob/main/Smepmp/Smepmp.pdf
2024-01-19 11:09:35 -08:00
Jeremy Kun
2521e9785d
[mlir][transform]: fix broken bazel build (#78757)
Broken by
42b160356f
2024-01-19 19:55:43 +01:00
Durgadoss R
43531e7196
[LLVM][NVPTX] Add cp.async.bulk.commit/wait intrinsics (#78698)
This patch adds NVVM intrinsics and NVPTX codegen for the bulk variants
of the async-copy commit/wait instructions.
lit tests are added to verify the generated PTX.

PTX Doc link:

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-commit-group

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-01-19 10:42:33 -08:00
Quinn Dawkins
42b160356f
[mlir][transform] Add an op for replacing values with function calls (#78398)
Adds `transform.func.cast_and_call` that takes a set of inputs and
outputs and replaces the uses of those outputs with a call to a function
at a specified insertion point.

The idea with this operation is to allow users to author independent IR
outside of a to-be-compiled module, and then match and replace a slice
of the program with a call to the external function.

Additionally adds a mechanism for populating a type converter with a set
of conversion materialization functions that allow insertion of
casts on the inputs/outputs to and from the types of the function
signature.
2024-01-19 13:21:52 -05:00
Thurston Dang
0784b1eefa
Re-exec TSan with no ASLR if memory layout is incompatible on Linux (#78351)
TSan's shadow mappings only support 30-bits of ASLR entropy on x86
Linux, and it is not practical to support the maximum of 32-bits (due to pointer compression and the overhead of shadow mappings). Instead, this patch changes TSan to re-exec without ASLR if it encounters an 
incompatible memory layout, as suggested by Dmitry in
https://github.com/google/sanitizers/issues/1716.
If ASLR is already disabled but the memory layout is still incompatible,
it will abort.

This patch involves a bit of refactoring, because the old code is:
1. InitializePlatformEarly()
2. InitializeAllocator()
3. InitializePlatform(): CheckAndProtect()

but it may already segfault during InitializeAllocator() if the memory
layout is incompatible, before we get a chance to check in
CheckAndProtect().

This patch adds CheckAndProtect() during InitializePlatformEarly(), before the allocator is initialized. Naturally, it is necessary to ensure that CheckAndProtect() does *not* allow the heap regions to be occupied  here, hence we generalize CheckAndProtect() to optionally check the heap
regions. We keep the original behavior of CheckAndProtect() in InitializePlatform() as a last line of defense.

We need to be careful not to prematurely abort if ASLR is disabled but TSan was going to re-exec for other reasons (e.g., unlimited stack size); we implement this by moving all the re-exec logic into ReExecIfNeeded().
2024-01-19 09:33:54 -08:00
Sam Clegg
5b0e45c8ce
[lld][WebAssembly] Fix use of undefined funcs under --warn-unresolved-symbols (#78643)
When undefined functions exist in the final link we need to create
stub functions (otherwise direct calls to those functions could
not be generated).  We were creating those stub when
`--unresolved-symbols=ignore-all` was passed but overlooked the fact
that `--warn-unresolved-symbols` essentially has the same effect (i.e.
undefined function can exist in the final link).

Fixes: #53987
2024-01-19 09:32:22 -08:00
Felipe de Azevedo Piovezan
b6677835fe
[AsmPrinter][DebugNames] Implement DW_IDX_parent entries (#77457)
This implements the ideas discussed in [1].

To summarize, this commit changes AsmPrinter so that it outputs
DW_IDX_parent information for debug_name entries. It will enable
debuggers to speed up queries for fully qualified types (based on a
DWARFDeclContext) significantly, as debuggers will no longer need to
parse the entire CU in order to inspect the parent chain of a DIE.
Instead, a debugger can simply take the parent DIE offset from the
accelerator table and peek at its name in the debug_info/debug_str
sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent
attribute:

1. DW_FORM_ref4, which points to the accelerator table entry for the
parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the
table (that is, the parent doesn't have a name, or isn't allowed to be
in the table as per the DWARF spec). This is space-efficient, since it
takes 0 bytes.

The implementation works by:

1. Changing how abbreviations are encoded (so that they encode which
form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.


When all patches related to this are merged, we are able to show that
evaluating an expression such as:

```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null
```

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking
at its impact on object file size:

```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616
```

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920
```

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit
already brings support to "base" linker, but it does not attempt to
modify the parallel linker. Accelerator entries refer to the
corresponding DIE offset, and this patch also requires the parent DIE
offset -- it's not clear how the parallel linker can access this. It may
be obvious to someone familiar with it, but it would be nice to get help
from its authors.

[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/
2024-01-19 09:19:09 -08:00
lntue
c80d68a676
[libc] Add float.h header. (#78737) 2024-01-19 12:04:34 -05:00
Jordan Rupprecht
d0d0727104
[lldb][test] Apply @expectedFailureAll/@skipIf early for debug_info tests (#73067)
The @expectedFailureAll and @skipIf decorators will mark the test case
as xfail/skip if _all_ conditions passed in match, including debug_info.
* If debug_info is not one of the matching conditions, we can
immediately evaluate the check and decide if it should be decorated.
* If debug_info *is* present as a match condition, we need to defer
whether or not to decorate until when the `LLDBTestCaseFactory`
metaclass expands the test case into its potential variants. This is
still early enough that the standard `unittest` framework will recognize
the test as xfail/skip by the time the test actually runs.

TestDecorators exhibits the edge cases more thoroughly. With the
exception of `@expectedFailureIf` (added by this commit), all those test
cases pass prior to this commit.

This is a followup to 212a60ec37.
2024-01-19 10:50:05 -06:00
Joseph Huber
cebe4de66f [libc] Fix test failing on GPU using deprecated 'add_unittest'
Summary:
We use `add_libc_test' now because it works for both hermetic and unit
tests. If the test needs to be unit test only you use `UNIT_TEST_ONLY`
as an argument.
2024-01-19 10:38:41 -06:00
Marius Brehler
205e15c176 [mlir][docs] Fix broken link 2024-01-19 17:38:27 +01:00
Sander de Smalen
5f41cef58f
[AArch64] NFC: Simplify discombobulating 'requiresSMChange' interface (#78703)
Having it return a `std::optional<bool>` is unnecessarily confusing.
This patch changes it to a simple 'bool'.

This patch also removes the 'BodyOverridesInterface' operand because
there is only a single use for this which is easily rewritten.
2024-01-19 16:15:38 +00:00
Sander de Smalen
40a631f452
[Clang] Refactor diagnostics for SME builtins. (#78258)
The arm_sme.td file was still using `IsSharedZA` and `IsPreservesZA`,
which should be changed to match the new state attributes added in
#76971.

This patch adds `IsInZA`, `IsOutZA` and `IsInOutZA` as the state for the
Clang builtins and fixes up the code in SemaChecking and SveEmitter to
match.

Note that the code is written in such a way that it can be easily
extended with ZT0 state (to follow in a future patch).
2024-01-19 16:02:24 +00:00
Jay Foad
e89a7c41ba [AMDGPU] Update comment on SIInstrInfo::isLegalFLATOffset for GFX12 2024-01-19 15:53:06 +00:00
Jay Foad
1abf2570b3 [AMDGPU] Make use of CPol::SWZ_* in SelectionDAG. NFC.
For GlobalISel this was already done in
AMDGPUInstructionSelector::selectBufferLoadLds.
2024-01-19 15:48:45 +00:00
Kareem Ergawy
5dbb30d950
[MLIR][OpenMP] Better error reporting for unsupported nowait (#78551)
Provides some context for failing to generate LLVM IR for `target
enter|exit|update` directives when `nowait` is provided. This is
directly helpful for flang users since they would get this error message
if they tried to use `nowait`. Before that we had a very generic
message.

This is a follow-up to https://github.com/llvm/llvm-project/pull/78269,
please only review the latest commit (the one with the same commit
message as the PR title).
2024-01-19 16:47:24 +01:00
Jay Foad
e21b0b083e
[AMDGPU] Remove gws feature from GFX12 (#78711)
This was already done for LLVM. This patch just updates the Clang
builtin handling to match.
2024-01-19 15:45:53 +00:00
Jay Foad
97747467f1
[AMDGPU] Update hazard recognition for new GFX12 wait counters (#78722)
In most cases the hazards no longer apply, so just assert that we are
not on GFX12.
2024-01-19 15:30:41 +00:00
Jay Foad
89226ecbb9
[AMDGPU] Do not widen scalar loads on GFX12 (#78724)
GFX12 has subword scalar loads so there is no need to do this.
2024-01-19 15:30:07 +00:00
Kiran Chandramohan
aac1d9710b
[Flang][OpenMP] Consider renames when processing reduction intrinsics (#70822)
Fixes #68654

Depends on https://github.com/llvm/llvm-project/pull/70790
2024-01-19 15:17:21 +00:00
Vinayak Dev
497a8604b3
[FileCheck]: Fix diagnostics for NOT prefixes (#78412)
Fixes #70221 

Fix a bug in FileCheck that corrects the error message when multiple
prefixes are provided
through --check-prefixes and one of them is a PREFIX-NOT.

Earlier, only the first of the provided prefixes was displayed as the
erroneous prefix, while the
actual error might be on the prefix that occurred at the end of the
prefix list in the input file.

Now, the right NOT prefix is shown in the error message.
2024-01-19 15:08:24 +00:00
Nikita Popov
9350860824
[AsmParser] Add support for reading incomplete IR (part 1) (#78421)
Add an `-allow-incomplete-ir` flag to the IR parser, which allows
reading IR with missing declarations. This is intended to produce a
best-effort interpretation of the IR, along the same lines of what we
would manually do when taking, for example, a function from
`-print-after-all` output and fixing it up to be valid IR.

This patch only supports dropping references to undeclared metadata,
either by dropping metadata attachments from instructions/functions, or
by dropping calls to certain intrinsics (like debug intrinsics). I will
implement support for inserting missing function/global declarations in
a followup patch.

We don't have real use lists for metadata, so the approach here is to
iterate over the whole IR and identify metadata that needs to be
dropped. This does not support all possible cases, but should handle
anything that's relevant for the function-only IR use case.
2024-01-19 16:08:16 +01:00
LLVM GN Syncbot
535b197b8e [gn build] Port 9ff4be640f 2024-01-19 14:42:19 +00:00