For example, the following message has the severity string "error: "
twice.
> "error: <EXPR>:3:1: error: cannot find 'bogus' in scope
This method already prepends the severity string, but with this fix, it
also removes a secondary instance, if present.
Note that this change only removes the *first* redundant substring. I
considered putting the removal logic in a loop, but I decided that if
something is generating more than one redundant severity substring, then
that's a problem the message's source should probably fix.
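The behavior described above can be sketched roughly as follows. This is only an illustration: `formatDiagnostic` is a hypothetical helper, not the method changed by this commit, and it assumes the severity string is a plain prefix such as "error: ".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (name and signature are assumptions for illustration).
// Prepends `severity` to `message`, after erasing the first occurrence of
// `severity` that the message already contains.
std::string formatDiagnostic(const std::string &severity, std::string message) {
  assert(!severity.empty() && "severity prefix must not be empty");
  const std::string::size_type pos = message.find(severity);
  if (pos != std::string::npos)
    message.erase(pos, severity.size()); // only the first redundant instance
  return severity + message;
}
```

With the message from the example, `formatDiagnostic("error: ", "<EXPR>:3:1: error: cannot find 'bogus' in scope")` yields a string with a single "error: " prefix. Any further redundant instances are deliberately left alone.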
rdar://114203423
Explicitly pass the libLTO path. This fixes a failure in swift-ci where
libLTO was being picked up from the system instead; the system copy was
an older version and caused issues.
rdar://117474861
Add a clang flag, "-ftrivial-auto-var-init-max-size=", so that clang
skips auto-initializing a variable if the auto-init memset size exceeds
the flag setting (in bytes). Note that this skipping doesn't apply to
runtime-sized variables such as VLAs.
Considerations: "__attribute__((uninitialized))" can be used to manually
opt variables out. However, a big codebase can contain thousands of
large variables (e.g., >=1KB; most of them are arrays used as buffers).
Manually opting them out one by one is not efficient.
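As a rough illustration of the two approaches, consider the sketch below. The function names, the 4096-byte buffer size, and the 1024-byte threshold in the comments are assumptions chosen for the example; the flag itself is only available in a toolchain that includes this change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// `uninitialized` is a Clang attribute; expand to nothing on other compilers.
#if defined(__clang__)
#define NO_AUTO_INIT __attribute__((uninitialized))
#else
#define NO_AUTO_INIT
#endif

// Built with -ftrivial-auto-var-init=zero alone, this 4096-byte buffer would
// be zeroed on every call. With -ftrivial-auto-var-init-max-size=1024 added,
// the description above suggests the auto-init would be skipped, because the
// memset size (4096) exceeds the threshold.
std::size_t copyViaScratch(const char *src, char *dst, std::size_t n) {
  char scratch[4096];
  assert(n <= sizeof(scratch));
  std::memcpy(scratch, src, n);
  std::memcpy(dst, scratch, n);
  return n;
}

// The manual alternative: opt this one variable out with the attribute.
std::size_t copyViaOptedOutScratch(const char *src, char *dst, std::size_t n) {
  NO_AUTO_INIT char scratch[4096];
  assert(n <= sizeof(scratch));
  std::memcpy(scratch, src, n);
  std::memcpy(dst, scratch, n);
  return n;
}
```

Both functions behave identically; the difference is only in whether the compiler emits the initializing memset, and in how many source annotations are needed to get there.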
This fixes some cases of missing debuginfo caused by an interaction
between:
f0d66559ea,
which drops the identifier from a DICompositeType in the module
containing its vtable,
and
a61f5e3796,
which causes ThinLTO to import composite types as declarations when they
have an identifier.
If a virtual class's DICompositeType has no identifier due to the first
change, and contains a nested anonymous type which does have an
identifier, then the second change can cause ThinLTO to output the
class's DICompositeType as a type definition that links to a
non-defining declaration for the nested type. Since the nested anonymous
type does not have a name, debuggers are unable to find the definition
for the declaration.
Repro case:
```
cat > a.h <<EOF
class A {
public:
A();
virtual ~A();
private:
union {
int val;
};
};
EOF
cat > a.cc <<EOF
#include "a.h"
A::A() { asm(""); }
A::~A() {}
EOF
cat > main.cc <<EOF
#include "a.h"
int main(int argc, char **argv) {
A a;
return 0;
}
EOF
clang++ -O2 -g -flto=thin -mllvm -force-import-all main.cc a.cc
gdb ./a.out -batch -ex 'pt /rmt A'
```
The gdb command outputs:
```
type = class A {
private:
union {
<incomplete type>
};
}
```
and dwarfdump -i a.out shows a DW_TAG_class_type for A with an
incomplete union type (note that there is also a duplicate entry with
the full union type that comes after).
```
< 1><0x0000001e> DW_TAG_class_type
DW_AT_containing_type <0x0000001e>
DW_AT_calling_convention DW_CC_pass_by_reference
DW_AT_name (indexed string: 0x00000007)A
DW_AT_byte_size 0x00000010
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000001
...
< 2><0x0000002f> DW_TAG_member
DW_AT_type <0x00000037>
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000007
DW_AT_data_member_location 8
< 2><0x00000037> DW_TAG_union_type
DW_AT_export_symbols yes(1)
DW_AT_calling_convention DW_CC_pass_by_value
DW_AT_declaration yes(1)
```
This change works around this by making ThinLTO always import full
definitions for anonymous types.
This bug is caused by parenthesized list initialization not being
implemented in `CodeGenFunction::EmitNewArrayInitializer(...)`.
Parenthesized list initialization of `struct`s with `operator new`
already works in Clang and is not affected by this bug.
Additionally, fix the test new-delete.cpp as it incorrectly assumes that
using parentheses with operator new to initialize arrays is illegal for
C++ versions >= C++17.
Fixes #68198
This commit corrects the address computation for objc_msgSend stubs.
Previously, the address computation was incidentally correct due to
objc_msgSend often being the first entry in the got section, resulting
in a 0 index. This commit ensures accurate address computation
regardless of the objc_msgSend stub's position in the got section.
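To see why an index of 0 can hide this kind of mistake, here is a minimal sketch. The function names and the 8-byte entry size are assumptions for illustration, and the unscaled variant is only one example of an index-handling bug, not necessarily the one fixed here.

```cpp
#include <cassert>
#include <cstdint>

// Address of entry `index` in a table of fixed-size entries starting at
// `gotBase`. The 8-byte default entry size is an assumption (64-bit target).
std::uint64_t gotEntryAddress(std::uint64_t gotBase, std::uint64_t index,
                              std::uint64_t entrySize = 8) {
  return gotBase + index * entrySize;
}

// Illustrative faulty variant: the index is applied without scaling.
// It agrees with gotEntryAddress() only when index == 0.
std::uint64_t gotEntryAddressUnscaled(std::uint64_t gotBase,
                                      std::uint64_t index) {
  return gotBase + index;
}
```

Both functions return the same address for index 0, so a test suite in which objc_msgSend always lands in the first slot would not tell them apart.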
Fixes `SmallString` summary provider, which was incorrectly producing the empty string.
Initially I thought the strings I was debugging were empty for unknown reasons, but
that was not the case.
This is inspired by
https://github.com/llvm/llvm-project/pull/77342#pullrequestreview-1814673242,
and is split off of same with some differences in style.
A select is a vmerge.vv with the additional cost of materializing the
bitmask vector in a vreg. All masks fit within a single vector register
(e8 + m8 is the worst case), and thus our worst case cost should be
roughly 3 (2 scalar to produce the address, one vector load op). Given
most shuffles are small, and the mask will be instead produced by
LUI/ADDI + vmv.s.x or ADDI + vmv.s.x, using 2 as the default seems quite
reasonable. At worst, we're not going to be off by much.
The prior lowering scaled the cost of the bitmask with LMUL, which I
don't understand. At m1 it did use the same base cost of 2. (@lukel97
You wrote the original code here, anything I'm missing here?)
Fixes several of these:
```
[3370/3822] Building CXX object tools\lldb\source\Plugins\Process\U...lldbPluginProcessUtility.dir\NativeRegisterContextDBReg_x86.cpp.ob
C:\git\llvm-project\lldb\source\Plugins\Process\Utility\NativeRegisterContextDBReg_x86.h(23): warning C4589: Constructor of abstract class 'lldb_private::NativeRegisterContextDBReg_x86' ignores initializer for virtual base class 'lldb_private::NativeRegisterContextRegisterInfo'
C:\git\llvm-project\lldb\source\Plugins\Process\Utility\NativeRegisterContextDBReg_x86.h(23): note: virtual base classes are only initialized by the most-derived type
```
Fixes:
```
[3465/3822] Building CXX object tools\lldb\source\Plugins\SymbolFile\CTF\CMakeFiles\lldbPluginSymbolFileCTF.dir\SymbolFileCTF.cpp.obj
C:\git\llvm-project\lldb\source\Plugins\SymbolFile\CTF\SymbolFileCTF.cpp(606) : warning C4715: 'lldb_private::SymbolFileCTF::CreateType': not all control paths return a value
```
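For context, C4715 typically appears when a function returns from every case of a `switch` over an enum but has no return after the `switch`. The sketch below shows one common way to silence it; the enum and function are hypothetical and are not the code in `SymbolFileCTF::CreateType`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enum, for illustration only.
enum class Kind { Integer, Float };

const char *kindName(Kind kind) {
  switch (kind) {
  case Kind::Integer:
    return "integer";
  case Kind::Float:
    return "float";
  }
  // A value outside the listed enumerators would fall out of the switch.
  // Returning (or marking the path unreachable) here gives every control
  // path a defined result, which is what the warning asks for.
  return "unknown";
}
```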
Add support for specifying the logical SPIR-V target environment in the
triple as Vulkan. When compiling HLSL, this replaces the DirectX Shader
Model with a Vulkan environment instead.
Currently, the only supported combinations of SPIR-V version and Vulkan
environment are:
- Vulkan 1.2 and SPIR-V 1.5
- Vulkan 1.3 and SPIR-V 1.6
Fixes #70051
The tag name was long for an ABI tag. The name was also misleading: the
tag was first introduced in LLVM 18 in 2024, not in 2023.
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
The getHashValue() signature returns a value of type 'unsigned' while
the hash_code could only be implicitly converted to 'size_t'. Depending
on the C++ implementation, this may or may not be a narrowing
conversion.
On some platform/compiler combinations, this becomes a warning. To avoid
the warning (and better highlight the narrowing), do an explicit
conversion instead.
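A minimal sketch of the pattern, assuming a `getHashValue()` that must return `unsigned`. To stay self-contained it uses `std::hash` as a stand-in for `llvm::hash_code`; the function below is illustrative and is not the code touched by this change.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Stand-in for a getHashValue() whose signature returns 'unsigned'.
// std::hash (an assumption for this sketch) produces a std::size_t, which
// may be wider than unsigned, so the narrowing is spelled out explicitly.
unsigned getHashValue(const std::string &key) {
  const std::size_t fullHash = std::hash<std::string>{}(key);
  return static_cast<unsigned>(fullHash);
}
```

Writing the cast out keeps the truncation visible at the call site instead of leaving it to an implicit conversion that some compilers flag.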
Co-authored-by: Orest Chura <orest.chura@intel.com>
Add a HeaderOptions struct that can be used to configure commonly-used
load commands LC_ID_DYLIB, LC_LOAD_DYLIB, and LC_RPATH when setupDylib
creates a mach-o header.
This patch adds support for device_type on the acc.routine operation.
device_type can be specified on seq, worker, vector, gang and bind
information.
The support follows the same design as the one for compute
operations, data operations and the loop operation.
The goal of this PR is to fix an issue where the Module Analysis stage
is not able to complete processing of a really big LLVM source:
https://github.com/llvm/llvm-project/issues/76048.
There is an example of a bulky LLVM source:
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/test/SpecConstants/long-spec-const-composite.ll
Processing of this file with
`llc -mtriple=spirv64-unknown-unknown -O0 long-spec-const-composite.ll
-o long-spec-const-composite.spvt`
to produce SPIR-V output using the LLVM SPIR-V backend takes too long,
and I've never been able to see it actually complete. With the patch
from this PR applied, the elapsed time for me is ~30 sec.
The fix changes the underlying data structure to `std::set` to track
instructions with identical operands, replacing the existing approach of
the `findSameInstrInMS()` function.
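The general idea of using an ordered set to detect instructions with identical operands can be sketched as below. `InstrSignature` and `SeenInstructions` are hypothetical names, and encoding an instruction as a vector of integers is an assumption made for the example; the backend's real data structures differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Assumed encoding: opcode followed by operand identifiers.
using InstrSignature = std::vector<std::uint64_t>;

class SeenInstructions {
  std::set<InstrSignature> seen; // ordered by lexicographic comparison

public:
  // Returns true if `sig` was not seen before. Lookup and insertion take a
  // logarithmic number of comparisons, rather than a scan over every
  // previously recorded instruction.
  bool insertIfNew(const InstrSignature &sig) {
    return seen.insert(sig).second;
  }

  std::size_t size() const { return seen.size(); }
};
```

A linear search per instruction makes the whole pass quadratic in the number of instructions, which is consistent with the very long processing time reported above for a large input.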
The `irdl.base` op represents an attribute constraint that will check
that the base of a type or attribute is the expected one (e.g.
`IntegerType`).
Example:
```mlir
irdl.dialect @cmath {
irdl.type @complex {
%0 = irdl.base "!builtin.integer"
irdl.parameters(%0)
}
irdl.type @complex_wrapper {
%0 = irdl.base @complex
irdl.parameters(%0)
}
}
```
The above program defines a `cmath.complex` type that expects a single
parameter, which is a type with base name `builtin.integer`, which is
the name of an `IntegerType` type.
It also defines a `cmath.complex_wrapper` type that expects a single
parameter, which is a type of base type `cmath.complex`.
Address review comments in #76709
Add `NoCD8` to class `ITy`, and rewrite the promoted instructions with
`ITy` to avoid incorrect encodings related to `NoCD8`.
Error message
```
*** Bad machine code: Illegal virtual register for instruction ***
- function: test__blsi_u32
- basic block: %bb.0 (0x7a61208)
- instruction: %5:gr32 = MOV32r0 implicit-def $eflags
- operand 0: %5:gr32
Expected a GR32_NOREX2 register, but got a GR32 register
```
Reported by RKSimon in #77433
The failure is b/c the compiler emits a MOV32r0 with a GR32 operand when
fast-isel is enabled.
```
// X86FastISel.cpp
Register SrcReg = fastEmitInst_(X86::MOV32r0, &X86::GR32RegClass)
```
However, before this patch, the compiler only allowed a GR32_NOREX
operand b/c MOV32r0 is a pseudo instruction. In this patch, we relax the
register class of the operand to GR32 b/c MOV32r0 is always expanded
to XOR32rr, which can use EGPR.
The bug was not introduced by #77433 but caught by it.
In particular, we have internal customers that would like to use nanf
and scalbnf.
The differences between various entrypoint files can be checked via:
$ comm -3 <(grep libc\.src path/to/entrypoints.txt | sort) \
<(grep libc\.src path/to/other/entrypoints.txt | sort)
These were fixed properly by f1f1875c18.
- Revert "[libc] temporarily set -Wno-shorten-64-to-32 (#77396)"
- Revert "[libc] make off_t 32b for 32b arm (#77350)"
This fixes missing inlined function names when formatting a frame whose
`Block` in `SymbolContext` is a lexical block (e.g.
`DW_TAG_lexical_block` in DWARF).
Summary:
The linker wrapper's job is to sort various embedded inputs into a list
of files that participate in a single link job. So far, this has been
completely 1-to-1, that is, each input file participates in exactly one
link job. However, support for AMD's target-id requires that one input
file may participate in multiple link jobs. For example, if given a
`gfx90a` static library and a `gfx90a:xnack+` object file input, we
should link the `gfx90a` target into the `gfx90a:xnack+` job. These are
considered separate CPUs that can be mutually linked more or less.
This patch adds the necessary logic to make this happen. It primarily
reworks the logic to copy relevant input files into a separate list. So,
it moves construction of the final list of link jobs into the extraction
phase. We also need to copy a file in the case that it is needed more
than once, as the entire workflow expects ownership of said file.
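The grouping described above can be sketched as follows. All names here (`TargetID`, `canLinkInto`, `buildLinkJobs`, `InputFile`) are hypothetical, and the compatibility rule is a simplified assumption: an input may join a job when the processors match and every feature the input pins is pinned identically by the job.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

struct TargetID {
  std::string processor;          // e.g. "gfx90a"
  std::set<std::string> features; // e.g. {"xnack+"}
};

// Splits "gfx90a:xnack+" into a processor and its feature settings.
TargetID parseTargetID(const std::string &text) {
  TargetID id;
  std::stringstream stream(text);
  std::getline(stream, id.processor, ':');
  std::string feature;
  while (std::getline(stream, feature, ':'))
    id.features.insert(feature);
  return id;
}

// Simplified rule (an assumption): same processor, and the job pins every
// feature that the input pins, with the same setting.
bool canLinkInto(const TargetID &input, const TargetID &job) {
  if (input.processor != job.processor)
    return false;
  for (const std::string &feature : input.features)
    if (job.features.count(feature) == 0)
      return false;
  return true;
}

struct InputFile {
  std::string name;
  std::string target;
};

// One job per distinct target; an input is copied into every compatible job,
// so a single file may appear in more than one list.
std::map<std::string, std::vector<std::string>>
buildLinkJobs(const std::vector<InputFile> &inputs) {
  std::map<std::string, std::vector<std::string>> jobs;
  for (const InputFile &input : inputs)
    jobs.emplace(input.target, std::vector<std::string>());
  for (auto &job : jobs) {
    const TargetID jobID = parseTargetID(job.first);
    for (const InputFile &input : inputs)
      if (canLinkInto(parseTargetID(input.target), jobID))
        job.second.push_back(input.name);
  }
  return jobs;
}
```

Given a `gfx90a` library and a `gfx90a:xnack+` object, the `gfx90a:xnack+` job receives both files while the plain `gfx90a` job receives only the library, so the mapping is no longer 1-to-1.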
Since most of the operations in the `math` dialect don't have
low-precision implementations, add the -math-legalize-to-f32 pass that
goes through and brackets low-precision math functions (like `math.sin
%0 : f16`) with `arith.extf` and `arith.truncf`. This preserves the
original semantics of the math operation but allows lowering to proceed.
Versions of this lowering are already implicitly present in some passes,
like ConvertGPUToROCDL. However, because those are implicit rewrites,
they hide the floating-point extension and truncation, preventing anyone
from writing passes that operate on those implicit extf/truncf pairs.
Exposing this legalization explicitly is needed to allow lowering 8-bit
floats on AMD GPUs, as the implementation of extf and truncf on that
platform requires the complex logic found in ArithToAMDGPU, which runs
before the GPU to ROCDL lowering.
Clean-up of the algorithm that assigns MC/DC True/False control-flow
condition IDs when constructing an MC/DC decision region. This patch
creates a common API for setting/getting the condition IDs, making the
binary logical operator visitor functions much cleaner.
This patch also fixes issue
https://github.com/llvm/llvm-project/issues/77873 in which a record's
control flow map can be malformed due to an incorrect calculation of the
True/False condition IDs.
We do the same for the analogous transform in DAGCombine, but this case
was missed in the recent patch which added support for zext nneg.
Sorry for the lack of test coverage. Not sure how to exercise this piece
of logic. It appears to have only minimal impact on LIT tests (only
test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll),
and even then, the changes without it appear uninteresting. Maybe we
should remove this transform instead?
After the removal of the OpenMP early outlining MLIR pass in #67319, the
`EarlyOutliningInterface` stopped doing any useful work. It used to be
necessary to tie the name of the function from which a target region was
outlined to that new function, so it would be used when translating to
LLVM IR in place of the outlined function's name.
This is not necessary anymore, so this patch removes all references to
this interface and uses of the `omp.outline_parent_name` discardable
attribute in tests.
When merging blocks, if the previous block has no branch instruction
and has one successor, that successor may be an SEH landing pad, in
which case the block always raises an exception and never falls through
to the next block. We cannot merge the blocks in that case. isSuccessor
should be used to confirm that the previous block can fall through to
the next block.
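A minimal sketch of the check, using a hypothetical `Block` type in place of the real machine basic block class. Only the `isSuccessor` name comes from the description above; everything else is an assumption for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a basic block.
struct Block {
  bool endsWithBranch;
  std::vector<const Block *> successors;

  bool isSuccessor(const Block *candidate) const {
    return std::find(successors.begin(), successors.end(), candidate) !=
           successors.end();
  }
};

// `prev` may be merged with the block laid out after it only if it has no
// branch, has exactly one successor, and that successor really is `next`.
// Having a single successor is not enough: it could be a landing pad.
bool canMergeWithNext(const Block &prev, const Block &next) {
  if (prev.endsWithBranch || prev.successors.size() != 1)
    return false;
  return prev.isSuccessor(&next);
}
```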
The template function call CheckDescriptorEqInt(exitStat.get(), 127) is
deduced to have INT_T equal to std::int32_t instead of std::int64_t, but
the length descriptor points to 64-bit storage. The comparison does
not work on big-endian platforms.
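The effect can be reproduced outside the test harness with a small sketch. `storageEquals` and `hostIsLittleEndian` are hypothetical helpers, not the functions in the unit test, and the sketch assumes `int` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compares the first sizeof(INT_T) bytes of `storage` with `expected`,
// similar in spirit to a check templated on the integer type.
template <typename INT_T>
bool storageEquals(const void *storage, INT_T expected) {
  INT_T actual;
  std::memcpy(&actual, storage, sizeof(INT_T));
  return actual == expected;
}

// True when the least significant byte of a multi-byte integer comes first.
bool hostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char firstByte;
  std::memcpy(&firstByte, &probe, 1);
  return firstByte == 1;
}
```

With `std::int64_t stored = 127;`, the call `storageEquals(&stored, 127)` deduces `INT_T` as `int` and reads only the first four bytes. Those bytes hold 127 on a little-endian host but 0 on a big-endian one, whereas `storageEquals<std::int64_t>(&stored, 127)` reads all eight bytes and succeeds on both.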
Co-authored-by: Mark Danial <mark.danial@ibm.com>