Currently we don't allow the following definition:
```
Sections:
- Type: SectionHeaderTable
- Name: .foo
Type: SHT_PROGBITS
```
We report an error: "SectionHeaderTable can't be empty. Use 'NoHeaders' key to drop the section header table".
It was implemented in this way earlier, when `SectionHeaderTable`
was a dedicated key outside of the `Sections` list. And we did not
allow to select where the table is written.
Currently it makes sense to allow it, because a user might
want to place the default section header table at an arbitrary position,
e.g. before other sections. In this case it is not convenient and error prone
to require specifying all sections:
```
Sections:
- Type: SectionHeaderTable
Sections:
- Name: .foo
- Name: .strtab
- Name: .shstrtab
- Name: .foo
Type: SHT_PROGBITS
```
This patch allows empty SectionHeaderTable definitions.
Differential revision: https://reviews.llvm.org/D95341
Before the patch it was possible to trigger a constant bus
violation when folding immediates into a shrunk instruction.
The patch adds a check to enforce the legality of the new operand.
Differential Revision: https://reviews.llvm.org/D95527
NativeEnumInjectedSources.h needs PDBFile but relies on a
forward declaration of PDBFile in InjectedSourceStream.h.
This patch adds a forward declaration right in
NativeEnumInjectedSources.h.
While we are at it, this patch removes the one in
InjectedSourceStream.h, where it is unnecessary.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.
Reviewed By: wmi, wenlei
Differential Revision: https://reviews.llvm.org/D95547
In d2927f786e877410d90c1e6f0e0c7d99524529c5, I added patterns
to remove (and X, 31) from sllw/srlw/sraw shift amounts.
There is code in SelectionDAGISel.cpp that knows to use
computeKnownBits to fill in bits of the mask that were removed
by SimplifyDemandedBits based on bits being known zero.
The non-W shift patterns use immbottomxlenset which allows the
mask to have more than log2(xlen) trailing ones, but doesn't
have a call to computeKnownBits to fill in bits of the mask that may
have been cleared by SimplifyDemandedBits.
This patch copies code from X86 to handle more than log2(xlen)
bottom bits set and uses computeKnownBits to fill in missing bits
before counting.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D95422
We need at least 252 UniqAttributes now, which will soon overflow.
Actually with downstream backends we can easily use up the last few values.
So bump to uint16_t.
This change fixes two issues with building LLVM on Haiku. The first issue is
that LLVM requires wait4(), which on Haiku is hidden behind the _BSD_SOURCE
feature flag when using the --std=c++14 flag. Additionally, the wait4()
function is only available in libbsd.so, so this is now a dependency.
The other fix is that Haiku does not have the (non-standard) rusage.maxrss
member, so by default the used memory info will be set to 0 on this platform.
Reviewed By: sepavloff
Differential Revision: https://reviews.llvm.org/D87920
Patch by Niels Sascha Reedijk.
The remote offloading plugin's CMakeLists was trying to build if its
flag was enabled even if it didn't find gRPC/protobuf. The conditional
was wrong, it's fixed by this.
Differential Revision: https://reviews.llvm.org/D95574
This prevents needless reinitialization for clients that want to reuse a pass manager multiple times. A new `getRegisryHash` function is exposed by the context to give a rough indicator of when the context registry has changed.
Differential Revision: https://reviews.llvm.org/D95493
(ClangTidy configuration block hasn't been in any release, so we should be OK
to move it around like this)
Differential Revision: https://reviews.llvm.org/D95362
We cannot call LRM::unassign() if LRM::assign() was never called
before, these are symmetrical calls. There are two ways of
assigning a physical register to virtual, via LRM::assign() and
via VRM::assignVirt2Phys(). LRM::assign() will call the VRM to
assign the register and then update LiveIntervalUnion. Inline
spiller calls VRM directly and thus LiveIntervalUnion never gets
updated. A call to LRM::unassign() then asserts about inconsistent
liveness.
We have to note that not all callers of the InlineSpiller even
have LRM to pass, RegAllocPBQP does not have it, so we cannot
always pass LRM into the spiller.
The only way to get into that spiller LRE_DidCloneVirtReg() call
is from LiveRangeEdit::eliminateDeadDefs if we split an LI.
This patch refuses to reassign a LiveInterval created by a split
to workaround the problem. In fact we cannot reassign a spill
anyway as all registers of the needed class are occupied and we
are spilling.
Fixes: SWDEV-267996
Differential Revision: https://reviews.llvm.org/D95489
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.
Differential Revision: https://reviews.llvm.org/D91583
Make the #include "file" preprocessing directive begin its
search in the same directory as the file containing the directive,
as other preprocessors and our Fortran INCLUDE statement do.
Avoid current working directory for all source files except the original.
Resolve tests.
Differential Revision: https://reviews.llvm.org/D95481
This fully de-pessimizes the common case of no indirectbr's,
(where we don't actually need to do anything to preserve domtree)
and avoids domtree recomputation in the case there were indirectbr's.
Note that two indirectbr's could have a common successor, and not all
successors of an indirectbr's are meant to survive the expansion.
Though, the code assumes that an indirectbr's doesn't have
duplicate successors, those *should* have been deduplicated
by simplifycfg or something already.
We are allowed to store 128-bit-wide values using the q registers on AArch64.
GlobalISel was clamping the number of elements in vector stores into 64 bits
instead.
This results in some poor codegen like below:
https://godbolt.org/z/E56dq8
```
; SDAG uses a stp + q registers in both cases here.
define void @float(<16 x float> %val, <16 x float>* %ptr) {
store <16 x float> %val, <16 x float>* %ptr
ret void
}
define void @double(<8 x double> %val, <8 x double>* %ptr) {
store <8 x double> %val, <8 x double>* %ptr
ret void
}
```
This adds similar legalization for vector stores with s8 and s16 elements.
Differential Revision: https://reviews.llvm.org/D95107
D95466 dropped CUDA to build NVPTX deviceRTL and enabled it by default.
However, the building requires some libraries that are not available on non-CUDA
system by default, which could break the compilation. This patch disabled the
build by default. It can be enabled with `LIBOMPTARGET_BUILD_NVPTX_BCLIB=ON`.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D95556
Experimental, using non-existent DWARF support to use an expr for the
location involving an addr_index (to compute address + offset so
addresses can be reused in more places).
The global variable debug info had to be deferred until the end of the
module (so bss variables would all be emitted first - so their labels
would have the relevant section). Non-bss variables seemed to not have
their label assigned to a section even at the end of the module, so I
didn't know what to do there.
Also, the hashing code is broken - doesn't know how to hash these
expressions (& isn't hashing anything inside subprograms, which seems
problematic), so for test purposes this change just skips the hash
computation. (GCC's actually overly sensitive in its hash function, it
seems - I'm forgetting the specific case right now - anyway, we might
want to just use the frontend-known file hash and give up on optimistic
.dwo/.dwp reuse)
The Clang enable_if extension is mangled as an <extended-qualifier>,
which is supposed to contain <template-args>. However, we were
unconditionally emitting X/E around its arguments, neglecting the fact
that <expr-primary> should be emitted directly without the surrounding
X/E.
Differential Revision: https://reviews.llvm.org/D95488
Previously, we were emitting an extraneous X .. E in <template-arg>
around an <expr-primary> if the template argument was constructed from
an expression (rather than an already-evaluated literal value). In
such a case, we would then e.g. emit 'XLi0EE' instead of 'Li0E'.
We had one special-case for DeclRefExpr expressions, in particular, to
omit them the mangled-name without the surrounding X/E. However,
unfortunately, that special case also triggered for ParmVarDecl (a
subtype of VarDecl), and _incorrectly_ emitted 'L_Z .. E' instead of
the proper 'Xfp_E'.
This change causes mangleExpression itself to be responsible for
emitting X/E around non-primary expressions, which removes the
special-case, and corrects both these problems.
Differential Revision: https://reviews.llvm.org/D95487
The two operations have acted differently since Clang 8, but were
unfortunately mangled the same. The new mangling uses new "vendor
extended expression" syntax proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112
GCC had the same mangling problem, https://gcc.gnu.org/PR88115, and
will hopefully be switching to the same mangling as implemented here.
Additionally, fix the mangling of `__uuidof` to use the new extension
syntax, instead of its previous nonstandard special-case.
Adjusts the demangler accordingly.
Differential Revision: https://reviews.llvm.org/D93922
RISCVBaseInfo.h belongs to the MC layer, but the Pseudo instructions
are only used by the CodeGen layer. So it makes sense to keep this
table in the CodeGen layer.
More study has discovered this to not actually be useful: because
current C++20 implementations reject `#ifdef __VA_OPT__`, this can't
really be used as a feature-test mechanism. And it's not too hard to
detect __VA_OPT__ without this, for example:
#define THIRD_ARG(a, b, c, ...) c
#define HAS_VA_OPT(...) THIRD_ARG(__VA_OPT__(,), 1, 0, )
#if HAS_VA_OPT(?)
Partially reverts 0436ec2128c9775ba13b0308937238fc79673fdd.
Previously, Clang was able to mangle the Swift calling
convention but 'MicrosoftDemangle.cpp' was not able to demangle it.
Reviewed By: compnerd, rnk
Differential Revision: https://reviews.llvm.org/D95053
stella.stemenova mentioned in https://reviews.llvm.org/D93951 failures on Windows for this test.
I'm fixing the macro definitions and disabling the tests for python
versions lower than 3.7. I'll figure out that actual issue with
python3.6 after the buildbots are fine again.
With D92696, the Scudo Standalone GWP-ASan flag parsing was changed to
the new GWP-ASan optional one. We do not necessarily want this, as this
duplicates flag parsing code in Scudo Standalone when using the
GWP-ASan integration.
This CL reverts the changes within Scudo Standalone, and increases
`MaxFlags` to 20 as an addionnal option got us to the current max.
Differential Revision: https://reviews.llvm.org/D95542
These changes are intended to give code a path to move away from the GNU
,##__VA_ARGS__ extension, which is non-conforming in some situations and
which we'd like to disable in our conforming mode in those cases.
In Clang today, we parse the different attribute syntaxes
(__attribute__, __declspec, and [[]]) in a fairly rigid order. This
leads to confusion for users when they guess the order incorrectly,
and leads to bug reports like PR24559 or necessitates changes like
D94788.
This patch adds a helper function to allow us to more easily parse
attributes in arbitrary order, and then updates all of the places
where we would parse two or more different syntaxes in a rigid order to
use the helper method. The patch does not attempt to handle Microsoft
attributes ([]) because those are ambiguous with other code constructs
and we don't have any attributes that use the syntax.
When OMP_PLACES contains an invalid value, the warning informs the user
that the fallback is OMP_PLACES=threads, but the actual internal setting
is OMP_PLACES=cores and is detected as such with KMP_SETTINGS=1.
This patch informs the user that OMP_PLACES=cores is being used instead
of OMP_PLACES=threads.
Differential Revision: https://reviews.llvm.org/D95170
This patch adds the new algorithm for topology discovery using cpuid
leaf 1f. Only the new die level is detected and integrated into the
current affinity mechanisms including KMP_AFFINITY (granularity level
and compact/scatter algorithm), OMP_PLACES=dies, and KMP_HW_SUBSET.
Differential Revision: https://reviews.llvm.org/D95157
HWLOC 2.0 has numa nodes as separate children and are not in the main
parent/child topology tree anymore. This change takes this into
account. The main topology detection loop in the create_hwloc_map()
routine starts at a hardware thread within the initial affinity mask and
goes up the topology tree setting the socket/core/thread labels
correctly.
This change also introduces some of the more generic changes that the
future kmp_topology_t structure will take advantage of including a
generic ratio & count array (finding all ratios of topology layers like
threads/core cores/socket and finding all counts of each topology
layer), generic radix1 reduction step, generic uniformity check, and
generic printing of topology (en_US.txt)
Differential Revision: https://reviews.llvm.org/D95156