The libz compression library on SystemZ by default makes use of the
platform's hardware-accelerated compression facility. This is much
faster than the regular software implementation, but often results in
slightly different outputs. This causes failures with the
compressed-debug-level test case.
To fix this, run this test while setting the DFLTCC environment
variable to zero, which prevents use of hardware compression and falls
back to the software implementation. (This should not have any effect
on other platforms.)
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D149273
When running the LLD test suite on a big-endian host, the
COFF/pdb-framedata.yaml test case currently fails.
As it turns out, this is because code in DebugSHandler::finish
intended to relocate RvaStart entries of FDO records does not
work correctly when compiled for a big-endian host.
Fixed by always reading file data in little-endian mode.
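As a rough illustration of the idea (not the actual LLD change), the sketch
below decodes and re-encodes a 32-bit field byte by byte, so the result does
not depend on host endianness. The helper names readLE32, writeLE32 and
relocateRva are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decode a 32-bit value stored in little-endian byte
// order. Assembling the value from individual bytes gives the same result
// on little-endian and big-endian hosts.
inline std::uint32_t readLE32(const unsigned char *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: store a 32-bit value in little-endian byte order.
inline void writeLE32(unsigned char *p, std::uint32_t v) {
  p[0] = static_cast<unsigned char>(v & 0xFF);
  p[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
  p[2] = static_cast<unsigned char>((v >> 16) & 0xFF);
  p[3] = static_cast<unsigned char>((v >> 24) & 0xFF);
}

// Hypothetical stand-in for relocating an RVA-like field in file data:
// read it as little-endian, add the base, and write it back as little-endian.
inline void relocateRva(unsigned char *field, std::uint32_t base) {
  writeLE32(field, readLE32(field) + base);
}
```

A plain load of the field through a uint32_t pointer would instead interpret
the bytes in host order, which is the kind of mistake that only shows up on a
big-endian host.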
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D149268
We were previously using the condition as the mask. By the semantics
of VP operations, that means any lane where the condition is false
yields poison rather than the false operand.
Use an all ones mask instead.
No tests are affected because RISC-V drops the mask when lowering.
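The difference can be seen in a simplified scalar model, given here only as a
sketch. Lane, maskedSelect and the poison flag are hypothetical names used to
model "masked-off lanes are poison"; this is not the RISC-V lowering code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of one vector lane: either a value or poison.
struct Lane {
  bool poison;
  int value;
};

// Hypothetical model of a masked, lane-wise select. Lanes whose mask bit is
// false produce poison, mirroring the VP rule described above.
template <std::size_t N>
std::array<Lane, N> maskedSelect(const std::array<bool, N> &cond,
                                 const std::array<int, N> &onTrue,
                                 const std::array<int, N> &onFalse,
                                 const std::array<bool, N> &mask) {
  std::array<Lane, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    if (!mask[i]) {
      out[i] = Lane{true, 0}; // masked-off lane: poison
      continue;
    }
    out[i] = Lane{false, cond[i] ? onTrue[i] : onFalse[i]};
  }
  return out;
}
```

Passing the condition as the mask makes every false lane poison, while an
all-true mask lets those lanes take the false operand.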
Reviewed By: fakepaper56
Differential Revision: https://reviews.llvm.org/D149310
In order to use clang-tidy for modules, version 17 is required. Some of the
development fixes haven't been backported. This adds the new version to
the CI so it can be used in a follow-up patch.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D148831
This was part of the N extension, which didn't make it into version
1.12 of the privileged specification.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D149314
The module validation script of D144994 validates whether the contents of
an include match its module. An include is the set of files matching the
pattern:
- foo
- foo/*.
- __fwd/foo.h
Several declarations of the stream headers are in the header iosfwd.
This causes issues when using the validation script. Adding iosfwd to the set
of matching files gives too many declarations. For example, when
validating the fstream header it will pull in declarations of the
istream header. Instead of writing a set of filters, the headers are
granularized into smaller headers containing the expected declarations.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D148927
In D145589, we made the std::bind placeholders inline constexpr to
satisfy C++17. It turns out that this causes ODR violations since the
shared library provides strong definitions for those placeholders, and
the linker on Windows actually complains about this.
Fortunately, C++17 only encourages implementations to use `inline constexpr`;
it doesn't force them. So instead, we unconditionally define the placeholders
as `extern const`, which avoids the ODR violation and is indistinguishable
from `inline constexpr` for most purposes, since the placeholders are
empty types anyway.
Note that we could also go back to the pre-D145589 state of defining them
as non-inline constexpr variables in C++17, however that is definitely
non-conforming since that means the placeholders have different addresses
in different TUs. This is all a bit pedantic, but all in all I feel that
`extern const` provides the best bang for our buck, and I can't really
find any downsides to that solution.
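A minimal sketch of the `extern const` pattern, collapsed into one file for
illustration. The demo namespace, the placeholder template and addressOfFirst
are hypothetical stand-ins, not the libc++ sources.

```cpp
#include <cassert>

namespace demo {

// Hypothetical empty placeholder type.
template <int N> struct placeholder {};

// What a header would contain: a declaration only, so every TU that
// includes it refers to the same object.
extern const placeholder<1> _1;

// What exactly one source file (the shared library, in the scenario above)
// would contain: the single strong definition.
const placeholder<1> _1{};

// Helper used to observe the object's address from another function.
inline const placeholder<1> *addressOfFirst() { return &_1; }

} // namespace demo
```

Because the header only declares the object, there is a single definition and
a single address, which is the property the non-inline constexpr variant lacks.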
Differential Revision: https://reviews.llvm.org/D149292
This commit moves the CFGMST.h file into the include directory. The
implemented algorithm can be helpful for downstream projects that
want to use the PGO data in a non-standard way.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D149336
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
This ended up being fixed separately by @rsmith in 1e43349e3 in a
better/correct way. This patch adds the tests from the original:
although they are reasonably covered in his patch, explicit versions seem
to have value here.
Additionally, this adds a release note for 1e43349e3.
SCEV expressions no longer try to preserve LCSSA form. SCEV
construction will try to look through LCSSA phi nodes. As such,
we also no longer need to limit this special-case fold.
Sometimes a phi can both be trivial and match the
createNodeFromSelectLikePHI() fold. In that case it is generally
more profitable to look through the phi node.
Enable -fsanitize=kernel-memory support in Clang.
The x86_64 ABI requires that shadow_origin_ptr_t must be returned via a
register pair, and the s390x ABI requires that it must be returned via
memory pointed to by a hidden parameter. Normally Clang takes care of
the ABI, but the sanitizers run long after it, so unfortunately they
have to duplicate the ABI logic.
Therefore add a special case for SystemZ and manually emit the
s390x-ABI-compliant calling sequences. Since it's only 2 architectures,
do not create a VarArgHelper-like abstraction layer.
The kernel functions are compiled with the "packed-stack" and
"use-soft-float" attributes. For the "packed-stack" functions, it's not
correct for copyRegSaveArea() to copy 160 bytes of shadow and origins,
since the save area is dynamically sized. Things are greatly simplified
by the fact that the vararg "use-soft-float" functions use precisely
56 bytes in order to save the argument registers to where va_arg() can
find them.
Make copyRegSaveArea() copy only 56 bytes in the "use-soft-float" case.
The "packed-stack" && !"use-soft-float" case has no practical uses at
the moment, so leave it for the future.
Add tests.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D148596
We no longer try to preserve LCSSA form in SCEV representation:
Nowadays, we look through LCSSA PHI nodes directly during SCEV
construction. As such, this separate special case in
getSCEVAtScope() is no longer needed.
/data/llvm-project/mlir/unittests/Analysis/Presburger/UtilsTest.cpp:39:17: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
auto merge = [this](unsigned i, unsigned j) -> bool { return true; };
^~~~
/data/llvm-project/mlir/unittests/Analysis/Presburger/UtilsTest.cpp:52:17: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
auto merge = [this](unsigned i, unsigned j) -> bool { return true; };
^~~~
2 errors generated.
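The diagnostic above points at a capture that the lambda body never uses. A
minimal sketch of the usual remedy is to drop the capture; the Fixture type
below is a hypothetical stand-in for the test class, not the real unit test.

```cpp
#include <cassert>

// Hypothetical stand-in for the test fixture in the diagnostic above.
struct Fixture {
  bool run() const {
    // No capture list entry: the body does not touch any member, so
    // capturing 'this' would trigger -Wunused-lambda-capture.
    auto merge = [](unsigned, unsigned) -> bool { return true; };
    return merge(0u, 1u);
  }
};
```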
Function pointers can be compared for (in)equality, but LE, GE, LT,
and GT opcodes should emit an error and abort.
Differential Revision: https://reviews.llvm.org/D149154
For each unused-include/missing-include diagnostic, we provide a fix-all
alternative.
This patch also adds LSP ChangeAnnotation support.
Differential Revision: https://reviews.llvm.org/D147684
The only way known bits could help identify a known power of two is if
it knows exactly which power of two it is, i.e. if it is a known
constant. But in that case the value should have been simplified to a
constant already. So save some compile time by not calling
computeKnownBits.
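The reasoning can be checked with a small model, given as a sketch under
assumptions. KnownBitsModel and the helper names are hypothetical and only
mimic the idea of "bits known zero / bits known one" on 32-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: 'zero' holds bits known to be 0, 'one' holds bits
// known to be 1. Bits in neither set are unknown.
struct KnownBitsModel {
  std::uint32_t zero;
  std::uint32_t one;
};

constexpr bool isSingleBit(std::uint32_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Every bit is known, so the value is a constant.
constexpr bool isConstant(KnownBitsModel k) {
  return (k.zero | k.one) == 0xFFFFFFFFu;
}

// Known bits alone prove "power of two" only when exactly one bit is known
// one and all remaining bits are known zero. Any unknown bit could add a
// second set bit.
constexpr bool provesPowerOfTwo(KnownBitsModel k) {
  return isSingleBit(k.one) && k.zero == ~k.one;
}
```

In this model, provesPowerOfTwo can only hold when isConstant also holds,
which matches the argument that the query adds nothing beyond constant folding.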
Differential Revision: https://reviews.llvm.org/D149325
The code closely follows the X86 back-end. Applications that make heavy
use of {i64, i64} returns, which use two registers, strongly benefit from the
reduced number of SelectionDAG fallbacks.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148346
We should be checking the current BO here, not the nested one. If
the current BO has nowrap flags (and is UB on poison), then we'll
fetch both operand SCEVs of that BO. We'll check the nested BO
on the next iteration of the do/while loop.
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9793:3: error: variable 'MaddOpc' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized]
default:
^~~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9854:25: note: uninitialized use occurs here
Madd->setDesc(TII.get(MaddOpc));
^~~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9791:19: note: initialize the variable 'MaddOpc' to silence this warning
unsigned MaddOpc;
^
= 0
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9793:3: error: variable 'AddOpc' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized]
default:
^~~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9862:46: note: uninitialized use occurs here
BuildMI(*MF, MIMetadata(Root), TII.get(AddOpc), DstReg)
^~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9790:18: note: initialize the variable 'AddOpc' to silence this warning
unsigned AddOpc;
^
= 0
2 errors generated.
Added a simple normalize function to DivisionRepr, along with a simple unit test.
Added a normalizediv call to DivisionRepr's removeDuplicateDivs function, which now eliminates divs that become identical after GCD normalization.
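A minimal sketch of what GCD normalization of a floor division can look like,
assuming a division of the form floor((c0*x0 + ... + constant) / den) with a
positive denominator. DivModel, gcdOf, floorDiv and normalize are hypothetical
names, and this is not the MLIR implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one division: num holds the variable coefficients
// followed by the constant term as its last entry.
struct DivModel {
  std::vector<std::int64_t> num;
  std::int64_t den;
};

inline std::int64_t gcdOf(std::int64_t a, std::int64_t b) {
  a = a < 0 ? -a : a;
  b = b < 0 ? -b : b;
  while (b != 0) {
    std::int64_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

// Division rounding toward negative infinity (C++ '/' truncates toward zero).
inline std::int64_t floorDiv(std::int64_t a, std::int64_t b) {
  std::int64_t q = a / b;
  return (a % b != 0 && ((a < 0) != (b < 0))) ? q - 1 : q;
}

// Divide the variable coefficients and the denominator by their common GCD.
// The constant term is floor-divided, which preserves the value of the
// overall floor division for integer variable values.
inline void normalize(DivModel &d) {
  if (d.num.empty() || d.den <= 0)
    return;
  std::int64_t g = d.den;
  for (std::size_t i = 0; i + 1 < d.num.size(); ++i)
    g = gcdOf(g, d.num[i]);
  if (g <= 1)
    return;
  for (std::size_t i = 0; i + 1 < d.num.size(); ++i)
    d.num[i] /= g;
  d.num.back() = floorDiv(d.num.back(), g);
  d.den /= g;
}
```

After normalization, two divisions that differ only by a common factor compare
equal, so a duplicate-removal pass that compares representations can merge them.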
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D147381
"vpmaddwd + vpaddd" can be combined into vpdpwssd, and the latency is
reduced after the combination. However, when vpdpwssd is on a critical path,
the combination yields less ILP. This happens when vpdpwssd is in a loop: the
vpmaddwd can be executed in parallel across iterations, while vpdpwssd
has a data dependency between iterations. If vpaddd is on a critical path
while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd
+ vpaddd".
This patch uses the machine combiner framework to decide whether to
perform the "vpmaddwd + vpaddd" combination. A typical code example is
shown below.
```
#include <immintrin.h>

// Accumulate the pairwise 16-bit multiply-add of b and p[i] into c.
__m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) {
  for (int i = 0; i < cnt; ++i) {
    __m256i a = p[i];
    __m256i m = _mm256_madd_epi16(b, a);
    c = _mm256_add_epi32(m, c);
  }
  return c;
}
```
Differential Revision: https://reviews.llvm.org/D148980