Part of https://bugs.llvm.org/show_bug.cgi?id=41734
LTO can drop externally available definitions. Such AssociatedSymbol is
not associated with a symbol. ELFWriter::writeSection() will assert.
Allow a SHF_LINK_ORDER section to have sh_link=0.
We need to give sh_link a syntax, a literal zero in the linked-to symbol
position, e.g. `.section name,"ao",@progbits,0`
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D72899
Fixes test-suite compile failure caused by 8dfb5d7.
While I'm in the area, add some more test coverage to related
operations, to make sure we aren't missing any other patterns.
Archives can now be specified as input files the same way that object
files are. Archives will always be linked after all objects (regardless
of the relative order of the inputs) but before any dynamic libraries or
process symbols.
This patch also relaxes matching for slice triples in
StaticLibraryDefinitionGenerator in order to support this feature:
Vendors need not match if the source vendor is unknown.
Currently, ArgPromotion may leave metadata uses of promoted values,
which will end up in the wrong function, creating invalid IR.
PR33641 fixed this for dead arguments, but it can be also be triggered
arguments with users that are promoted (see the updated test case).
We also have to drop uses to them after promoting them. We need to do
this after dealing with the non-metadata uses, so I also moved the empty
use case to the loop that deals with updating the arguments of the new
function.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D85127
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).
Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.
Differential Revision: https://reviews.llvm.org/D81682
Instructions should not be scheduled across ENDBR instructions, as this would result in the ENDBR being displaced, breaking the parity needed for the Indirect Branch Tracking feature of CET.
Currently, the X86IndirectBranchTracking pass is later than the instruction scheduling in the pipeline, what causes the bug to be unnoticeable and very hard (if not unfeasible) to be triggered while compiling C files with the standard LLVM setup. Yet, for correctness and to prevent issues in future changes, the compiler should prevent the such scheduling.
Differential Revision: https://reviews.llvm.org/D84862
This allows us to remove the (depth violating) code in getFauxShuffleMask where we were combining the OR(SHUFFLE,SHUFFLE) shuffle inputs as well, and not just the OR().
This is a minor step toward being able to shuffle combine from/to SELECT/BLENDV as a faux shuffle.
The strictfp attribute is required on all function calls in a function
that is itself marked with the strictfp attribute. The IRBuilder knows
this and has a method for adding the attribute to function call instructions.
If a function being called has the strictfp attribute itself then the
IRBuilder will refuse to add the attribute to the calling instruction
despite being asked to add it. Eliminate this error.
Differential Revision: https://reviews.llvm.org/D84878
This adds an isel pattern and special XOR8rr_NOREX instruction
to enable the use of h-registers for __builtin_parity. This avoids
a copy and a shift instruction. The NOREX instruction is in case
register allocation doesn't use the matching l-register for some
reason. If a R8-R15 register gets picked instead, we won't be
able to encode the instruction since an h-register can't be used
with a REX prefix.
Fixes PR46954
A JSON->TensorSpec utility we will use subsequently to specify
additional outputs needed for certain training scenarios.
Differential Revision: https://reviews.llvm.org/D84976
Freeze always returns a defined value. This also prevents msan from
checking the input shadow, which happened because freeze wasn't
explicitly visited.
Differential Revision: https://reviews.llvm.org/D85040
In some cases, it seems like we can get rid of unnecessary s/umins by
using information from the loop guards (unless I am missing something).
One place where this seems to be helpful in practice is when computing
loop trip counts. This patch just changes howManyGreaterThans for now.
Note that this requires a loop for which we can check 'is guarded'.
On SPEC2000/SPEC2006/MultiSource, there are some notable changes for
some programs in the number of loops unrolled and trip counts computed.
```
Same hash: 179 (filtered out)
Remaining: 58
Metric: scalar-evolution.NumTripCountsComputed
Program base patch diff
test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0%
test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0%
test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3%
test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9%
test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7%
test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9%
test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8%
test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7%
test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6%
test-suite...count/automotive-bitcount.test 41.00 42.00 2.4%
test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4%
test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3%
test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8%
test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7%
Metric: loop-unroll.NumUnrolled
Program base patch diff
test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0%
test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6%
test-suite...count/automotive-bitcount.test 3.00 4.00 33.3%
test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3%
test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3%
test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0%
test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8%
test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3%
test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0%
test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5%
test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0%
test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3%
test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0%
test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4%
Geomean difference 7.6%
```
Fixes https://bugs.llvm.org/show_bug.cgi?id=46939
Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D85046
This patch stops unconditionally transforming FSUB(-0,X) into an FNEG(X) while building the DAG. There is also one small change to handle the new FSUB(-0,X) similarly to FNEG(X) in the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D84056
`DenseMapAPIntKeyInfo` is now located in `lib/IR/LLVMContextImpl.h`.
Moved it into `include/ADT/DenseMapInfo.h` to use it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85131
The 1st try at this (rG2265d01f2a5b) exposed what looks like
unspecified behavior in C/C++ resulting in test variations.
The arguments to BinaryOperator::CreateAnd() were both IRBuilder
function calls, and the order in which they execute determines
the order of the new instructions in the IR. But the order of
function arg evaluation is not fixed by the rules of C/C++, so
depending on compiler config, the test would fail because the
test expected a single fixed ordering of instructions.
Original commit message:
I tried to use m_Deferred() on this, but didn't find
a clean way to do that.
http://bugs.llvm.org/PR46955https://alive2.llvm.org/ce/z/2h6QTq
The offsets field should be omitted when the 'OffsetEntryCount' entry is
specified to be 0.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D85006
There were various hacks used to try to avoid making s1 SGPR vs. s1
VCC ambiguous after constraining the register before we had a strategy
to deal with this. This also attempted to handle undef operands, which
are now illegal gMIR.
Use pad with undef and unmerge with unused results. This is annoyingly
similar to several other places in LegalizerHelper, but they're all
slightly different.
We already do this on AVX (+ for ZERO_EXTEND_VECTOR_INREG), but this enables it for all SSE targets - we attempted something similar back at rL357057 but hit issues with the ZERO_EXTEND_VECTOR_INREG handling (PR41249).
I'm still looking at the vector-mul.ll regression - which is due to 32-bit targets performing the load as a f64, resulting in the shuffle combiner thinking it has to create a shuffle in the float domain.
Fixes a regression caused by D82439, in which IT blocks were no longer being
generated when -Oz is present. This was due to the CPSR register being marked as
dead, while this case was not accounted for.
Differential Revision: https://reviews.llvm.org/D83667
This patch does the following:
1) Starts using YAML macro to reduce the number of YAML documents in tests.
2) Adds `#` before 'RUN'/`CHECK` lines in a few tests where it is missing.
3) Removes unused YAML keys.
4) Starts using `ENTSIZE=<none>` to simplify tests (see D84526).
5) Removes trailing white spaces in a few places.
Differential revision: https://reviews.llvm.org/D85013
Not passing --clang would result in a python exception after this change:
(TypeError: expected str, bytes or os.PathLike object, not NoneType)
because the --clang argument default was only being populated in the
initial argument parsing pass but not later on.
Fix this by adding an argparse callback to set the default values.
Reviewed By: vitalybuka, MaskRay
Differential Revision: https://reviews.llvm.org/D84511
The check-* targets run ${Python3_EXECUTABLE} $BUILD/bin/llvm-lit, but
running `./bin/llvm-lit $ARGS` from the build directory currently always
uses "python" to run llvm-lit. On most systems this will be python2.7 even
if we found python3 at CMake time.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D84625
We have a `findSectionByName` helper that tries to find a section
by it name. It is used in a few places, but never tested.
I'd like to reuse this helper for a different place.
For this, I've changed it to return Expected<> and now it
doesn't use `unwrapOrErr` anymore. It also now a member of
Dumper class and might report warnings.
Differential revision: https://reviews.llvm.org/D84651
It implements an approach suggested in the D84398 thread.
With it the following:
```
Sections:
- Name: .bar
Type: SHT_PROGBITS
Offset: [[MACRO=<none>]]
```
works just like the `Offset` key was not specified.
It is useful for tests that want to have a default value for a field and to
have a way to override it at the same time.
Differential revision: https://reviews.llvm.org/D84526
No widening decisions will be computed for instructions outside the
loop. Do not try to get a widening decision. The load/store will be just
a scalar load, so treating at as normal should be fine I think.
Fixes PR46950.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D85087
This patch makes it possible to handle nonnull attribute violation at callsites in AAUndefinedBehavior.
If null pointer is passed to callee at a callsite and the corresponding argument of callee has nonnull attribute, the behavior of the callee is undefined.
In this patch, violations of argument nonnull attributes is only handled.
But violations of returned nonnull attributes can be handled and I will implement that in a follow-up patch.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84733
The patch restricts DIEDelta::SizeOf() to accept only DWARF forms that
are actually used in the LLVM codebase. This should make the use of the
class more explicit and help to avoid issues similar to fixed in D83958
and D84094.
Differential Revision: https://reviews.llvm.org/D84095
DIELabel can emit only 32- or 64-bit values, while it was created in
some places with DW_FORM_udata, which implies emitting uleb128.
Nevertheless, these places also expected to emit U32 or U64, but just
used a misleading DWARF form. The patch updates those places to use more
appropriate DWARF forms and restricts DIELabel::SizeOf() to accept only
forms that are actually used in the LLVM codebase.
Differential Revision: https://reviews.llvm.org/D84094
DIELocList is used with a limited number of DWARF forms, see the only
place where it is instantiated, DwarfCompileUnit::addLocationList().
The patch marks the unexpected execution path in DIELocList::SizeOf()
as unreachable, to reduce ambiguity.
Differential Revision: https://reviews.llvm.org/D84092