Summary:
Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.
Vector-to-vector transformations for unrolling and lowering to hardware vectors
can generate chains of structured vector operations (InsertSlicesOp,
ExtractSlicesOp and ShapeCastOp) between the producer of a hardware vector
value and its consumer. Because InsertSlicesOp, ExtractSlicesOp and ShapeCastOp
are structured, we can track the location (tuple index and vector offsets) of
the consumer vector value through the chain of structured operations to the
producer, enabling a much more powerful producer-consumer fowarding of values
through structured ops and tuple, which in turn enables a more powerful
TupleGetOp folding transformation.
Reviewers: nicolasvasilache, aartbik
Reviewed By: aartbik
Subscribers: grosul1, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76889
This makes it closer to how one would run the tests by hand, and it is
also closer to how the SSHExecutor runs the tests remotely. It also
allows using shell builtins in .sh.cpp tests when using %{exec}.
This patch fixes a crash that happens on the DWARF expression evaluator
when trying to access the top of the stack while it's empty.
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This is the Waymarking algorithm implemented as an independent utility.
The utility is operating on a range of sequential elements.
First we "tag" the elements, by calling `fillWaymarks`.
Then we can "follow" the tags from every element inside the tagged
range, and reach the "head" (the first element), by calling
`followWaymarks`.
Differential Revision: https://reviews.llvm.org/D74415
Based on the current discussion in https://llvm.org/PR45307, it seems
that it's legitimate for `temp_directory_path()` to return a path with
a trailing slash. Since `p.parent_path()` will never contain a trailing
slash, comparing it to the result of `temp_directory_path()` will fail
depending on whether `temp_directory_path()` returns a trailing slash
or not.
If canLowerByDroppingEvenElements indicates that the shuffle is a N:1 compaction pattern and the inputs are suitably sign/zero extended then we can use a chain of PACKSS/PACKUS to compact.
This helps avoid PSHUFB (and its mask load) for short shuffle chains, shuffle combining will still replace with a PSHUFB if we have enough shuffles as getFauxShuffleMask can recognise PACKSS/PACKUS chains.
Otherwise, trying to reproduce a failing filesystem test by copy-pasting
the command-line used and running that in the shell won't work, because
the shell will eat quoting around the define and we'll end up with a
non-stringized path in the .cpp file.
That way, local lit configuration files don't have to worry about
deep-copying the compiler instance of the test format, which is
arguably an implementation detail.
We pass the config to this method even though it is not used by the
current test format because this allows replacing the current test
format by other test formats that would require the config to add
new compile flags.
This reduces the complexity of our already complex global lit configuration,
and also avoids cluttering the compilation commands for all tests with
things that are only relevant to the filesystem tests.
Differential Revision: https://reviews.llvm.org/D76785
The script now includes extra info about command-line options used
when generating its advertisement heading, but we don't want that
here. This is a special-case because we have enhanced the check
lines (as noted in the 2nd comment line).
Previously, filesystem tests would require LIBCXX_FILESYSTEM_DYNAMIC_TEST_ROOT
to be present in the environment and to match the value provided when
compiling, as a macro. This has the problem that it only allows for the
filesystem tests to be run on the same machine they are created.
Instead, we create a temporary directory for each test. Technically,
this is tricky to do because we're relying on some of the code that
we're testing to do this. However, there's no other portable way of
creating temporary direcories in C++, so this is difficult to avoid.
Differential Revision: https://reviews.llvm.org/D76731
This is a speculative fix to silence the spurious C4129 warning that
some version of MSVC generate for the raw string literals in the changed
files.
Before disabling the warning (D76428), try a potential fix suggested in
the review.
The default GNU linker script uses the following idiom for the array
sections. I'll use .init_array here, but this also applies to
.preinit_array and .fini_array sections.
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(.init_array))
PROVIDE_HIDDEN (__init_array_end = .);
}
The C-library will take references to the _start and _end symbols to
process the array. This will make LLD keep the OutputSection even if there
are no .init_array sections. As the current check for RELRO uses the
section type for .init_array the above example with no .init_array
InputSections fails the checks as there are no .init_array sections to give
the OutputSection a type of SHT_INIT_ARRAY. This often leads to a
non-contiguous RELRO error message.
The simple fix is to a textual section match as well as a section type
match.
Differential Revision: https://reviews.llvm.org/D76915
This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.
A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below
define i32 @f(i32 %a, i1 %c) {
br i1 %c, label %true, label %false
true:
%a.255 = and i32 %a, 255
br label %exit
false:
br label %exit
exit:
%p = phi i32 [ %a.255, %true ], [ undef, %false ]
%f.1 = icmp eq i32 %p, 300
call void @use(i1 %f.1)
%res = and i32 %p, 255
ret i32 %res
}
In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the second use might not be
< 256.
Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.
Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D76931
This test covers the case where --gc-sections is used when there are
many sections. In particular, it ensures that there is no adverse
interaction with common and absolute symbols.
Reviewed by: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D76003
The test had a few style issues, and I noticed a hole in the coverage
(namely that the search order wasn't tested). Adding cases for the hole
in turn meant other cases weren't important.
The .so test case isn't important, since the code is shared code, so
I've removed it. Additionally, I've modified the usage of the "bar"
directive to show that an unneeded library must still be present, or the
link will fail, even though it isn't linked in.
Reviewed by: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D76851
For the PHI node
%1 = phi [%A, %entry], [%X, %latch]
it is incorrect to use SCEV of backedge val %X as an exit value
of PHI unless %X is loop invariant.
This is because exit value of %1 is value of %X at one-before-last
iteration of the loop.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D73181
Canonicalize the case when a scalar extracted from a vector is
truncated. Transform such cases to bitcast-then-extractelement.
This will enable erasing the truncate operation.
This commit fixes PR45314.
reviewers: spatel
Differential revision: https://reviews.llvm.org/D76983
This patch addresses the issue of the regression suite not running on windows
hardware. It changes the following things:
* add new dexter regression suite command to lit.cfg.py that makes use of the
clang-cl_vs2015 and dbgend builder and debuggers.
* sprinkle the new regressionsuite command through the feature and tool tests
that require them.
* mark certain problem tests on windows
* [revert fix] fixed darwin regression test failures by adding unsupported line
for system-darwin to command tests.
There's a couple of tests that fail (or pass) in unexpected ways on Windows.
Problem tests are both the penalty and perfect expect_watch_type.cpp tests.
Type information reporting parity is not possible a this time in dexter due to
the nature of how different debuggers report type information back to their users.
reviewers: Orlando
Differential Revision: https://reviews.llvm.org/D76609
Summary:
In method SelectionDAGBuilder::LowerStatepoint, array SI.GCTransitionArgs
is initialized from wrong part of ImmutableStatepoint class.
We copy gc args instead of transitions args.
Reviewers: reames, skatkov
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77075
Add a new llvm.amdgcn.ballot intrinsic modeled on the ballot function
in GLSL and other shader languages. It returns a bitfield containing the
result of its boolean argument in all active lanes, and zero in all
inactive lanes.
This is intended to replace the existing llvm.amdgcn.icmp and
llvm.amdgcn.fcmp intrinsics after a suitable transition period.
Use the new intrinsic in the atomic optimizer pass.
Differential Revision: https://reviews.llvm.org/D65088
For casts with constant range operands, we can use
ConstantRange::castOp.
Reviewers: davide, efriedma, mssimpso
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71938
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
Summary:
This patch removes delayed folding and replaces it with forward peeking.
Delayed folding was previously used as a solution to the problem that
declaration doesn't have a representation in the AST. For example following
code:
```
int a,b;
```
is expressed in the AST as:
```
TranslationUnitDecl
|-...
|-VarDecl `int a`
`-VarDecl `int b`
```
And in the syntax tree we need:
```
*: TranslationUnit
`-SimpleDeclaration
|-int
|-SimpleDeclarator
| `-a
|-,
|-SimpleDeclarator
| `-b
|-;
```
So in words, we need to create SimpleDeclaration to be a parent of
SimpleDeclarator nodes. Previously we used delayed folding to make sure SimpleDeclarations will be
eventually created. And in case multiple declarators requested declaration
creation, declaration range was extended to cover all declarators.
This design started to be hard to reason about, so we decided to replace it with
forward peeking. The last declarator node in the chain is responsible for creating
SimpleDeclaration for the whole chain. Range of the declaration corresponds to
the source range of the declarator node. Declarator decides whether its the last
one by peeking to the next AST node (see `isResponsibleForCreatingDeclaration`).
This patch does following:
* Removed delayed folding logic
* Tweaks Token.dumpForTests
* Moves getQualifiedNameStart inside BuildTreeVisitor
* Extracts BuildTreeVisitor.ProcessDeclaratorAndDeclaration
* Renames Builder.getDeclRange to Builder.getDeclarationRange and uses the
method in all places.
* Adds a bunch of tests
Reviewers: gribozavr2
Reviewed By: gribozavr2
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76922
Octeon branches (bbit0/bbit032/bbit1/bbit132) have an immediate operand,
so it is legal to have such replacement within
MipsBranchExpansion::replaceBranch().
According to the specification, a branch (e.g. bbit0 ) looks like:
bbit0 rs p offset // p is an immediate operand
if !rs<p> then branch
Without this patch, an assertion triggers in the method,
and the problem has been found in the real example.
Differential Revision: https://reviews.llvm.org/D76842