This parallels the binutils/BSD flag of the same name. Debugging
information is loaded to print line number information for symbols.
Defined symbols are symbolized by their section addresses, and undefined
symbols by their first text reloc with line info.
Differential Revision: https://reviews.llvm.org/D150987
This change matches a masked.stride.load from a mgather node whose index
operand is a strided sequence. We can reuse the VID matching from
build_vector lowering for this purpose.
Note that this duplicates the matching done at IR by
RISCVGatherScatterLowering.cpp. Now that we can widen gathers to a wider
SEW, I don't see a good way to remove this duplication. The only obvious
alternative is to move thw widening transform to IR, but that's a no-go
as I want other DAGs to run first. I think we should just live with the
duplication - particularly since the reuse is isSimpleVIDSequence means
the duplication is somewhat minimal.
We want to activate `llvm-header-guard` (#66477) but the current CMake
configuration includes paths that should be `isystem`. This PR restricts
the number of `-I` passed to the clang command line and correctly marks
the llvm libc include path as `isystem`.
__call_once is large and cluttered with #ifdef preprocessor guards. This
cleans it up a bit by using an exception guard instead of try-catch.
Differential Revision: https://reviews.llvm.org/D112319
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
This patch is part of a larger initiative aimed at fixing floating-point
`max` and `min` operations in MLIR:
https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
In this commit, we add conversion patterns for the newly introduced
operations `arith.minnumf` and `arith.maxnumf`. When converting to
`spirv.CL`, there is no need to insert additional guards to propagate
non-NaN values when one of the arguments is NaN because `CL` ops do
exactly the same. However, `GL` ops have undefined behavior when one of
the arguments is NaN, so we should insert additional guards to enforce
the semantics of Arith's ops.
This patch addresses the 1.5 task of the mentioned RFC.
I'm using clang-10 to build bolt which doesn't have moutline-atomics
option and though it doesn't do it. So test compiler for supporting it
before appending to the list of cxxflags.
Differential Revision: https://reviews.llvm.org/D159521
Construct entities that are associations from selectors in ASSOCIATE,
CHANGE TEAMS, and SELECT TYPE constructs do not have the ALLOCATABLE or
POINTER attributes, even when associating with allocatables or pointers;
associations from selectors in SELECT RANK constructs do have those
attributes.
Avoid false positives by requiring space after `/branch` command so the
action won't trigger on diffs that include filenames like
`.../BranchProbabilityInfo.cpp`.
vectorized node uses.
If the instruction is vectorized in many different vector nodes, it may
break the dependency analysis for gathered nodes with matched scalars.
Need to properly check the dependency between such gather nodes to avoid
cycle dependency.
Add pointer write functionality to MemoryAccess that is needed for implementing redirection manager. It also refactors the code a bit by introducing InProcessMemoryAccess class.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D157378
This reverts commit 775754e328.
Relanding with removing part of the LIT test. There seems to be operations
ordering indeterminism that is unrelated to my change. I will address this
issue separately.
The problem is that the when the "attach" command is initiated, the
ExecutionContext for the command has a process - it's the exited one
from the previour run. But the `attach wait` creates a new process for
the attach, and then errors out instead of interrupting when it finds
that its process and the one in the command's ExecutionContext don't
match.
This change checks that if we're returning a target from
GetExecutionContext, we fill the context with it's current process, not
some historical one.
Add support for fir.box_addr, fir.array_corr, fir.coordinate, fir.embox,
fir.rebox and fir.load.
1) Through the use of boolean `followBoxAddr` determine whether the
analysis should apply to the address of the box or the address wrapped
by the box.
2) Some asserts have been removed to allow for more SourceKinds though
the flow, in a particular SourceKind::Direct
3) getSource was a public method but the returned type (SourceKind) was
not public making it impossible to be called publicly
4) About 12 tests have been added to check for real Fortran scenarios
5) More tests will be added with HLFIR
6) A few TODOs have been identified and will need to be addressed in
follow-up patches. I felt that more changes would increase the
complexity of the patch.
This fixes a bug in my 928564caa5 that didn't get noticed in review. I found it when looking at the strided load case (upcoming patch), and realized the previous commit was buggy too.
p.s. Sorry for the slightly confusing test diff. I'd apparently used the wrong mask for the aligned positive test; it was actually unaligned. Didn't seem worthy of a separate precommit.
This patch places the finalization code for the RHS of a user-defined
assignment after the assignment code. The change only affects
standalone RegionAssignOp operations.
Instead of copying memory out of the PythonString (via a std::string)
and then using that to create a ConstString, it would make more sense to
just create the ConstString from the original StringRef in the first
place.
This is a pre-commit test of accessing the variable __stack_chk_guard
when the static relocation model is imposed on a module compiled with
pic enabled. It confirms issue
[#64999](https://github.com/llvm/llvm-project/issues/64999).
The intent is to update this test with the fix for the aforementioned
issue.
This adds the shifts and the immediate forms of the instructions that
were already supported.
There are still more instructions that can be predicated, but this is
the rest of what we had in our downstream.
This also allows tensor.empty in the "conversion" path of the sparse
compiler, further paving the way to
deprecate the bufferization.allocated_tensor() op.
These do not produce extension-specific ops and are handled via common
patterns for both the KHR and the NV coop matrix extension.
Also improve match failure reporting and error handling in type
conversion.
When AMOs are used to implement parallel reduction operations, typically the return value would be discarded.
This patch adds a peephole pass `RISCVDeadRegisterDefinitions`. It rewrites `rd` to `x0` when `rd` is marked as dead.
It may improve the register allocation and reduce pipeline hazards on CPUs without register renaming and OOO.
Comparison with GCC: https://godbolt.org/z/bKaxnEcec
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D158759
This patch and D154984 were discussed in
<https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839>.
Motivation
----------
D154984 removes the "Script:" section that lit prints along with a
test's output, and it makes -v and -a imply -vv. For example, after
D154984, the "Script:" section below is never shown, but -v is enough
to produce the execution trace following it:
```
Script:
--
: 'RUN: at line 1'; echo hello | FileCheck bogus.txt && echo success
--
Exit Code: 2
Command Output (stdout):
--
$ ":" "RUN: at line 1"
$ "echo" "hello"
# command output:
hello
$ "FileCheck" "bogus.txt"
# command stderr:
Could not open check file 'bogus.txt': No such file or directory
error: command failed with exit status: 2
--
```
In the D154984 review, some reviewers point out that they have been
using the "Script:" section for copying and pasting a test's shell
commands to a terminal window. The shell commands as printed in the
execution trace can be harder to copy and paste for the following
reasons:
- They drop redirections and break apart RUN lines at `&&`, `|`, etc.
- They add `$` at the start of every command, which makes it hard to
copy and paste multiple commands in bulk.
- Command stdout, stderr, etc. are interleaved with the commands and
are not clearly delineated.
- They don't always use proper shell quoting. Instead, they blindly
enclose all command-line arguments in double quotes.
Changes
-------
D154984 plus this patch converts the above example into:
```
Exit Code: 2
Command Output (stdout):
--
# RUN: at line 1
echo hello | FileCheck bogus-file.txt && echo success
# executed command: echo hello
# .---command stdout------------
# | hello
# `-----------------------------
# executed command: FileCheck bogus-file.txt
# .---command stderr------------
# | Could not open check file 'bogus-file.txt': No such file or directory
# `-----------------------------
# error: command failed with exit status: 2
--
```
Thus, this patch addresses the above issues as follows:
- The entire execution trace can be copied and pasted in bulk to a
terminal for correct execution of the RUN lines, which are printed
intact as they appeared in the original RUN lines except lit
substitutions are expanded. Everything else in the execution trace
appears in shell comments so it has no effect in a terminal.
- Each of the RUN line's commands is repeated (in shell comments) as
it executes to show (1) that the command actually executed (e.g.,
`echo success` above didn't) and (2) what stdout, stderr, non-zero
exit status, and output files are associated with the command, if
any. Shell quoting in the command is now correct and minimal but is
not necessarily the original shell quoting from the RUN line.
- The start and end of the contents of stdout, stderr, or an output
file is now delineated clearly in the trace.
To help produce some of the above output, this patch extends lit's
internal shell with a built-in `@echo` command. It's like `echo`
except lit suppresses the normal execution trace for `@echo` and just
prints its stdout directly. For now, `@echo` isn't documented for use
in lit tests.
Without this patch, libcxx's custom lit test format tries to parse the
stdout from `lit.TestRunner.executeScriptInternal` (which runs lit's
internal shell) to extract the stdout and stderr produced by shell
commands, and that parse no longer works after the above changes.
This patch makes a small adjustment to
`lit.TestRunner.executeScriptInternal` so libcxx can just request
stdout and stderr without an execution trace.
(As a minor drive-by fix that came up in testing: lit's internal `not`
command now always produces a numeric exit status and never `True`.)
Caveat
------
This patch only makes the above changes for lit's internal shell. In
most cases, we do not know how to force external shells (e.g., bash,
sh, window's `cmd`) to produce execution traces in the manner we want.
To configure a test suite to use lit's internal shell (which is
usually better for test portability than external shells anyway), add
this to the test suite's `lit.cfg` or other configuration file:
```
config.test_format = lit.formats.ShTest(execute_external=False)
```
Reviewed By: MaskRay, awarzynski
Differential Revision: https://reviews.llvm.org/D156954
This patch and D156954 were discussed in
<https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839>.
**Motivation**: -a shows output from all tests, and -v shows output
from just failed tests. Without this patch, that output from each
test includes a section called "Script:", which includes all shell
commands that lit has computed from RUN directives and will attempt to
run for that test. The effect of -vv (which also implies -v if
neither -a or -v is specified) is to extend that output with shell
commands as they are executing so you can easily see which one failed.
For example, when using lit's internal shell and -vv:
```
Script:
--
: 'RUN: at line 1'; echo hello world
: 'RUN: at line 2'; 3c40 hello world
: 'RUN: at line 3'; echo hello world
--
Exit Code: 127
Command Output (stdout):
--
$ ":" "RUN: at line 1"
$ "echo" "hello" "world"
hello world
$ ":" "RUN: at line 2"
$ "3c40" "hello" "world"
'3c40': command not found
error: command failed with exit status: 127
--
```
Notice that all shell commands that actually execute appear in the
output twice, once for "Script:" and once for -vv. Especially for
tests with many RUN directives, the result is noisy. When searching
through the output for a particular shell command, it is easy to get
lost and mistake shell commands under "Script:" for shell commands
that actually executed.
**Change**: With this patch, a test's output changes in two ways.
First, the "Script:" section is never shown. Second, omitting -vv no
longer disables printing of shell commands as they execute. That is,
-a and -v imply -vv, and so -vv is deprecated as it is just an alias
for -v.
**Secondary motivation**: We are also working to introduce a PYTHON
directive, which can appear between RUN directives. How should PYTHON
directives be represented in the "Script:" section, which has
previously been just a shell script? We could probably think of
something, but adding info about PYTHON directive execution in the -vv
trace seems more straight-forward and more useful.
(This patch also removes a confusing point in the -vv documentation:
at least when using bash as an external shell, -vv echoes commands to
the shell's stderr not stdout.)
Reviewed By: awarzynski, Endill, ldionne, MaskRay
Differential Revision: https://reviews.llvm.org/D154984
Calling isPlainlyKilled instead of directly checking for a kill flag
should make processTiedPairs behave the same with LiveIntervals
(i.e. when compiling with -early-live-intervals) as it does with
LiveVariables.
This commit implements `LoopLikeOpInterface` on `scf.while`. This
enables LICM (and potentially other transforms) on `scf.while`.
`LoopLikeOpInterface::getLoopBody()` is renamed to `getLoopRegions` and
can now return multiple regions.
Also fix a bug in the default implementation of
`LoopLikeOpInterface::isDefinedOutsideOfLoop()`, which returned "false"
for some values that are defined outside of the loop (in a nested op, in
such a way that the value does not dominate the loop). This interface is
currently only used for LICM and there is no way to trigger this bug, so
no test is added.
This function was inconsistent with the remaining API because it
accepted `OpOperand &` that do not belong to the op. All the other
functions assert. This helper function is also not really necessary, as
the iter_arg number is identical to the result number.
As reported in #66612, we aren't correctly treating the placeholder
expression type correctly, so we ended up trying to get a reference
version of it, and this resulted in an assertion, since the placeholder
type cannot have a reference added.
Fixes: #66612
Unlike the load case, stores past the end of the alloca are
removed by SROA as undefined behavior. As such, there is no need
to handle this case when rewriting stores.
Add a DAG combine to form a masked.store from a masked_strided_store intrinsic
with stride equal to element size. This is the store analogy to PR #65674.
As seen in the tests, this does pickup a few cases that we'd previously missed
due to selection ordering. We match strided stores early without going through
the recently added generic mscatter combines, and thus weren't recognizing the
unit strided store.
`unw_getcontext` saves the caller's registers in the context. However,
if the caller of `unw_getcontext` is in a different module, the glue
code of `unw_getcontext` sets the TOC register (r2) with the new TOC
base and saves the original TOC register value in the stack frame. This
causes the incorrect TOC value is used when the caller steps up frames,
which fails libunwind LIT test case `unw_resume.pass.cpp`. This PR fixes
the problem by using the original TOC register value saved in the stack
if the caller is in a different module and enables `unw_resume.pass.cpp`
on AIX.