LEA_ADDri and LEAX_ADDri are printed / encoded the same way as ADDri. I
had to change the type of simm13Op so that it can be used in both 32-
and 64-bit modes. This required changing the operands of some
InstAliases.
This allows us to not have to pass -mllvm flags to set the large data
threshold for (in-LLD/not-distributed) ThinLTO.
Follows https://reviews.llvm.org/D52322, which did the same for the code
model.
Since the large data threshold is tied to the code model and we disallow
mixing different code models, do the same for the large data threshold.
Add assembler directives for preloading kernel arguments that correspond
to new fields in the kernel descriptor for the length and offset of
arguments that will be placed in SGPRs prior to kernel launch. Alignment
of the arguments in SGPRs is equivalent to the kernarg segment when
accessed via the kernarg_segment_ptr. Kernarg SGPRs are allocated
directly after other user SGPRs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D159459
The list of printf copts available in config.json wasn't working because
the printf_core subdirectory was included before the printf_copts
variable was defined, so the variable was effectively empty for the
printf internals. Additionally, the tests weren't respecting the flags,
so setting them would cause the tests to fail. This patch reorders the
cmake in src and adds flag handling in test.
Since we are defining these typedefs inside namespace std, we need to
refer to ::once_flag (the C Standard Library version). Otherwise
'once_flag' refers to 'std::once_flag', and that's not something we can
pass to the C Standard Library '::call_once()' function later on.
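The lookup issue can be reproduced in isolation. The sketch below is only a model: `sketch` plays the role of the global namespace, `std_model` plays the role of namespace std, and the `once_flag`/`call_once` stand-ins are hypothetical, not the real C library declarations.

```cpp
#include <cassert>
#include <type_traits>

// 'sketch' plays the role of the global namespace; all names are hypothetical.
namespace sketch {

// Stand-ins for the C library's once_flag and call_once.
struct once_flag { int done; };
inline void call_once(once_flag* flag, void (*fn)()) {
  if (!flag->done) { flag->done = 1; fn(); }
}

namespace std_model {  // plays the role of namespace std
struct once_flag {};   // the C++ class; hides sketch::once_flag in this scope

// Unqualified 'once_flag' would name std_model::once_flag here, so the
// typedef has to qualify the outer type explicitly.
typedef ::sketch::once_flag c_once_flag;

static_assert(!std::is_same<once_flag, c_once_flag>::value,
              "the inner class and the outer type are distinct");
static_assert(std::is_same<c_once_flag, ::sketch::once_flag>::value,
              "the typedef names the outer (C-style) type");
}  // namespace std_model

inline int& initCount() { static int n = 0; return n; }
inline void bumpInitCount() { ++initCount(); }

// Passes the typedef'd flag to the C-style call_once twice; fn runs once.
inline int runTwice() {
  initCount() = 0;
  std_model::c_once_flag flag = {0};
  call_once(&flag, bumpInitCount);
  call_once(&flag, bumpInitCount);
  return initCount();
}

}  // namespace sketch
```

Without the qualification, the typedef would silently pick up the inner class, and passing it to the C-style function would fail to compile.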
This patch is the first in a series that adds support for pre-loading
kernel arguments into SGPRs. The command-line argument
'amdgpu-kernarg-preload-count' is used to specify the number of
arguments sequentially from the first that we should attempt to preload;
the default is 0.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D156852
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
Per CWG2760, default member initializers should be considered part of the
body of constructors, which means they are evaluated in an
immediate-escalating context.
However, this does not apply to static members.
This patch produces some extraneous diagnostics; unfortunately, we do not
have a good way to report an error back to the initializer, and this is a
pre-existing issue.
Fixes #65985
Fixes #66562
We were losing the function entry count, which is useful to check profile quality. For the original cases where we want
entrypoint-relative MBB frequencies, the user would just need to divide these values by the entrypoint (first MBB, with ID=0) value.
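As a rough illustration of that division, the sketch below normalises a list of absolute block frequencies by the entry value. The helper name and the plain vector representation are assumptions made for the example; they are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts absolute per-block frequencies into
// entry-relative ones. Index 0 is assumed to be the entry block (ID=0).
inline std::vector<double> toEntryRelative(const std::vector<uint64_t>& freqs) {
  std::vector<double> rel;
  // With no blocks or a zero entry value there is nothing to divide by.
  if (freqs.empty() || freqs[0] == 0) return rel;
  rel.reserve(freqs.size());
  const double entry = static_cast<double>(freqs[0]);
  for (uint64_t f : freqs) rel.push_back(static_cast<double>(f) / entry);
  return rel;
}
```

The entry block always maps to 1.0, and a block executed half as often as the entry maps to 0.5.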
This parallels the binutils/BSD flag of the same name. Debugging
information is loaded to print line number information for symbols.
Defined symbols are symbolized by their section addresses, and undefined
symbols by their first text reloc with line info.
Differential Revision: https://reviews.llvm.org/D150987
This change matches a masked.stride.load from a mgather node whose index
operand is a strided sequence. We can reuse the VID matching from
build_vector lowering for this purpose.
Note that this duplicates the matching done at IR by
RISCVGatherScatterLowering.cpp. Now that we can widen gathers to a wider
SEW, I don't see a good way to remove this duplication. The only obvious
alternative is to move the widening transform to IR, but that's a no-go
as I want other DAGs to run first. I think we should just live with the
duplication - particularly since the reuse of isSimpleVIDSequence means
the duplication is somewhat minimal.
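For readers unfamiliar with the idea, a strided index sequence is one where every element equals base + i * stride. The sketch below checks that property on a plain vector of constants. It is a simplified stand-in written for this note, not the isSimpleVIDSequence implementation, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: returns true when idx[i] == base + i * stride for all i,
// and reports the base and stride through the out-parameters.
inline bool matchStridedSequence(const std::vector<int64_t>& idx,
                                 int64_t& base, int64_t& stride) {
  if (idx.size() < 2) return false;  // need two elements to define a stride
  const int64_t step = idx[1] - idx[0];
  for (std::size_t i = 2; i < idx.size(); ++i)
    if (idx[i] - idx[i - 1] != step) return false;  // not evenly spaced
  base = idx[0];
  stride = step;
  return true;
}
```

An index vector such as {4, 12, 20, 28} matches with base 4 and stride 8, which is the shape a gather needs in order to be rewritten as a strided load.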
We want to activate `llvm-header-guard` (#66477) but the current CMake
configuration includes paths that should be `isystem`. This PR restricts
the number of `-I` passed to the clang command line and correctly marks
the llvm libc include path as `isystem`.
__call_once is large and cluttered with #ifdef preprocessor guards. This
cleans it up a bit by using an exception guard instead of try-catch.
Differential Revision: https://reviews.llvm.org/D112319
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
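To illustrate the exception-guard pattern mentioned for __call_once, here is a minimal single-threaded sketch. The `ExceptionGuard` class, the `State` enum, and `runOnce` are hypothetical names invented for the example; the real code also handles locking and waiting threads, which are omitted here.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical guard: runs the rollback in its destructor unless complete()
// was called, which replaces an explicit try/catch-and-rethrow.
template <class Rollback>
class ExceptionGuard {
public:
  explicit ExceptionGuard(Rollback rollback)
      : rollback_(std::move(rollback)), completed_(false) {}
  ExceptionGuard(const ExceptionGuard&) = delete;
  ExceptionGuard& operator=(const ExceptionGuard&) = delete;
  void complete() { completed_ = true; }
  ~ExceptionGuard() { if (!completed_) rollback_(); }

private:
  Rollback rollback_;
  bool completed_;
};

enum State { Pending = 0, InProgress = 1, Done = 2 };  // hypothetical states

// Single-threaded model of a call-once: if init throws, the guard resets the
// state to Pending so that a later call can retry.
template <class F>
void runOnce(State& state, F&& init) {
  if (state == Done) return;
  state = InProgress;
  auto reset = [&state] { state = Pending; };
  ExceptionGuard<decltype(reset)> guard(reset);
  init();            // may throw; the guard's destructor then runs reset
  guard.complete();  // success: disarm the rollback
  state = Done;
}
```

The success path and the failure path share one code shape, so no preprocessor guards around try/catch are needed when exceptions are disabled.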
This patch is part of a larger initiative aimed at fixing floating-point
`max` and `min` operations in MLIR:
https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
In this commit, we add conversion patterns for the newly introduced
operations `arith.minnumf` and `arith.maxnumf`. When converting to
`spirv.CL`, there is no need to insert additional guards to propagate
non-NaN values when one of the arguments is NaN because `CL` ops do
exactly the same. However, `GL` ops have undefined behavior when one of
the arguments is NaN, so we should insert additional guards to enforce
the semantics of Arith's ops.
This patch addresses task 1.5 of the mentioned RFC.
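The guard logic can be sketched in plain C++. Here `rawMin` and `rawMax` are hypothetical stand-ins for a primitive whose result is not defined when an operand is NaN, and the guarded wrappers model the `minnumf`/`maxnumf` behavior of returning the non-NaN operand. This is an illustration of the semantics, not the generated SPIR-V.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-ins for primitives that give no guarantee when an operand is NaN.
inline float rawMin(float a, float b) { return b < a ? b : a; }
inline float rawMax(float a, float b) { return a < b ? b : a; }

// Guarded versions: if one operand is NaN, return the other one. If both are
// NaN, the result is NaN (the first check returns b, which is NaN).
inline float guardedMinNum(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return rawMin(a, b);
}

inline float guardedMaxNum(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return rawMax(a, b);
}
```

The two NaN checks correspond to the extra guards inserted for `GL`; for `CL` they would be redundant because the ops already behave this way.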
I'm using clang-10 to build bolt, and it doesn't support the
-moutline-atomics option. So test whether the compiler supports the flag
before appending it to the list of cxxflags.
Differential Revision: https://reviews.llvm.org/D159521
Construct entities that are associations from selectors in ASSOCIATE,
CHANGE TEAM, and SELECT TYPE constructs do not have the ALLOCATABLE or
POINTER attributes, even when associating with allocatables or pointers;
associations from selectors in SELECT RANK constructs do have those
attributes.
Avoid false positives by requiring space after `/branch` command so the
action won't trigger on diffs that include filenames like
`.../BranchProbabilityInfo.cpp`.
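The difference between the two checks can be sketched as follows. The example assumes the trigger is a case-insensitive substring match, which is an assumption about the workflow rather than something stated above; the function names and the sample comment text are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Lower-cases a string so that the comparison below ignores case.
inline std::string toLower(std::string s) {
  for (char& c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Case-insensitive substring test (assumed matching behavior).
inline bool containsNoCase(const std::string& text, const std::string& needle) {
  return toLower(text).find(toLower(needle)) != std::string::npos;
}

// Old check: any occurrence of "/branch" triggers.
inline bool looseTrigger(const std::string& body) {
  return containsNoCase(body, "/branch");
}

// New check: the command must be followed by a space.
inline bool strictTrigger(const std::string& body) {
  return containsNoCase(body, "/branch ");
}
```

A diff line mentioning `llvm/lib/Analysis/BranchProbabilityInfo.cpp` satisfies the loose check but not the strict one, while a comment such as `/branch some-org/some-repo/some-branch` satisfies both.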
vectorized node uses.
If the instruction is vectorized in many different vector nodes, it may
break the dependency analysis for gathered nodes with matched scalars.
We need to properly check the dependencies between such gather nodes to
avoid a cyclic dependency.
Add pointer write functionality to MemoryAccess, which is needed for implementing the redirection manager. This also refactors the code a bit by introducing the InProcessMemoryAccess class.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D157378
This reverts commit 775754e328.
Relanding with part of the LIT test removed. There seems to be some
operation-ordering nondeterminism that is unrelated to my change. I will
address this issue separately.
The problem is that when the "attach" command is initiated, the
ExecutionContext for the command has a process - it's the exited one
from the previous run. But the `attach wait` creates a new process for
the attach, and then errors out instead of interrupting when it finds
that its process and the one in the command's ExecutionContext don't
match.
This change checks that if we're returning a target from
GetExecutionContext, we fill the context with its current process, not
some historical one.
Add support for fir.box_addr, fir.array_coor, fir.coordinate, fir.embox,
fir.rebox and fir.load.
1) Through the use of boolean `followBoxAddr` determine whether the
analysis should apply to the address of the box or the address wrapped
by the box.
2) Some asserts have been removed to allow for more SourceKinds through
the flow, in particular SourceKind::Direct
3) getSource was a public method but the returned type (SourceKind) was
not public, making it impossible to call publicly
4) About 12 tests have been added to check for real Fortran scenarios
5) More tests will be added with HLFIR
6) A few TODOs have been identified and will need to be addressed in
follow-up patches. I felt that more changes would increase the
complexity of the patch.
This fixes a bug in my 928564caa5 that didn't get noticed in review. I found it when looking at the strided load case (upcoming patch), and realized the previous commit was buggy too.
p.s. Sorry for the slightly confusing test diff. I'd apparently used the wrong mask for the aligned positive test; it was actually unaligned. Didn't seem worthy of a separate precommit.
This patch places the finalization code for the RHS of a user-defined
assignment after the assignment code. The change only affects
standalone RegionAssignOp operations.