Summary of changes:
- Remove dst for global_atomic_add_f32, global_atomic_pk_add_f16.
- Make vdata input-only for buffer_atomic_add_f32, buffer_atomic_pk_add_f16.
- Other minor improvements.
If the LHS op has a single use then using the more general AND op is likely to allow commutation, load folding, generic folds etc.
Reverted due to reports from @alexfh about it causing an infinite loop (repro still pending).
... with the "usedAsMutableReference" semantic token modifier.
It's quite unusual to declare the index parameter of a subscript
operator as a non-const reference type, but arguably that makes it even
more helpful to be aware of it when working with such code.
Reviewed By: nridge
Differential Revision: https://reviews.llvm.org/D128892
On some buildbots test `ast-print-fp-pragmas.c` fails, need to investigate it.
This reverts commit 0401fd12d4aa0553347fe34d666fb236d8719173.
This reverts commit b822efc7404bf09ccfdc1ab7657475026966c3b2.
Fix for broken/degenerate forall case where there is no assignment to an
array under the explicit iteration space. While this is a multiple
assignment, semantics only raises a warning.
The fix is to add a test that the explicit space has any sort of array
to be updated, and if not then the do_loop nest will not require a
terminator to forward array values to the next iteration.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128973
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
LLDB fails to step in/out/over code with missing debug information.
This is only reproducible on AArch64/Windows. I have reported a issue
upstream at llvm.org/pr56292
This patch Xfail TestStepNoDebug.py for AArch64/Windows.
Add source coordinates to BeginWait and BeginWaitAll calls
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128970
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Handle denormal constant input for fcmp instructions based on the
denormal handling mode.
Reviewed By: spatel, dcandler
Differential Revision: https://reviews.llvm.org/D128647
This linker optimization hint transforms a pair of adrp+ldr (immediate)
instructions into an ldr (literal) load from a PC-relative address if
it is 4-byte aligned and within +/- 1 MiB, as ldr can encode a signed
19-bit offset that gets multiplied by 4.
In the wild, only a small number of these hints are applicable because
not many loads end up close enough to the data segment. However, the
added helper functions will be useful in implementing the rest of the
LOH types.
Differential Revision: https://reviews.llvm.org/D128942
AST does not have special nodes for pragmas. Instead a pragma modifies
some state variables of Sema, which in turn results in modified
attributes of AST nodes. This technique applies to floating point
operations as well. Every AST node that can depend on FP options keeps
current set of them.
This technique works well for options like exception behavior or fast
math options. They represent instructions to the compiler how to modify
code generation for the affected nodes. However treatment of FP control
modes has problems with this technique. Modifying FP control mode
(like rounding direction) usually requires operations on hardware, like
writing to control registers. It must be done prior to the first
operation that depends on the control mode. In particular, such
operations are required for implementation of `pragma STDC FENV_ROUND`,
compiler should set up necessary rounding direction at the beginning of
compound statement where the pragma occurs. As there is no representation
for pragmas in AST, the code generation becomes a complicated task in
this case.
To solve this issue FP options are kept inside CompoundStmt. Unlike to FP
options in expressions, these does not affect any operation on FP values,
but only inform the codegen about the FP options that act in the body of
the statement. As all pragmas that modify FP environment may occurs only
at the start of compound statement or at global level, such solution
works for all relevant pragmas. The options are kept as a difference
from the options in the enclosing compound statement or default options,
it helps codegen to set only changed control modes.
Differential Revision: https://reviews.llvm.org/D123952
In preparation for the removal in D128719, this stops creating
insertvalue constant expressions (well, unless they are directly
used in LLVM IR).
Differential Revision: https://reviews.llvm.org/D128792
This allows purging references of scf.ForeachThreadOp and scf.PerformConcurrentlyOp from
ParallelInsertSliceOp.
This will allowmoving the op closer to tensor::InsertSliceOp with which it should share much more
code.
In the future, the decoupling will also allow extending the type of ops that can be used in the
parallel combinator as well as semantics related to multiple concurrent inserts to the same
result.
Differential Revision: https://reviews.llvm.org/D128857
This revision avoids a crash in the 0-D case of distributing vector.transfer ops out of
vector.warp_execute_on_lane_0.
Due to the code complexity and lack of documentation, it took untangling the implementation
before realizing that the simple fix was to fail in the 0-D case.
The rewrite is still very useful to understand this code better.
Differential Revision: https://reviews.llvm.org/D128793
This is a minor refinement of resolvedUndefsIn(), mostly for clarity.
If the value of an instruction is undef, then that's already a legal
final result -- we can safely rauw such an instruction with undef.
We only need to mark unknown values as overdefined, as that's the
result we get for an instruction that has not been processed because
it has an undef operand.
Differential Revision: https://reviews.llvm.org/D128251
Update intrinsics to use n x f16 and n x i16 instead
of 32-bit types. This may avoid the need for a bitcast
and is probably less confusing.
Depends on making v16f16 and v16i16 types legal.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D128951
- define a common data structure Language which is a compiled result of the
bnf grammar. It is defined in Language.h;
- creates a clangPseudoCLI lib which defines a grammar commandline flag and
expose a function to get the Language. It supports --grammar=cxx,
--grammmar=/path/to/file.bnf;
- use the clangPseudoCLI in clang-pseudo, fuzzer, and benchmark tools (
simplify the code and use the prebuilt cxx grammar);
Split out from https://reviews.llvm.org/D127448.
Differential Revision: https://reviews.llvm.org/D128679
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128935
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Peter Steinfeld <psteinfeld@nvidia.com>
The unidentified objects recognized in `getUnderlyingObjects` may
still alias to the noalias parameter because `getUnderlyingObjects`
may not check deep enough to get the underlying object because of
`MaxLookup`. The real underlying object for the unidentified object
may still be the noalias parameter.
Originally Patched By: tingwang
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D127202
Checking whether a formatter change does not break some of the supported
string layouts is difficult because it requires tracking down and/or
building different versions and build configurations of the library.
The purpose of this patch is to avoid that by providing an in-tree
simulation of the string class. It is a reduced version of the real
string class, obtained by elimitating all non-trivial code, leaving
just the internal data structures used by the data formatter. Different
versions of the class can be simulated through preprocessor defines.
The test (ab)uses the fact that our formatters kick in for any
double-underscore sub-namespace of `std`, so it avoids colliding with
the real string class by declaring the test class in the std::__lldb
namespace.
I do not consider this to be a replacement for the existing data
formatter tests, as producing this kind of a test is not trivial, and it
is easy to make a mistake in the process. However, it's also not
realistic to expect that every person changing the data formatter will
test it against all versions of the real class, so I think it can be
useful as a first line of defence.
Adding support for new layouts can become particularly unwieldy, but
this complexity will also be reflected in the actual code, so if we find
ourselves needing to support too many variants, we may need to start
dropping support for old ones, or come up with a completely different
strategy.
Differential Revision: https://reviews.llvm.org/D124155
New glibc versions (since 2.34 or including this
<ed3ce71f5c>
patch) trigger the rendezvous breakpoint after they have already added
some modules to the list. This did not play well with our dynamic
loader plugin which was doing a diff of the the reported modules in the
before (RT_ADD) and after (RT_CONSISTENT) states. Specifically, it
caused us to miss some of the modules.
While I think the old behavior makes more sense, I don't think that lldb
is doing the right thing either, as the documentation states that we
should not be expecting a consistent view in the RT_ADD (and RT_DELETE)
states.
Therefore, this patch changes the lldb algorithm to compare the module
list against the previous consistent snapshot. This fixes the previous
issue, and I believe it is more correct in general. It also reduces the
number of times we are fetching the module info, which should speed up
the debugging of processes with many shared libraries.
The change in RefreshModules ensures we don't broadcast the loaded
notification for the dynamic loader (ld.so) module more than once.
Differential Revision: https://reviews.llvm.org/D128264
We'll need to add more cases for Objective-C entities and adding
everything to `err_module_odr_violation_mismatch_decl_diff` makes it
harder to work with over time.
Differential Revision: https://reviews.llvm.org/D128488
This differential improves two error conditions, by detecting them earlier and by providing better messages to help users understand what went wrong.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D128555
Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy and DWARF5 split dwarf.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D128232
llvm-libgcc is not a part of `LLVM_ALL_RUNTIMES` because llvm-libgcc is
incompatible with an explicit libunwind and compiler-rt. This meant that
it was being filtered out and not built.
Differential Revision: https://reviews.llvm.org/D128568
... to match most other common architectures which already support BFD_RELOC_*.
BFD_RELOC_NONE provides a generic way indicating a dependency between two
sections and is useful for some instrumentations which encode symbol index
information (e.g. `.cg_profile`).