VPInterleaveRecipe only uses the first lane of the address. Add
onlyFirstLaneUsed implementation. This is needed for a follow-up patch.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D121612
Link to the GitHub Issue: https://github.com/llvm/llvm-project/issues/53745
Added config_path variable within the python script which makes the
required call to the clang-tidy binary with --config-file option.
If config_path is None, config is used. If both are given, no error is
raised and config_path silently takes precedence over config.
The checker removes `const`s that are superfluous and badly affect
readability. `decltype(auto)`/`decltype(expr)` are often const-qualified, but
have no effect on readability and usually can't stop being const-qualified
without significant code change.
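As an illustration of the distinction described above, the sketch below contrasts a superfluous `const` on a by-value return with a const-qualified `decltype(expr)` return type. The names `answer`, `limit`, and `currentLimit` are hypothetical, and the comments describe the intent stated here rather than verified checker output.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example: the top-level const is superfluous, because a call to
// this function is a prvalue of non-class type and the qualifier is dropped
// from the type of the call expression.
const int answer() { return 42; }
static_assert(std::is_same<decltype(answer()), int>::value,
              "const on a by-value int return does not survive the call");

// Hypothetical example: the const comes from the declared type of `limit`,
// so it cannot be removed without changing `limit` itself.
const int limit = 10;
decltype(limit) currentLimit() { return limit; }
static_assert(std::is_same<decltype(limit), const int>::value,
              "decltype(limit) is const int");
```

In the first function the `const` can simply be deleted. In the second it is a consequence of the expression passed to `decltype`, which is the case the description says is not worth flagging.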
Fixes https://github.com/llvm/llvm-project/issues/52890
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D119470
This reverts commit 406d418c0c.
Our regular bots are now using clang-13. The previous set will remain
online for a while to check reviews that haven't rebased to include
this change yet.
Differential Revision: https://reviews.llvm.org/D121894
Extension to 4390c721cb - similar to the vanilla load/store intrinsics, _mm_lddqu_si128/_mm256_lddqu_si256 should take an unaligned pointer, but were using the aligned __m128i/__m256i types, which can cause alignment warnings.
The existing sse3-builtins.c and avx-builtins.c tests in llvm-project\clang\test\CodeGen\X86 should cover this.
Differential Revision: https://reviews.llvm.org/D121815
When creating an array temporary in the array copy pass, care must be
taken with allocatable components. The element components need to be
given a clean unallocated status before being used in the assignments.
This is because assignment of allocatable components makes a deep copy
and may cause deallocation of the previous value if it was allocated.
Hence the previous allocation status cannot be left undefined.
On top of that, when cleaning-up the temp, all allocatable components
that may have been allocated must be deallocated.
This patch implements this by centralizing the code making and cleaning
array temps in ArrayValueCopy.cpp, and by calling Initialize and Destroy
runtime entry points when there are allocatable components.
Differential Revision: https://reviews.llvm.org/D121892
Move structural hashing into virtual methods on Pass. This will
allow MachineFunctionPass to override the method to add hashing of
the MachineFunction.
Differential Revision: https://reviews.llvm.org/D120123
If we already have an AArch64ISD::ANDS node with identical operands, we
can merge any ISD::AND into it, reducing the instruction count by
calculating the value and the flags in a single operation. This code is
taken from the X86 backend, and could also handle AArch64ISD::ADDS and
AArch64ISD::SUBS, but I couldn't find any test cases where it came up.
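A source pattern of the kind that could plausibly lead to both an AND and a flag-setting ANDS with identical operands is sketched below. The function `maskOrDefault` is hypothetical, and whether this exact code triggers the combine is an assumption that has not been checked against the backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: the masked value is needed both as a result and as
// the input to a comparison against zero, so the value and the flags come
// from the same AND of the same two operands.
std::uint32_t maskOrDefault(std::uint32_t value, std::uint32_t mask,
                            std::uint32_t fallback) {
  std::uint32_t masked = value & mask; // value of the AND
  if (masked == 0)                     // zero test on the same AND
    return fallback;
  return masked;
}
```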
Differential Revision: https://reviews.llvm.org/D118584
Reapply with an explicit check for multi-edges, as the expected
behavior of multi-edge dominance is unclear (D120811).
-----
For conditional branches, we know the value is i1 0 or i1 1 along
the outgoing edges. For switches we can apply exactly the same
optimization, just with the known values determined by the switch
cases.
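The idea can be illustrated at the source level with the sketch below, where each case has its own target. The function `scale` is hypothetical, and the comments describe the folding that the known values make possible in principle rather than verified optimizer output.

```cpp
#include <cassert>

// Hypothetical example: along each case edge the switched-on value is known,
// just as a branch condition is known to be true or false along its edges.
int scale(int kind) {
  switch (kind) {
  case 2:
    return kind * 10; // kind is known to be 2 here, so this can fold to 20
  case 5:
    return kind + 1;  // kind is known to be 5 here, so this can fold to 6
  default:
    return 0;         // no single known value along the default edge
  }
}
```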
This allows for sharing the implementation of key components across multiple
MLIR language servers. These will be used in a followup to help implement
a PDLL language server.
Differential Revision: https://reviews.llvm.org/D121540
This commit adds detailed documentation for PDLL, its language design, and
captures a bit of the rationale. This document captures everything in-tree at present,
and is intended to be an all-encompassing manual for interacting with and understanding
PDLL.
Differential Revision: https://reviews.llvm.org/D119903
This patch adds lowering for some array related intrinsics:
- `reshape`
- `spread`
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D121841
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>
This patch adds lowering for some character related
intrinsics:
- `scan`
- `verify`
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D121842
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Migrate to using ReportError to report a failure to evaluate a
watchpoint condition. I had already done so for the parallel code for
breakpoints.
In the process, I noticed that I accidentally regressed the error
reporting for breakpoint conditions by dropping the call to
GetDescription. This patch rectifies that and adds a test.
Because the call to GetDescription expects a Stream*, I also switched
from using a raw_string_ostream to a StreamString for both breakpoints
and watchpoints.
The current dialect registry allows for attaching delayed interfaces, that are added to attrs/dialects/ops/etc.
when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a
separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of
dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration
callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly
on the loaded constructs, and also allows for loading new dependent dialects. The latter is
extremely useful as it will now enable dependent dialects to only apply in the contexts in which they
are necessary. For example, a dialect dependency can now be conditional on whether a user actually needs the
interface that relies on it.
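The "callback invoked once a set of dialects has been loaded" idea can be sketched with a small standalone model. This is a toy only: `ToyRegistry`, `addExtension`, `loadDialect`, and `isLoaded` are hypothetical names chosen for illustration and are not the MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only: this class is hypothetical and is not an MLIR class.
class ToyRegistry {
public:
  using Callback = std::function<void(ToyRegistry &)>;

  // Register a callback to run once every dialect in `required` is loaded.
  void addExtension(std::set<std::string> required, Callback callback) {
    pending.push_back({std::move(required), std::move(callback)});
    runReadyExtensions();
  }

  // Mark a dialect as loaded, then run any extension that became ready.
  void loadDialect(const std::string &name) {
    if (loaded.insert(name).second)
      runReadyExtensions();
  }

  bool isLoaded(const std::string &name) const {
    return loaded.count(name) != 0;
  }

private:
  struct Extension {
    std::set<std::string> required;
    Callback callback;
  };

  bool isReady(const Extension &ext) const {
    for (const std::string &name : ext.required)
      if (!isLoaded(name))
        return false;
    return true;
  }

  void runReadyExtensions() {
    // Restart the scan after each callback, because a callback may load
    // further dialects or add extensions and so modify `pending`.
    bool ranOne = true;
    while (ranOne) {
      ranOne = false;
      for (std::size_t i = 0; i < pending.size(); ++i) {
        if (!isReady(pending[i]))
          continue;
        Extension ext = std::move(pending[i]);
        pending.erase(pending.begin() + i);
        ext.callback(*this); // may load new dependent dialects
        ranOne = true;
        break;
      }
    }
  }

  std::set<std::string> loaded;
  std::vector<Extension> pending;
};
```

In this model an extension that requires dialects "a" and "b" runs exactly once, after both are loaded, and its callback is free to load a further dialect at that point. That mirrors the conditional dependency described above: the extra dialect is only loaded in contexts where the extension actually applies.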
Differential Revision: https://reviews.llvm.org/D120367
This change replaces the manual selection of buffer_atomic_cmpswap*
instructions in SelectionDAG and GlobalISel with a tblgen based
selection in BUFInstructions.td. This allows us to select the return and
no-return variants in tblgen.
Differential Revision: https://reviews.llvm.org/D121770
This fixes a reported bug that caused an infinite loop during the
SelectionDAG optimization phase in ISel, by creating an overridable hook
in `TargetLowering` that allows us to bail out from running
`SimplifyDemandedVectorElts`.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D121869
This includes a function name and a relevant instruction in error
messages when possible, making them more helpful.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D120678
This PR exposes the region-based isTopLevelValue,
which is useful for code outside AffineOps.cpp
that performs Affine transformations.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D121877
The getAffineScope function is currently internal
to AffineOps.cpp. However, as the comment on the function
itself notes, this is useful in a variety of other places
externally. This PR allows other files to use the function.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D121827
The fp32 packed math instructions are introduced in gfx90a.
If their vector register operands are not properly aligned, the
verifier should flag them. Currently, the verifier fails to
report it and the compiler ends up emitting broken assembly.
This patch fixes that missed case in TII::verifyInstruction.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D121794
Poison trivial class members one-by-one in the reverse order of their
construction, instead of all-at-once at the very end.
For example, in the following code, access to `x` from `~B` will
produce an undefined value:
    struct A {
      struct B b;
      int x;
    };
Reviewed By: kda
Differential Revision: https://reviews.llvm.org/D119600